On updates that constrain the features' connections during learning
Source
International Conference on Knowledge Discovery and Data Mining
Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Las Vegas, Nevada, USA
Session: Research papers
Pages 515-523
Year of Publication: 2008
ISBN: 978-1-60558-193-4
Authors
Omid Madani, SRI International, Menlo Park, CA, USA
Jian Huang, Pennsylvania State University, University Park, PA, USA
ABSTRACT
In many multiclass learning scenarios, the number of classes is relatively large (thousands or more), or the space and time efficiency of the learning system can be crucial. We investigate two online update techniques especially suited to such problems. These updates share a sparsity preservation capacity: they allow for constraining the number of prediction connections that each feature can make. We show that one method, exponential moving average, solves a "discrete" regression problem for each feature, changing the weights in the direction that minimizes the quadratic loss. We design the other method to improve a hinge loss subject to constraints, for better accuracy. We empirically explore the methods and compare their performance to previous indexing techniques developed with the same goals, as well as to other online algorithms based on prototype learning. We observe that the classification accuracies are very promising, improving over previous indexing techniques, while the scalability benefits are preserved.
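To make the exponential moving average idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes Boolean features, a one-hot target per feature, a hypothetical learning rate `beta`, and a hypothetical cap `max_connections` enforced by dropping the lowest-weight connections; the abstract does not specify any of these details. Under those assumptions, the step `w <- (1 - beta) * w + beta * y` equals a gradient step of size `beta` on the quadratic loss `0.5 * (w - y)**2`, where `y` is 1 for the true class and 0 otherwise.

```python
from collections import defaultdict


def ema_update(weights, active_features, true_class, beta=0.1, max_connections=3):
    """Apply one online EMA update for each active feature.

    weights maps feature -> {class: weight}. All names and the pruning
    rule here are illustrative assumptions, not taken from the paper.
    """
    for f in active_features:
        conns = weights.setdefault(f, {})
        # Target is 0 for every class: w <- (1 - beta) * w
        for c in conns:
            conns[c] *= (1.0 - beta)
        # Target is 1 for the true class: add the beta * 1 term
        conns[true_class] = conns.get(true_class, 0.0) + beta
        # Assumed sparsity constraint: keep only the strongest connections
        if len(conns) > max_connections:
            keep = sorted(conns, key=conns.get, reverse=True)[:max_connections]
            weights[f] = {c: conns[c] for c in keep}
    return weights


def predict(weights, active_features):
    """Score classes by summing the weights of the active features."""
    scores = defaultdict(float)
    for f in active_features:
        for c, w in weights.get(f, {}).items():
            scores[c] += w
    return max(scores, key=scores.get) if scores else None
```

Because each feature stores at most `max_connections` entries, memory and scoring time per feature stay bounded no matter how many classes exist, which is the kind of scalability benefit the abstract refers to.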