Online learning of conditionally I.I.D. data
Source
ACM International Conference Proceeding Series; Vol. 69
Proceedings of the twenty-first international conference on Machine learning
Banff, Alberta, Canada
Page: 92
Year of Publication: 2004
ISBN: 1-58113-828-5
Author
Daniil Ryabko
University of London, Egham Hill, Egham, Surrey, UK
ABSTRACT
In this work we consider the task of relaxing the i.i.d. assumption in online pattern recognition (or classification), aiming to make existing learning algorithms applicable to a wider range of tasks. Online pattern recognition is the task of predicting a sequence of labels, where each prediction is based on the object given for that label and on the examples (pairs of objects and labels) learned so far. Traditionally, this task is considered under the assumption that examples are independent and identically distributed. However, it turns out that many results of pattern recognition theory carry over under a much weaker assumption: that the objects alone are conditionally independent and identically distributed, while the only condition on the distribution of labels is that the rate of occurrence of each label should be above some positive threshold. We find a broad class of learning algorithms for which estimates of the probability of a classification error obtained under the classical i.i.d. assumption can be generalised to similar estimates for the case of conditionally i.i.d. examples.
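The setting described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes that "conditionally" means conditional on the label, uses a hypothetical two-state Markov chain for the labels (so the labels are dependent, not i.i.d.), draws each object from a Gaussian that depends only on its label, and uses a one-nearest-neighbour rule as a stand-in predictor. All function names and parameters are illustrative.

```python
import random


def label_sequence(n, stay=0.9, seed=0):
    """Hypothetical label process: a two-state Markov chain.

    Consecutive labels are strongly dependent, so the labels are not i.i.d.
    """
    rng = random.Random(seed)
    labels = [rng.randint(0, 1)]
    for _ in range(n - 1):
        prev = labels[-1]
        # Keep the previous label with probability `stay`, otherwise flip it.
        labels.append(prev if rng.random() < stay else 1 - prev)
    return labels


def draw_object(label, rng):
    """Draw an object from a distribution that depends only on its label.

    Given the labels, the objects are independent and identically
    distributed within each class (assumed Gaussian here).
    """
    mean = -1.0 if label == 0 else 1.0
    return rng.gauss(mean, 1.0)


def online_nearest_neighbour(n=1000, seed=0):
    """Run the online protocol: see an object, predict, then learn the label.

    Returns (error rate, frequency of the rarer label).
    """
    rng = random.Random(seed + 1)
    labels = label_sequence(n, seed=seed)
    history = []  # (object, label) pairs revealed so far
    errors = 0
    for y in labels:
        x = draw_object(y, rng)
        if history:
            # Predict the label of the closest object seen so far.
            _, guess = min(history, key=lambda pair: abs(pair[0] - x))
        else:
            guess = 0  # arbitrary prediction before any example is seen
        errors += int(guess != y)
        history.append((x, y))  # the true label is revealed after predicting
    rarest = min(labels.count(0), labels.count(1)) / n
    return errors / n, rarest


if __name__ == "__main__":
    error_rate, rarest_label_rate = online_nearest_neighbour()
    print("online error rate:", error_rate)
    print("frequency of the rarer label:", rarest_label_rate)
```

The second returned value corresponds to the condition in the abstract that each label must occur at a rate above some positive threshold; in this toy chain both labels occur roughly half of the time in the long run.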