Information theoretic regularization for semi-supervised boosting
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Paris, France
Session: Research track papers
Pages 1017-1026  
Year of Publication: 2009
ISBN: 978-1-60558-495-9
Authors
Lei Zheng  Wright State University, Dayton, OH, USA
Shaojun Wang  Wright State University, Dayton, OH, USA
Yan Liu  Wright State University, Dayton, OH, USA
Chi-Hoon Lee  Yahoo! Lab, Santa Clara, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557019.1557129

ABSTRACT

We present novel semi-supervised boosting algorithms that incrementally build linear combinations of weak classifiers through generic functional gradient descent, using both labeled and unlabeled training data. Our approach extends the information regularization framework to boosting, yielding loss functions that combine log loss on the labeled data with information-theoretic measures that encode the unlabeled data. Although the information-theoretic regularization terms make the optimization non-convex, we propose simple sequential gradient descent optimization algorithms and obtain improved results on synthetic, benchmark, and real-world tasks over both supervised boosting algorithms that use the labeled data alone and a state-of-the-art semi-supervised boosting algorithm.
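
The abstract does not spell out the regularizers or the update rule, so the following is only a rough illustrative sketch of the general idea, not the authors' algorithm. It assumes a binary task with labels in {-1, +1}, uses the conditional entropy of the model's predictions on unlabeled points as a stand-in information-theoretic term, and uses decision stumps as weak classifiers. All function names, the weight `lam`, and the step-size grid are hypothetical choices made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def objective(F_lab, y, F_unl, lam):
    """Log loss on labeled scores plus lam * prediction entropy on unlabeled
    scores (an assumed regularizer). Returns the value and both gradients."""
    p = np.clip(sigmoid(F_unl), 1e-12, 1.0 - 1e-12)
    log_loss = np.sum(np.logaddexp(0.0, -y * F_lab))
    entropy = -np.sum(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    g_lab = -y * sigmoid(-y * F_lab)        # d(log loss)/dF
    g_unl = -lam * p * (1.0 - p) * F_unl    # d(lam * entropy)/dF
    return log_loss + lam * entropy, g_lab, g_unl

def stump_predict(X, stump):
    j, t, s = stump
    return s * np.where(X[:, j] > t, 1.0, -1.0)

def fit_stump(X, neg_grad):
    """Pick the stump whose outputs are best aligned with the negative gradient."""
    best_score, best = -np.inf, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            score = np.dot(neg_grad, np.where(X[:, j] > t, 1.0, -1.0))
            for s in (1.0, -1.0):
                if s * score > best_score:
                    best_score, best = s * score, (j, t, s)
    return best

def semi_boost(X_lab, y, X_unl, lam=0.1, rounds=10, steps=(1.0, 0.5, 0.25, 0.1)):
    """Add one stump per round along the functional gradient of the objective."""
    X_all = np.vstack([X_lab, X_unl])
    n = len(X_lab)
    F = np.zeros(len(X_all))
    cur, g_lab, g_unl = objective(F[:n], y, F[n:], lam)
    ensemble, history = [], [cur]
    for _ in range(rounds):
        stump = fit_stump(X_all, -np.concatenate([g_lab, g_unl]))
        h = stump_predict(X_all, stump)
        # Crude line search: try a few step sizes and keep the best one.
        trials = [(objective(F[:n] + a * h[:n], y, F[n:] + a * h[n:], lam)[0], a)
                  for a in steps]
        new, alpha = min(trials)
        if new >= cur:  # non-convex objective: stop when no tried step helps
            break
        F = F + alpha * h
        ensemble.append((alpha, stump))
        cur, g_lab, g_unl = objective(F[:n], y, F[n:], lam)
        history.append(cur)
    return ensemble, history

def predict(ensemble, X):
    F = np.zeros(len(X))
    for alpha, stump in ensemble:
        F += alpha * stump_predict(X, stump)
    return np.sign(F)
```

Calling `semi_boost(X_lab, y, X_unl)` with float arrays returns the weighted stumps and the objective value after each accepted round. Because the entropy term is non-convex, a sketch like this can only be expected to reach a local minimum, and the result depends on the early rounds, which are driven by the labeled data alone (the entropy gradient vanishes at a score of zero).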

