| Knowledge transfer via multiple model local structure mapping |
Source
|
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Las Vegas, Nevada, USA
SESSION: Research papers
Pages 283-291
Year of Publication: 2008
ISBN: 978-1-60558-193-4
|
|
Authors
|
|
Jing Gao
|
University of Illinois, Urbana-Champaign, Urbana, IL, USA
|
|
Wei Fan
|
IBM T.J. Watson Research Center, Hawthorne, NY, USA
|
|
Jing Jiang
|
University of Illinois, Urbana-Champaign, Urbana, IL, USA
|
|
Jiawei Han
|
University of Illinois, Urbana-Champaign, Urbana, IL, USA
|
|
ABSTRACT
The effectiveness of knowledge transfer using classification algorithms depends on the difference between the distribution that generates the training examples and the one from which test examples are to be drawn. The task can be especially difficult when the training examples come from one or several domains that differ from the test domain. In this paper, we propose a locally weighted ensemble framework to combine multiple models for transfer learning, where the weights are dynamically assigned according to each model's predictive power on each test example. The framework can integrate the advantages of various learning algorithms and the labeled information from multiple training domains into one unified classification model, which can then be applied to a different domain. Importantly, unlike many previously proposed methods, none of the base learning methods needs to be specifically designed for transfer learning. We show the optimality of a locally weighted ensemble framework as a general approach to combining multiple models for domain transfer. We then propose an implementation of the local weight assignments that maps the structures of a model onto the structures of the test domain and then weights each model locally according to its consistency with the neighborhood structure around the test example. Experimental results on text classification, spam filtering and intrusion detection data sets demonstrate significant improvements in classification accuracy gained by the framework. On a transfer learning task of newsgroup message categorization, the proposed locally weighted ensemble framework achieves 97% accuracy when the best single model correctly predicts only 73% of the test examples. In summary, the improvement in accuracy is over 10% and up to 30% across different problems.
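The abstract gives no formulas, so the following is only an illustrative sketch of what per-example model weighting could look like, not the authors' implementation. It assumes that the test set has already been clustered by some external method, that a model's local consistency at a test example can be scored as the Jaccard overlap between "examples given the same predicted label" and "examples in the same cluster", and that the final prediction is a weighted average of class probabilities. The names local_weights and locally_weighted_ensemble, and the toy data, are hypothetical.

```python
import numpy as np

def local_weights(pred_labels, cluster_ids):
    """Per-example consistency score for one model (assumed Jaccard overlap).

    For each test example, compare two neighborhoods: the examples the model
    assigns the same label, and the examples that share its cluster.
    """
    same_label = pred_labels[:, None] == pred_labels[None, :]    # (n, n)
    same_cluster = cluster_ids[:, None] == cluster_ids[None, :]  # (n, n)
    inter = (same_label & same_cluster).sum(axis=1)
    union = (same_label | same_cluster).sum(axis=1)
    # union >= 1 because every example is its own neighbor in both views
    return inter / union

def locally_weighted_ensemble(probas, cluster_ids):
    """Combine k models' class probabilities with per-example weights.

    probas: list of k arrays of shape (n, c), one per base model.
    Returns an (n, c) array of combined class probabilities.
    """
    stacked = np.stack(probas)                    # (k, n, c)
    labels = stacked.argmax(axis=2)               # (k, n)
    w = np.stack([local_weights(l, cluster_ids) for l in labels])  # (k, n)
    w = w / w.sum(axis=0, keepdims=True)          # normalize over models
    return (w[:, :, None] * stacked).sum(axis=0)  # (n, c)

# Toy data: six test examples in two clusters (cluster ids are assumed given).
clusters = np.array([0, 0, 0, 1, 1, 1])
# Model A changes its prediction at the cluster boundary.
model_a = np.array([[0.8, 0.2]] * 3 + [[0.2, 0.8]] * 3)
# Model B confidently predicts class 0 everywhere, ignoring the clusters.
model_b = np.array([[0.9, 0.1]] * 6)

combined = locally_weighted_ensemble([model_a, model_b], clusters)
uniform = (model_a + model_b) / 2

print(combined.argmax(axis=1).tolist())  # [0, 0, 0, 1, 1, 1]
print(uniform.argmax(axis=1).tolist())   # [0, 0, 0, 0, 0, 0]
```

In this toy case, plain averaging lets the overconfident model B dominate, while the locally weighted combination favors model A because its predictions agree with the cluster structure of the test set.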
CITED BY 3
|
Sihong Xie, Wei Fan, Jing Peng, Olivier Verscheure, Jiangtao Ren. Latent space domain transfer between high dimensional overlapping distributions. Proceedings of the 18th international conference on World wide web, April 20-24, 2009, Madrid, Spain
|
Erheng Zhong, Wei Fan, Jing Peng, Kun Zhang, Jiangtao Ren, Deepak Turaga, Olivier Verscheure. Cross domain distribution adaptation via kernel mapping. Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, June 28-July 01, 2009, Paris, France
|