ABSTRACT
Classification is a well-established operation in text mining. Given a set of labels A and a set D_A of training documents tagged with these labels, a classifier learns to assign labels to unlabeled test documents. Suppose we also have a different set of labels B, together with a set of documents D_B marked with labels from B. If A and B have some semantic overlap, can the availability of D_B help us build a better classifier for A, and vice versa? We answer this question in the affirmative by proposing cross-training: a new approach to semi-supervised learning in the presence of multiple label sets. We give distributional and discriminative algorithms for cross-training and show, through extensive experiments, that cross-training can discover and exploit probabilistic relations between two taxonomies for more accurate classification.
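As a rough illustration of the idea, the sketch below shows one simple way documents labeled under B could inform a classifier for A: a classifier trained on D_B predicts a B label for each document, and that prediction is appended as an extra feature before the A classifier is trained. This is an assumption made for illustration, not necessarily either of the algorithms the paper proposes; the `NaiveBayes` class, the `cross_train` function, the `B=` feature prefix, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)          # documents per class
        self.counts = defaultdict(Counter)    # token counts per class
        for tokens, label in zip(docs, labels):
            self.counts[label].update(tokens)
        self.vocab = {t for c in self.counts.values() for t in c}
        return self

    def predict(self, tokens):
        def score(c):
            total = sum(self.counts[c].values()) + len(self.vocab)
            # The unnormalized prior is enough: the missing constant is the
            # same for every class and does not change the argmax.
            s = math.log(self.prior[c])
            for t in tokens:
                if t in self.vocab:           # ignore tokens never seen in training
                    s += math.log((self.counts[c][t] + 1) / total)
            return s
        return max(self.classes, key=score)


def cross_train(docs_a, labels_a, docs_b, labels_b):
    """Train a classifier for label set A that also sees a predicted B label."""
    clf_b = NaiveBayes().fit(docs_b, labels_b)

    def augment(tokens):
        # Hypothetical feature encoding: one extra token naming the B label.
        return tokens + ["B=" + clf_b.predict(tokens)]

    clf_a = NaiveBayes().fit([augment(d) for d in docs_a], labels_a)

    def predict_a(tokens):
        return clf_a.predict(augment(tokens))

    return predict_a


# Toy data (made up): A and B are different label sets with semantic overlap.
docs_b = [["goal", "match"], ["match", "team"],
          ["vote", "election"], ["election", "party"]]
labels_b = ["Sports", "Sports", "Politics", "Politics"]
docs_a = [["goal"], ["vote"]]
labels_a = ["Athletics", "Government"]

predict_a = cross_train(docs_a, labels_a, docs_b, labels_b)
```

In this toy setup the word "party" never occurs in D_A, so a classifier trained on D_A alone has no evidence for it; the predicted B label is what links it to an A label.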