Effective label acquisition for collective classification
Source
International Conference on Knowledge Discovery and Data Mining
Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Las Vegas, Nevada, USA
SESSION: Research papers
Pages 43-51
Year of Publication: 2008
ISBN: 978-1-60558-193-4
Authors
Mustafa Bilgic  University of Maryland, College Park, MD, USA
Lise Getoor  University of Maryland, College Park, MD, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1401890.1401901

ABSTRACT

Information diffusion, viral marketing, and collective classification all attempt to model and exploit the relationships in a network to make inferences about the labels of nodes. A variety of techniques have been introduced, and methods that combine attribute information and neighboring label information have been shown to be effective for collective labeling of the nodes in a network. However, in part because of the correlation between node labels that the techniques exploit, it is easy to find cases in which, once a misclassification is made, incorrect information propagates throughout the network. This problem can be mitigated if the system is allowed to judiciously acquire the labels for a small number of nodes. Unfortunately, under relatively general assumptions, determining the optimal set of labels to acquire is intractable. Here we propose an acquisition method that learns the cases when a given collective classification algorithm makes mistakes, and suggests acquisitions to correct those mistakes. We empirically show on both real and synthetic datasets that this method significantly outperforms a greedy approximate inference approach, a viral marketing approach, and approaches based on network structural measures such as node degree and network clustering. In addition to significantly improving accuracy with just a small amount of labeled data, our method is tractable on large networks.
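
The error propagation described in the abstract, and the effect of acquiring a few labels, can be illustrated with a small sketch. This is not the acquisition method proposed in the paper: it uses a simple neighbor majority-vote classifier and the node-degree heuristic that the abstract lists as a baseline. The toy graph, the function names (build_adjacency, collective_classify, acquire_by_degree, accuracy), and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def build_adjacency(edges):
    """Build an undirected adjacency map {node: [neighbors]} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def collective_classify(adj, known, max_iters=20):
    """Toy collective classifier (illustrative assumption, not the paper's).

    Each unlabeled node repeatedly takes the majority label among its
    currently labeled neighbors. Nodes in `known` are never changed.
    """
    labels = dict(known)
    for _ in range(max_iters):
        changed = False
        for node in sorted(adj):
            if node in known:
                continue
            votes = Counter(labels[n] for n in adj[node] if n in labels)
            if not votes:
                continue
            # Deterministic tie-break: highest count, then smallest label.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if labels.get(node) != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def acquire_by_degree(adj, known, oracle, budget):
    """Degree baseline: query the true labels of the `budget` unlabeled
    nodes with the most neighbors (ties broken by node id)."""
    candidates = sorted(
        (n for n in adj if n not in known),
        key=lambda n: (-len(adj[n]), n),
    )
    acquired = dict(known)
    for n in candidates[:budget]:
        acquired[n] = oracle[n]
    return acquired


def accuracy(pred, truth):
    """Fraction of nodes whose predicted label matches the true label."""
    return sum(pred.get(n) == truth[n] for n in truth) / len(truth)


# Hypothetical toy network: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
truth = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
adj = build_adjacency(edges)

seed = {0: "a"}  # only one label is known at the start
before = collective_classify(adj, seed)
after = collective_classify(adj, acquire_by_degree(adj, seed, truth, budget=2))

print("accuracy without acquisition:", accuracy(before, truth))
print("accuracy with 2 acquired labels:", accuracy(after, truth))
```

With a single seed label, the label "a" spreads across the bridge edge and mislabels the whole second triangle. Acquiring the labels of the two highest-degree nodes stops that spread in this toy case. With a budget of one, the degree heuristic picks node 2, whose label was already predicted correctly, so no error is fixed. That small case hints at why purely structural measures can spend acquisitions poorly.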


