Connections between the lines: augmenting social networks with text
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Paris, France
Session: Research track papers
Pages 169-178
Year of Publication: 2009
ISBN: 978-1-60558-495-9
Authors
Jonathan Chang, Princeton University, Princeton, NJ, USA
Jordan Boyd-Graber, Princeton University, Princeton, NJ, USA
David M. Blei, Princeton University, Princeton, NJ, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557019.1557044

ABSTRACT

Network data is ubiquitous, encoding collections of relationships between entities such as people, places, genes, or corporations. While many resources for networks of interesting entities are emerging, most of these can only annotate connections in a limited fashion. Although relationships between entities are rich, it is impractical to manually devise complete characterizations of these relationships for every pair of entities in large, real-world corpora.

In this paper we present a novel probabilistic topic model that analyzes text corpora and infers descriptions of their entities and of the relationships between those entities. We develop variational methods for performing approximate inference on our model and demonstrate that the model can be practically deployed on large corpora such as Wikipedia. We show qualitatively and quantitatively that our model can construct and annotate graphs of relationships and make useful predictions.


