Link mining: a new data mining challenge
Source: ACM SIGKDD Explorations Newsletter
Volume 5, Issue 1 (July 2003)
Column: Position papers on MRDM
Pages: 84-89
Year of Publication: 2003
ISSN:1931-0145
Author
Lise Getoor  University of Maryland, College Park, MD
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 24,   Downloads (12 Months): 262,   Citation Count: 21
DOI: http://doi.acm.org/10.1145/959242.959253

ABSTRACT

A key challenge for data mining is tackling the problem of mining richly structured datasets, where the objects are linked in some way. Links among the objects may demonstrate certain patterns, which can be helpful for many data mining tasks and are usually hard to capture with traditional statistical models. Recently there has been a surge of interest in this area, fueled largely by interest in web and hypertext mining, but also by interest in mining social networks, security and law enforcement data, bibliographic citations and epidemiological records.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
M. Bilenko and R. J. Mooney. On evaluation and training-set construction for duplicate detection. under review.
 
3
S. Chakrabarti. Mining the Web. Morgan Kaufmann, 2002.
4
5
 
6
R. Chellappa and A. Jain. Markov random fields: theory and applications. Academic Press, Boston, 1993.
 
7
D. Cohn and T. Hofmann. The missing link - a probabilistic model of document content and hypertext connectivity. In Neural Information Processing Systems 13, 2001.
 
8
 
9
 
10
L. Dehaspe, H. Toivonen, and R. D. King. Finding frequent substructures in chemical compounds. In R. Agrawal, P. Stolorz, and G. Piatetsky-Shapiro, editors, 4th International Conference on Knowledge Discovery and Data Mining, pages 30--36. AAAI Press, 1998.
11
12
 
13
 
14
R. Feldman. Link analysis: Current state of the art. In KDD-02 Tutorial, 2002.
 
15
P. A. Flach and N. Lavrac. The role of feature construction in inductive rule learning. In Proc. of the ICML2000 workshop on Attribute-Value and Relational Learning: crossing the boundaries, 2000.
 
16
L. Getoor, N. Friedman, D. Koller, and A. Pfeffer. Learning probabilistic relational models. In S. Dzeroski and N. Lavrac, editors, Relational Data Mining, pages 307--335. Kluwer, 2001.
 
17
L. Getoor, N. Friedman, D. Koller, and B. Taskar. Learning probabilistic models with link uncertainty. Journal of Machine Learning Research, 2002.
 
18
L. Getoor and D. Jensen. Proc. AAAI-2000 Workshop on Learning Statistical Models from Relational Data. AAAI Press, 2000.
 
19
L. Getoor and D. Jensen. Proc. IJCAI 2003 Workshop on Learning Statistical Models from Relational Data. AAAI Press, 2003.
20
21
 
22
R. Hummel and S. Zucker. On the foundations of relaxation labeling processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 5(3):267--287, 1983.
 
23
 
24
D. Jensen. Statistical challenges to inductive inference in linked data. In Seventh International Workshop on Artificial Intelligence and Statistics, 1999.
 
25
D. Jensen and H. Goldberg. AAAI Fall Symposium on AI and Link Analysis. AAAI Press, 1998.
26
27
 
28
 
29
 
30
 
31
Q. Lu and L. Getoor. Link-based classification. In Proc. of ICML-03, 2003.
 
32
K. Murphy and Y. Weiss. Loopy belief propagation for approximate inference: an empirical study. In Proc. of UAI-99. Morgan Kaufmann, 1999.
 
33
J. Neville and D. Jensen. Iterative classification in relational data. In Proc. AAAI-2000 Workshop on Learning Statistical Models from Relational Data. AAAI Press, 2000.
34
 
35
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford University, 1998.
36
 
37
H. Pasula, B. Marthi, B. Milch, S. Russell, and I. Shpitser. Identity uncertainty and citation matching. In Advances in Neural Information Processing Systems 15 (NIPS2002). MIT Press, 2003.
 
38
A. Popescul, L. Ungar, S. Lawrence, and D. Pennock. Towards structural logistic regression: Combining relational and statistical learning. In KDD Workshop on Multi-Relational Data Mining, 2002.
 
39
 
40
M. Richardson and P. Domingos. The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank. In Advances in Neural Information Processing Systems 14. MIT Press, 2002.
 
41
S. Russell. Identity uncertainty. In Proc. of IFSA-01, Vancouver, 2001.
42
 
43
 
44
B. Taskar, P. Abbeel, and D. Koller. Discriminative probabilistic models for relational data. In Proc. of UAI-02, pages 485--492, Edmonton, Canada, 2002.
 
45
B. Taskar, E. Segal, and D. Koller. Probabilistic classification and clustering in relational data. In Proc. of IJCAI-01, 2001.
 
46
W. E. Winkler. Advanced methods for record linkage. Technical report, Statistical Research Division, U.S. Census Bureau, 1994.
 
47
W. E. Winkler. Methods for record linkage and Bayesian networks. Technical report, Statistical Research Division, U.S. Census Bureau, 1994.
 
48

CITED BY  20