Efficient clustering of high-dimensional data sets with application to reference matching
Full text: PDF (274 KB)
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Boston, Massachusetts, United States
Pages: 169 - 178  
Year of Publication: 2000
ISBN:1-58113-233-6
Authors
Andrew McCallum  WhizBang! Labs - Research, 4616 Henry Street, Pittsburgh, PA and School of Computer Science, Carnegie Mellon University, Pittsburgh, PA
Kamal Nigam  School of Computer Science, Carnegie Mellon University, Pittsburgh, PA
Lyle H. Ungar  Computer and Info. Science, University of Pennsylvania, Philadelphia, PA
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
AAAI: American Association for Artificial Intelligence
SIGART: ACM Special Interest Group on Artificial Intelligence
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 30,   Downloads (12 Months): 212,   Citation Count: 82
DOI: http://doi.acm.org/10.1145/347090.347123

REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
H. Akaike. On entropy maximization principle. Applications of Statistics, pages 27-41, 1977.
 
2
M. R. Anderberg. Cluster Analysis for Applications. Academic Press, 1973.
 
3
P. S. Bradley, U. Fayyad, and C. Reina. Scaling clustering algorithms to large databases. In Proc. 4th International Conf. on Knowledge Discovery and Data Mining (KDD-98). AAAI Press, August 1998.
 
4
I. P. Fellegi and A. B. Sunter. A theory for record linkage. Journal of the American Statistical Association, 64:1183-1210, 1969.
5
6
7
 
8
H. Hirsh. Integrating multiple sources of information in text classification using WHIRL. In Snowbird Learning Conference, April 2000.
 
9
J. Hylton. Identifying and merging related bibliographic records. MIT LCS Masters Thesis, 1996.
 
10
B. Kilss and W. Alvey, editors. Record Linkage Techniques-1985, 1985. Statistics of Income Division, Internal Revenue Service Publication 1299-2-96. Available from http://www.fcsm.gov/.
 
11
 
12
A. K. McCallum. Bow: A toolkit for statistical language modeling, text retrieval, classification and clustering. http://www.cs.cmu.edu/~mccallum/bow, 1996.
 
13
A. Monge and C. Elkan. The field-matching problem: algorithm and applications. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, August 1996.
 
14
A. Monge and C. Elkan. An efficient domain-independent algorithm for detecting approximately duplicate database records. In The proceedings of the SIGMOD 1997 workshop on data mining and knowledge discovery, May 1997.
 
15
 
16
H. B. Newcombe, J. M. Kennedy, S. J. Axford, and A. P. James. Automatic linkage of vital records. Science, 130:954-959, 1959.
 
17
S. Omohundro. Five balltree construction algorithms. Technical report 89-063, International Computer Science Institute, Berkeley, California, 1989.
 
18
K. Rose. Deterministic annealing for clustering, compression, classification, regression, and related optimization problems. Proceedings of the IEEE, 86(11):2210-2239, 1998.
 
19
 
20
M. Sankaran, S. Suresh, M. Wong, and D. Nesamoney. Method for incremental aggregation of dynamically increasing database data sets. U.S. Patent 5,794,246, 1998.
 
21
D. Sankoff and J. B. Kruskal. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, 1983.
 
22
J. W. Tukey and J. O. Pedersen. Method and apparatus for information access employing overlapping clusters. U.S. Patent 5,787,422, 1998.
23

CITED BY  83
