Incremental clustering for dynamic information processing
Source: ACM Transactions on Information Systems (TOIS)
Volume 11, Issue 2 (April 1993)
Pages: 143-164
Year of Publication: 1993
ISSN: 1046-8188
Author: Fazli Can, Miami Univ., Oxford, OH
Publisher: ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/130226.134466

ABSTRACT

Clustering of very large document databases is useful for both searching and browsing. The periodic updating of clusters is required due to the dynamic nature of databases. An algorithm for incremental clustering is introduced. The complexity and cost analysis of the algorithm together with an investigation of its expected behavior are presented. Through empirical testing it is shown that the algorithm achieves cost effectiveness and generates statistically valid clusters that are compatible with those of reclustering. The experimental evidence shows that the algorithm creates an effective and efficient retrieval environment.





REVIEW

Reviewer: Caroline Merriam Eastman

An algorithm that incrementally revises clusters of documents as additions are made to a database is described. It is based on the use of the cover coefficient concept to measure similarities among documents. The terminology used i...
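
As a rough illustration of the cover coefficient concept the review mentions, the sketch below computes a cover coefficient matrix from a small binary document-by-term matrix. The formula used here (each entry c[i][j] is the row-normalized, column-normalized overlap between documents i and j) is the definition commonly given in the cover-coefficient clustering literature, and is an assumption on our part, since this page does not state it. The function name `cover_coefficients` and the sample matrix are hypothetical, and the sketch does not attempt the incremental update step of the article's algorithm.

```python
def cover_coefficients(d):
    """Return the m x m cover coefficient matrix for a document-term matrix d.

    Assumed definition (not stated on this page):
        c[i][j] = (1 / row_sum[i]) * sum_k d[i][k] * d[j][k] / col_sum[k]
    Every document is assumed to contain at least one term.
    """
    m, n = len(d), len(d[0])
    row_sums = [sum(row) for row in d]
    col_sums = [sum(d[i][k] for i in range(m)) for k in range(n)]
    c = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            overlap = sum(
                d[i][k] * d[j][k] / col_sums[k]
                for k in range(n)
                if col_sums[k] > 0  # skip terms that occur in no document
            )
            c[i][j] = overlap / row_sums[i]
    return c


# Hypothetical sample: 3 documents, 3 terms.
docs = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
c = cover_coefficients(docs)

# Under the assumed definition, the diagonal entry c[i][i] indicates how
# distinct document i is, and the sum of the diagonal serves as an estimate
# of the number of clusters.
estimated_clusters = sum(c[i][i] for i in range(len(docs)))
```

Under this definition each row of the matrix sums to 1, so c[i][j] can be read as the share of document i that is covered by document j. The third sample document shares no terms with the others, so it covers only itself.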