ABSTRACT
A new means of evaluating the cluster hypothesis is introduced, and the results of such an evaluation are presented for four collections. Also presented are the results of retrieval experiments comparing three strategies: a sequential search, a cluster-based search, and a search of the clustered collection in which individual documents are scored against the query. These results indicate that while the absolute performance of a search on a particular collection depends on the pairwise similarity of the relevant documents, the relative effectiveness of clustered retrieval versus sequential retrieval is independent of this factor. However, retrieving entire clusters in response to a query usually performs worse than retrieving individual documents from clusters.
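The three search strategies compared in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the cosine similarity measure, the centroid cluster representative, the toy vectors, and all function names (`sequential_search`, `cluster_search`, `within_cluster_search`) are assumptions chosen only to make the distinction concrete.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Component-wise mean, used here as a hypothetical cluster representative."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def rank_clusters(query, docs, clusters):
    """Order clusters by similarity of their centroid to the query."""
    return sorted(
        clusters,
        key=lambda c: cosine(query, centroid([docs[d] for d in c])),
        reverse=True,
    )

def sequential_search(query, docs, k):
    """Score every document against the query and return the top k."""
    ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
    return ranked[:k]

def cluster_search(query, docs, clusters):
    """Return the entire best-matching cluster, with no per-document scoring."""
    return sorted(rank_clusters(query, docs, clusters)[0])

def within_cluster_search(query, docs, clusters, k, n_clusters=1):
    """Pick the best clusters, then score their documents individually."""
    candidates = [d for c in rank_clusters(query, docs, clusters)[:n_clusters] for d in c]
    candidates.sort(key=lambda d: cosine(query, docs[d]), reverse=True)
    return candidates[:k]

# Toy data, invented for illustration only.
docs = {
    "d1": [1.0, 0.0, 0.0],
    "d2": [0.9, 0.1, 0.0],
    "d3": [0.0, 1.0, 0.0],
    "d4": [0.0, 0.9, 0.1],
    "d5": [0.5, 0.5, 0.0],
}
clusters = [["d1", "d2"], ["d3", "d4", "d5"]]
query = [1.0, 0.2, 0.0]

print(sequential_search(query, docs, 3))
print(cluster_search(query, docs, clusters))
print(within_cluster_search(query, docs, clusters, 1))
```

In this sketch the cluster-based search returns a whole cluster regardless of how well each member matches the query, whereas the third strategy uses the clusters only to narrow the candidate set before ranking individual documents.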
CITED BY 27
S. T. Dumais, G. W. Furnas, T. K. Landauer, S. Deerwester, R. Harshman, Using latent semantic analysis to improve access to textual information, Proceedings of the SIGCHI conference on Human factors in computing systems, p.281-285, May 15-19, 1988, Washington, D.C., United States
Javed Aslam, Katya Pelekhov, Daniela Rus, Using star clusters for filtering, Proceedings of the ninth international conference on Information and knowledge management, p.306-313, November 06-11, 2000, McLean, Virginia, United States
Javed Aslam, Katya Pelekhov, Daniela Rus, Static and dynamic information organization with star clusters, Proceedings of the seventh international conference on Information and knowledge management, p.208-217, November 02-07, 1998, Bethesda, Maryland, United States
Javed Aslam, Katya Pelekhov, Daniela Rus, A practical clustering algorithm for static and dynamic information organization, Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, p.51-60, January 17-19, 1999, Baltimore, Maryland, United States