ABSTRACT
Term clustering and syntactic phrase formation are methods for transforming natural language text. Both have had only mixed success as strategies for improving the quality of text representations for document retrieval. Since the strengths of these methods are complementary, we have explored combining them to produce superior representations. In this paper we discuss our implementation of a syntactic phrase generator, as well as our preliminary experiments with producing phrase clusters. These experiments show small improvements in retrieval effectiveness resulting from the use of phrase clusters, but it is clear that corpora much larger than standard information retrieval test collections will be required to thoroughly evaluate the use of this technique.