ABSTRACT
This paper describes a new approach for dealing with the vocabulary problem in human-computer interaction. Most approaches to retrieving textual materials depend on a lexical match between words in users' requests and those in or assigned to database objects. Because of the tremendous diversity in the words people use to describe the same object, lexical matching methods are necessarily incomplete and imprecise [5]. The latent semantic indexing approach tries to overcome these problems by automatically organizing text objects into a semantic structure more appropriate for matching user requests. This is done by taking advantage of implicit higher-order structure in the association of terms with text objects. The particular technique used is singular-value decomposition, in which a large term-by-text-object matrix is decomposed into a set of about 50 to 150 orthogonal factors from which the original matrix can be approximated by linear combination. Terms and objects are represented by 50- to 150-dimensional vectors and matched against user queries in this “semantic” space. Initial tests find this completely automatic method widely applicable and a promising way to improve users' access to many kinds of textual materials, or to objects and services for which textual descriptions are available.
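The decomposition and matching steps described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation: the five-term vocabulary, the counts, the helper names (`lsi_fit`, `fold_in_query`, `cosine_to_objects`), and the choice of k = 2 are all assumptions made for the toy example. The query is projected with the commonly used fold-in formula (query vector times the term factors, divided by the singular values), which is also an assumption about how matching might be done.

```python
import numpy as np

# Toy term-by-text-object count matrix (rows: terms, columns: text objects).
# Vocabulary and counts are invented for illustration.
terms = ["car", "automobile", "engine", "flower", "petal"]
X = np.array([
    [1, 0, 1, 0, 0],  # car
    [0, 1, 1, 0, 0],  # automobile
    [1, 1, 1, 0, 0],  # engine
    [0, 0, 0, 1, 1],  # flower
    [0, 0, 0, 1, 1],  # petal
], dtype=float)


def lsi_fit(X, k):
    """Truncated SVD: keep the k largest singular triplets of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Returns term vectors, singular values, and text-object vectors.
    return U[:, :k], s[:k], Vt[:k, :].T


def fold_in_query(q, U_k, s_k):
    """Project a term-count query vector into the k-dimensional space."""
    return (q @ U_k) / s_k


def cosine_to_objects(q_hat, V_k):
    """Cosine similarity between the projected query and every text object."""
    norms = np.linalg.norm(V_k, axis=1) * np.linalg.norm(q_hat)
    return (V_k @ q_hat) / norms


# k = 2 only because the toy matrix is tiny; the abstract reports 50 to 150 factors.
U_k, s_k, V_k = lsi_fit(X, k=2)

query = np.zeros(len(terms))
query[terms.index("car")] = 1.0  # a one-word query: "car"

sims = cosine_to_objects(fold_in_query(query, U_k, s_k), V_k)
for j, sim in enumerate(sims):
    print(f"object {j}: cosine = {sim:.3f}")
```

In this toy setup, text object 1 contains only "automobile" and "engine", so a lexical match on "car" would miss it, yet in the reduced space it should score close to the objects that do contain "car", while the two "flower"/"petal" objects should score near zero.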
REFERENCES
[4] Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., and Harshman, R.A. Indexing by latent semantic analysis. Journal of the American Society for Information Science, in press.
[5] Furnas, G.W., Landauer, T.K., Gomez, L.M., and Dumais, S.T. Statistical semantics: Analysis of the potential performance of key-word information systems. Bell System Technical Journal, 1983, 62(6), 1753-1806.
[11] Orwick, P., Jaynes, J.T., Barstow, T.R., and Bohn, L.S. DOMAIN/DELPHI: retrieving documents online. In Proceedings of the SIGCHI conference on Human factors in computing systems. Boston, Massachusetts, United States, April 13-17, 1986, 114-121.
[13] Sparck Jones, K. Automatic keyword classification for information retrieval. Butterworths, 1971.
[14] Streeter, L.A. and Lochbaum, K.E. An expert/expert-locating system based on automatic representation of semantic structure. In Proceedings of IEEE Conference on AI Applications. San Diego, CA, March 1988.
[16] Weyer, S. The design of a dynamic book for information search. International Journal of Man Machine Studies, 1982, 17, 87-107.
CITED BY 52
Christos H. Papadimitriou, Hisao Tamaki, Prabhakar Raghavan, Santosh Vempala, Latent semantic indexing: a probabilistic analysis, Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.159-168, June 01-04, 1998, Seattle, Washington, United States
Dennis E. Egan, Joel R. Remde, Louis M. Gomez, Thomas K. Landauer, Jennifer Eberhardt, Carol C. Lochbaum, Formative design evaluation of superbook, ACM Transactions on Information Systems (TOIS), v.7 n.1, p.30-57, Jan. 1989
Robert B. Allen, Pascal Obry, Michael Littman, An interface for navigating clustered document sets returned by queries, Proceedings of the conference on Organizational computing systems, p.166-171, November 01-04, 1993, Milpitas, California, United States
R. Dolin, J. Pierre, M. Butler, R. Avedon, Practical evaluation of IR within automated classification systems, Proceedings of the eighth international conference on Information and knowledge management, p.322-329, November 02-06, 1999, Kansas City, Missouri, United States
D. Jiménez, E. Ferretti, V. Vidal, P. Rosso, C. F. Enguix, The influence of semantics in IR using LSI and K-means clustering techniques, Proceedings of the 1st international symposium on Information and communication technologies, September 24-26, 2003, Dublin, Ireland
Anirban Dasgupta, Ravi Kumar, Prabhakar Raghavan, Andrew Tomkins, Variable latent semantic indexing, Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 21-24, 2005, Chicago, Illinois, USA
Wensi Xi, Edward A. Fox, Weiguo Fan, Benyu Zhang, Zheng Chen, Jun Yan, Dong Zhuang, SimFusion: measuring similarity using unified relationship matrix, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
Suresh K. Bhavnani, Christopher K. Bichakjian, Timothy M. Johnson, Roderick J. Little, Frederick A. Peck, Jennifer L. Schwartz, Victor J. Strecher, Strategy hubs: Domain portals to help find comprehensive information, Journal of the American Society for Information Science and Technology, v.57 n.1, p.4-24, January 2006
Shlomo Argamon, Navot Akiva, Amihood Amir, Oren Kapah, Efficient unsupervised recursive word segmentation using minimum description length, Proceedings of the 20th international conference on Computational Linguistics, p.1058-es, August 23-27, 2004, Geneva, Switzerland
Gerhard Fischer, Stefanie Lindstaedt, Jonathan Ostwald, Kurt Schneider, Jay Smith, Informing system design through organizational learning, Proceedings of the 1996 international conference on Learning sciences, p.52-59, July 25-27, 1996, Evanston, Illinois
Peter A. Chew, Brett W. Bader, Tamara G. Kolda, Ahmed Abdelali, Cross-language information retrieval using PARAFAC2, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA
Yanan Liu, Fei Wu, Yueting Zhuang, Jun Xiao, Active post-refined multimodality video semantic concept detection with tensor representation, Proceeding of the 16th ACM international conference on Multimedia, October 26-31, 2008, Vancouver, British Columbia, Canada