ABSTRACT
In document retrieval, or in any other pattern-matching environment where stored entities (documents) are compared with each other or with incoming patterns (search requests), it appears that the best indexing (property) space is one where each entity lies as far away from the others as possible. In these circumstances the value of an indexing system may be expressible as a function of the density of the object space; in particular, retrieval performance may correlate inversely with space density. An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents. Typical evaluation results are shown, demonstrating the usefulness of the model.
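The abstract does not state how space density is computed, so the following is only a minimal sketch under stated assumptions: documents are equal-length term-weight vectors, density is taken to be the average pairwise cosine similarity, and a term is kept in the vocabulary when removing it would make the space denser. The function names (`space_density`, `term_scores`, `choose_vocabulary`) are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def space_density(docs):
    """Assumed density measure: mean cosine similarity over all document pairs."""
    n = len(docs)
    if n < 2:
        return 0.0
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += cosine(docs[i], docs[j])
    return total / (n * (n - 1) / 2)


def drop_term(docs, k):
    """Return copies of the document vectors with term index k removed."""
    return [d[:k] + d[k + 1:] for d in docs]


def term_scores(docs):
    """Score each term as (density without the term) - (density with all terms).

    A positive score means the term spreads documents apart, which is the
    property the abstract associates with a good indexing space.
    """
    base = space_density(docs)
    return [space_density(drop_term(docs, k)) - base
            for k in range(len(docs[0]))]


def choose_vocabulary(docs):
    """Keep the term indices whose removal would increase space density."""
    return [k for k, score in enumerate(term_scores(docs)) if score > 0]


# Two toy documents over three terms; term 0 occurs in both documents.
docs = [[1, 1, 0],
        [1, 0, 1]]
print(choose_vocabulary(docs))  # -> [1, 2]
```

In this toy collection the shared term 0 pulls the two documents together, so it receives a negative score and is dropped, while terms 1 and 2 each separate the documents and are retained. The pairwise computation is quadratic in the number of documents; a real collection would likely need a cheaper density estimate.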