ABSTRACT
Relevance is a central and much-debated concept in information retrieval (IR). In recent years, the rapid growth in the number of digital documents has highlighted the need for novel approaches and more efficient techniques that improve the accuracy of IR systems by taking real users' information needs into account. In this article we propose a novel metric to measure the semantic relatedness between words. Our approach is based on ontologies, represented using a general knowledge base, from which a semantic network is built dynamically. This network is based on linguistic properties, and it is combined with our metric to produce a measure of semantic relatedness. In this way we obtain an efficient strategy for ranking digital documents from the Internet according to the user's domain of interest. The proposed methods, metrics, and techniques are implemented in a system for information retrieval on the Web. Experiments are performed on a test set built using a directory service that holds information about the analyzed documents. The results show an improvement over other similar systems.