Shallow NLP techniques for internet search

Source
Proceedings of the 29th Australasian Computer Science Conference - Volume 48
ACM International Conference Proceeding Series; Vol. 171
Hobart, Australia
Pages: 167-176
Year of Publication: 2006
ISSN: 1445-1336, ISBN: 1-920682-30-9

Authors
Alex Penev
National ICT Australia and School of Computer Science and Engineering, University of New South Wales, Sydney, NSW, Australia
Raymond Wong
National ICT Australia and School of Computer Science and Engineering, University of New South Wales, Sydney, NSW, Australia

Publisher
Australian Computer Society, Inc.
Darlinghurst, Australia

ABSTRACT
Information Retrieval (IR) is a major component in many of our daily activities, with perhaps its most prominent role manifested in search engines. Today's most advanced engines use the keyword-based ("bag of words") paradigm, which concedes some inherent disadvantages. We believe that natural language (NL) is a more user-oriented, context-preservative and intuitive mechanism for web search.

In this paper, we explore shallow NLP techniques to support a range of NL queries over an existing keyword-based engine. We present JASE, a web application enveloping the Google search engine, which performs web searches by decomposing input NL queries and generating new queries that are more suitable for the search engine. By using some of Google's syntactic operators and filters, it creates "clever" queries to improve precision.

A preliminary evaluation was conducted to test JASE's accuracy, and results have been encouraging. We conclude that the NL model has potential to not only rival the keyword-based paradigm, but substantially surpass it.
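
The abstract does not spell out how JASE decomposes a query, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a tiny hand-written stopword list and a simple heuristic: function words are dropped, and runs of capitalized words are joined into a quoted phrase, which Google treats as an exact-phrase match. The names `STOPWORDS`, `to_term` and `rewrite_query` are hypothetical.

```python
import re

# Hypothetical, deliberately small stopword list (an assumption for this sketch).
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "of", "in", "on", "to",
    "for", "do", "does", "did", "what", "who", "where", "when", "which", "how",
}


def to_term(phrase):
    """Quote multi-word phrases so the engine matches them exactly."""
    text = " ".join(phrase)
    return f'"{text}"' if len(phrase) > 1 else text


def rewrite_query(question):
    """Rewrite a natural-language question as a keyword query string."""
    tokens = re.findall(r"[A-Za-z0-9]+", question)
    terms = []
    phrase = []  # current run of capitalized, non-stopword tokens
    for tok in tokens:
        is_stop = tok.lower() in STOPWORDS
        if tok[0].isupper() and not is_stop:
            # Extend the current candidate proper-noun phrase.
            phrase.append(tok)
            continue
        if phrase:
            # A non-capitalized token ends the phrase; emit it.
            terms.append(to_term(phrase))
            phrase = []
        if not is_stop:
            terms.append(tok.lower())
    if phrase:
        terms.append(to_term(phrase))
    return " ".join(terms)


if __name__ == "__main__":
    print(rewrite_query("Who is the prime minister of New Zealand?"))
```

Run as a script, this prints `prime minister "New Zealand"`. A real system along the lines the abstract describes would presumably rely on proper shallow parsing (for example part-of-speech tagging) rather than capitalization alone.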