ABSTRACT
In natural language processing, conflation is the process of merging or lumping together nonidentical words which refer to the same principal concept. This can relate both to words which are entirely different in form (e.g., "group" and "collection"), and to words which share some common root (e.g., "group", "grouping", "subgroups"). In the former case the words can only be mapped by referring to a dictionary or thesaurus, but in the latter case use can be made of the orthographic similarities between the forms. One popular approach is to remove affixes from the input words, thus reducing them to a stem; if this could be done correctly, all the variant forms of a word would be converted to the same standard form. Since the process is aimed at mapping for retrieval purposes, the stem need not be a linguistically correct lemma or root (see also Frakes 1982).
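The affix-removal idea described above can be illustrated with a minimal sketch. This is not the algorithm of any work in the reference list: the affix lists, the minimum stem length, and the names `toy_stem` and `conflate` are all assumptions invented for this illustration.

```python
# Toy affix-stripping conflation. The affix lists and MIN_STEM are
# hypothetical values chosen for illustration, not taken from any
# published stemmer.
SUFFIXES = ("ings", "ing", "s")   # ordered longest first
PREFIXES = ("sub",)
MIN_STEM = 3                      # never strip below this many letters


def toy_stem(word):
    """Strip at most one listed prefix and one listed suffix."""
    stem = word.lower()
    for prefix in PREFIXES:
        if stem.startswith(prefix) and len(stem) - len(prefix) >= MIN_STEM:
            stem = stem[len(prefix):]
            break
    for suffix in SUFFIXES:
        if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM:
            stem = stem[:-len(suffix)]
            break
    return stem


def conflate(words):
    """Group words into classes keyed by their shared stem."""
    classes = {}
    for word in words:
        classes.setdefault(toy_stem(word), []).append(word)
    return classes


print(conflate(["group", "grouping", "subgroups", "collection"]))
# {'group': ['group', 'grouping', 'subgroups'], 'collection': ['collection']}
```

Under these assumed rules, "group", "grouping" and "subgroups" fall into one class, while "collection" stays separate because linking it to "group" would need a dictionary or thesaurus. The same rules also reduce "news" to "new", which shows why correct conflation by affix removal alone cannot be taken for granted.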
REFERENCES
1. Dawson, J. L. 1974: "Suffix removal and word conflation", ALLC Bulletin, 2(3), 33-46 (1974).
2. Frakes, W. B. 1982: Term Conflation for Information Retrieval, Ph.D. dissertation, Syracuse University, August 1982.
3. Lennon, M., Pierce, D. S., Tarry, B. D. and Willett, P. 1981: "An evaluation of some conflation algorithms for information retrieval", Journal of Information Science, 3, 177-183 (1981).
4. Lovins, J. B. 1968: "Development of a stemming algorithm", Mechanical Translation and Computational Linguistics, 11, 22-31 (1968).
5. Paice, C. D. 1977: Information Retrieval and the Computer, London: MacDonald & Jane's, 1977; chapter 4.
6. Porter, M. F. 1980: "An algorithm for suffix stripping", Program, 14, 130-137 (1980).
7. Ulmschneider, J. and Doszkocs, T. 1983: "A practical stemming algorithm for online search assistance", Online Review, 7(4), (1983).
CITED BY 23
James C. French, Allison L. Powell, Walter R. Creighton III, Efficient searching in distributed digital libraries, Proceedings of the third ACM conference on Digital libraries, p. 283-284, June 23-26, 1998, Pittsburgh, Pennsylvania, United States
Andrew Olney, Max Louwerse, Eric Matthews, Johanna Marineau, Heather Hite-Mitchell, Arthur Graesser, Utterance classification in AutoTutor, Proceedings of the HLT-NAACL 03 workshop on Building educational applications using natural language processing, p. 1-8, May 31, 2003
Hassan H. Malik, John R. Kender, Clustering web images using association rules, interestingness measures, and hypergraph partitions, Proceedings of the 6th international conference on Web engineering, July 11-14, 2006, Palo Alto, California, USA