Evaluating the novelty of text-mined rules using lexical knowledge
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
San Francisco, California
Pages: 233 - 238  
Year of Publication: 2001
ISBN:1-58113-391-X
Authors
Sugato Basu  University of Texas, Austin, TX
Raymond J. Mooney  University of Texas, Austin, TX
Krupakar V. Pasupuleti  University of Texas, Austin, TX
Joydeep Ghosh  University of Texas, Austin, TX
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
AAAI : American Association for Artificial Intelligence
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 1,   Downloads (12 Months): 37,   Citation Count: 5
DOI: http://doi.acm.org/10.1145/502512.502544

ABSTRACT

In this paper, we present a new method of estimating the novelty of rules discovered by data-mining methods using WordNet, a lexical knowledge base of English words. We assess the novelty of a rule by the average semantic distance in a knowledge hierarchy between the words in the antecedent and the words in the consequent of the rule: the greater the average distance, the higher the novelty of the rule. The novelty of rules extracted by the DiscoTEX text-mining system from Amazon.com book descriptions was evaluated both by human subjects and by our algorithm. By computing correlation coefficients between pairs of human ratings and between human and automatic ratings, we found that the automatic scoring of rules based on our novelty measure correlates with human judgments about as well as human judgments correlate with one another.
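
The idea of scoring a rule by the average semantic distance between its antecedent and consequent words can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the exact distance measure, so the sketch assumes a plain edge-count distance through the lowest common ancestor, and it replaces WordNet with a small hypothetical IS-A hierarchy (`PARENT`). The function names `path_to_root`, `distance`, and `novelty` are likewise illustrative.

```python
from itertools import product

# Hypothetical toy IS-A hierarchy (child -> parent), standing in for WordNet.
PARENT = {
    "entity": None,
    "object": "entity",
    "abstraction": "entity",
    "animal": "object",
    "artifact": "object",
    "dog": "animal",
    "cat": "animal",
    "book": "artifact",
    "science": "abstraction",
    "physics": "science",
}


def path_to_root(word):
    """Return the chain of nodes from `word` up to the root, inclusive."""
    path = []
    while word is not None:
        path.append(word)
        word = PARENT[word]
    return path


def distance(a, b):
    """Count the edges between a and b via their lowest common ancestor.

    Assumed measure: the source only says "semantic distance in a
    knowledge hierarchy"; edge counting is one simple choice.
    """
    steps_from_a = {node: i for i, node in enumerate(path_to_root(a))}
    for steps_from_b, node in enumerate(path_to_root(b)):
        if node in steps_from_a:
            return steps_from_a[node] + steps_from_b
    raise ValueError(f"no common ancestor for {a!r} and {b!r}")


def novelty(antecedent, consequent):
    """Average distance over all (antecedent word, consequent word) pairs."""
    pairs = list(product(antecedent, consequent))
    return sum(distance(a, c) for a, c in pairs) / len(pairs)


if __name__ == "__main__":
    # A rule relating closely related words scores lower than a rule
    # relating words from distant parts of the hierarchy.
    print(novelty(["dog"], ["cat"]))
    print(novelty(["dog"], ["physics"]))
```

Under these assumptions, a rule whose antecedent and consequent sit in the same subtree (such as `dog` and `cat`) receives a lower score than one that bridges distant branches (such as `dog` and `physics`), which matches the abstract's claim that a greater average distance means higher novelty.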



