Associative text categorization exploiting negated words
Source: Symposium on Applied Computing
Proceedings of the 2006 ACM Symposium on Applied Computing
Dijon, France
SESSION: Data mining (DM)
Pages: 530 - 535  
Year of Publication: 2006
ISBN:1-59593-108-2
Authors
Elena Baralis  Politecnico di Torino, Corso Duca degli Abruzzi, Torino, Italy
Paolo Garza  Politecnico di Torino, Corso Duca degli Abruzzi, Torino, Italy
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1141277.1141402

ABSTRACT

Associative classification has recently been applied to text document categorization. However, unlike in the classification of structured data, the quality of the generated classifier is rather low. This effect is mainly due to the poor precision of the generated rules.

To increase the precision of associative classifiers, we propose the use of classification rules that include negated words, i.e., words that the considered document should not contain. Rules take the form "If a document includes words A and B, but not word Z, then it belongs to class C1". Mining classification rules with negated words quickly becomes intractable as the support threshold decreases. We tackle this problem by means of an opportunistic approach, in which negated words are generated only to specialize rules that may wrongly classify training documents. Hence, precision is increased without losing recall.

Experiments on the Reuters corpus show that our classifier based on negated words achieves good precision and recall results, while yielding an easily interpretable model typical of associative classifiers.
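The rule form quoted in the abstract, and the idea of adding a negated word only when a rule misclassifies training documents, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the `Rule` class, the `specialize` helper, the greedy choice of a single negated word, and the toy documents are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: 'if the document has all `required` words and
    none of the `negated` words, then it belongs to class `label`'."""
    required: frozenset
    negated: frozenset
    label: str

    def matches(self, words: set) -> bool:
        # All required words present, and no negated word present.
        return self.required <= words and not (self.negated & words)


def specialize(rule: Rule, training_docs) -> Rule:
    """Assumed greedy step: add one negated word only if the rule wrongly
    covers some training documents, and only a word that appears in no
    correctly covered document (so training recall is not reduced)."""
    covered = [(words, y) for words, y in training_docs if rule.matches(words)]
    right = [words for words, y in covered if y == rule.label]
    wrong = [words for words, y in covered if y != rule.label]
    if not wrong:
        return rule  # nothing misclassified, no negation needed
    candidates = set().union(*wrong) - set().union(*right) - rule.required
    if not candidates:
        return rule
    # Pick the word that excludes the most wrongly covered documents;
    # sorting first makes tie-breaking deterministic.
    best = max(sorted(candidates), key=lambda c: sum(c in w for w in wrong))
    return Rule(rule.required, rule.negated | {best}, rule.label)


# Toy training set (invented for this example): (set of words, class label).
docs = [
    ({"oil", "price", "barrel"}, "crude"),
    ({"oil", "price", "opec"}, "crude"),
    ({"oil", "price", "palm", "vegetable"}, "veg-oil"),
    ({"oil", "price", "palm"}, "veg-oil"),
]

rule = Rule(frozenset({"oil", "price"}), frozenset(), "crude")
specialized = specialize(rule, docs)
print(sorted(specialized.negated))
```

In this toy setting the original rule covers all four documents, two of them wrongly; the specialized rule still covers both "crude" documents while excluding the two that contain the negated word. A real miner would also have to handle support and confidence thresholds, which this sketch ignores.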



