ABSTRACT
This paper presents a new approach to identifying the concepts expressed in a collection of email messages and organizing them into an ontology or taxonomy for browsing. The approach combines techniques from text mining, information retrieval, natural language processing, and machine learning to generate a concept ontology. Nominal N-gram mining is used to identify candidate concepts. WordNet and surface text pattern matching are used to identify relationships among the concepts. A supervised clustering algorithm is then used to further cluster the concepts. The experiments show that the approach is effective.
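To make the relationship-identification step concrete, the following is a minimal sketch of surface text pattern matching for is-a relations between concepts. It is an illustration only: the abstract does not list the patterns used, so the three patterns below, the single-word restriction on terms, and the names `PATTERNS` and `extract_is_a` are all assumptions rather than the paper's method.

```python
import re
from collections import Counter

# Assumption: terms are single lowercase words. Real candidate concepts
# would come from the nominal N-gram mining step and may span several words.
TERM = r"[a-z]+"

# Hypothetical surface patterns; the paper does not specify which ones it uses.
# Each pattern exposes a "hypo" (narrower) and a "hyper" (broader) group.
PATTERNS = [
    re.compile(rf"(?P<hyper>{TERM}) such as (?P<hypo>{TERM})"),
    re.compile(rf"(?P<hypo>{TERM}) and other (?P<hyper>{TERM})"),
    re.compile(rf"(?P<hypo>{TERM}) is a kind of (?P<hyper>{TERM})"),
]


def extract_is_a(messages):
    """Count (narrower, broader) pairs matched by the surface patterns."""
    counts = Counter()
    for text in messages:
        lowered = text.lower()
        for pattern in PATTERNS:
            for match in pattern.finditer(lowered):
                counts[(match.group("hypo"), match.group("hyper"))] += 1
    return counts


if __name__ == "__main__":
    emails = [
        "Please attach documents such as invoices.",
        "Invoices and other documents are due Friday.",
        "A memo is a kind of document.",
    ]
    for (narrow, broad), n in extract_is_a(emails).items():
        print(f"{narrow} is-a {broad} (seen {n} time(s))")
```

In this sketch, pairs that are matched by several patterns or in several messages accumulate a higher count, which could serve as a rough confidence signal before relations are merged with those found through WordNet.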