A neural model for unsupervised taxonomy enrichment
Source: Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services (iiWAS 2008)
Linz, Austria
Session: iiWAS 2008: Data mining and agents for information integration
Pages: 264-270
Year of Publication: 2008
ISBN: 978-1-60558-349-5
Authors
Emil Şt. Chifu  Technical University of Cluj-Napoca, Cluj-Napoca, Romania
Ioan Alfred Leţia  Technical University of Cluj-Napoca, Cluj-Napoca, Romania
Sponsor
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1497308.1497358

ABSTRACT

The most important prerequisite for the success of Semantic Web research is the construction of complete and reliable domain ontologies. In this paper we describe an unsupervised framework for domain ontology enrichment based on mining domain text corpora. Specifically, we enrich the hierarchical backbone of an existing ontology, i.e. its taxonomy, with new domain-specific concepts. The framework is based on an extended model of hierarchical self-organizing maps. Because it is founded on an unsupervised neural network architecture, the framework can be applied to different languages and domains. Terms extracted by mining a text corpus are represented in a distributional vector space that encodes their contextual content information. The enrichment behaves like a classification of the extracted terms into the existing taxonomy: each term is attached as a hyponym of a taxonomy node. The reported experiments are in the "Lonely Planet" tourism domain. The taxonomy and the corpus are the ones proposed in the PASCAL ontology learning and population challenge. The experimental results show that the quality of the enrichment improves considerably when semantics-based vector representations are used for the classified (newly added) terms, such as document category histograms (DCH) and the document frequency times inverse term frequency (DF-ITF) weighting scheme.
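
The idea of classifying extracted terms into an existing taxonomy can be illustrated with a minimal sketch. It is not the authors' method: the paper uses an extended hierarchical self-organizing map, whereas the code below substitutes a plain nearest-neighbour choice by cosine similarity. The toy corpus, the node labels, the per-document occurrence vectors, and the function names `term_vector`, `cosine`, and `attach_as_hyponym` are all assumptions made for illustration, and representing a taxonomy node by the vector of its own label is a further simplification.

```python
import math

def term_vector(term, documents):
    """Assumed representation: occurrences of `term` in each document."""
    return [doc.lower().split().count(term) for doc in documents]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def attach_as_hyponym(term, taxonomy_nodes, documents):
    """Pick the taxonomy node whose vector is most similar to the term's.

    Stand-in for the hierarchical SOM classification described in the
    abstract; each node is represented by the vector of its label.
    """
    tv = term_vector(term, documents)
    return max(taxonomy_nodes,
               key=lambda node: cosine(tv, term_vector(node, documents)))

# Hypothetical toy corpus and taxonomy nodes (not from the paper).
documents = [
    "the hostel and the hotel offer cheap rooms",
    "the hotel has rooms with a view",
    "the museum and the gallery show paintings",
    "the gallery exhibits modern paintings",
]
taxonomy_nodes = ["hotel", "museum"]

for new_term in ["hostel", "gallery"]:
    parent = attach_as_hyponym(new_term, taxonomy_nodes, documents)
    print(f"{new_term} -> hyponym of {parent}")
```

In this toy setting "hostel" co-occurs with "hotel" and "gallery" with "museum", so each new term is attached under the node that shares its document contexts. Weighting schemes such as DCH or DF-ITF would replace the raw counts in `term_vector`; their exact definitions are given in the full paper and are not reproduced here.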


