The role of syntactic features in protein interaction extraction
Source
Conference on Information and Knowledge Management
Proceeding of the 2nd international workshop on Data and text mining in bioinformatics
Napa Valley, California, USA
SESSION: Bio-text mining
Pages 61-68
Year of Publication: 2008
ISBN: 978-1-60558-251-1
Authors
Timur Fayruzov  Ghent University, Ghent, Belgium
Martine De Cock  Ghent University, Ghent, Belgium
Chris Cornelis  Ghent University, Ghent, Belgium
Veronique Hoste  University College Ghent, Ghent, Belgium
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1458449.1458463

ABSTRACT

Most approaches for protein interaction mining from biomedical texts use both lexical and syntactic features. However, the individual impact of these two kinds of features on the effectiveness of the mining process has not yet been thoroughly studied. In this paper, we perform such a study on a recently published state-of-the-art support vector machine approach that uses both lexical and syntactic features. To this end, we strip this approach down to an algorithm that uses only a subset of the initial syntactic features. Next, we compare the original and the stripped-down method by evaluating them on 5 benchmark datasets as well as by performing 5 additional cross-dataset experiments. Although the original method exploits a very rich feature set including words, parts-of-speech and grammatical relations, it is not significantly better than the stripped-down version; in fact, the former does not even consistently outperform the latter.


