ABSTRACT
Protein-protein interaction (PPI) extraction is one of the ongoing challenges in biomedical text mining, and it is drawing increasing attention because PPI information provides a starting point for understanding biological processes. Machine learning (ML) techniques have been applied to extract PPI information from the biomedical literature. Although they have achieved reasonable performance so far, they need further capabilities before they are ready for real use. In particular, many ML approaches produce learned models that are difficult for humans to understand. Here, we propose a novel method for classifying PPI sentences. Our approach uses a modified hypernetwork model, a hypergraph whose weighted hyperedges are calibrated by an evolutionary learning method. The evolutionary hypernetwork memorizes fragments of training patterns while adjusting its own structure to detect PPI sentences. In experiments, we show that our approach provides competitive performance compared to other ML methods. Beyond its classification performance, the evolving hypernetwork model has a highly interpretable structure. We show how significant PPI patterns can be naturally extracted from the learned model, and we analyze the discovered patterns.
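As a rough illustration of the idea summarized above, the following Python sketch stores small word subsets (hyperedges) sampled from labeled sentences, adjusts their weights according to whether they support the correct label, and periodically replaces the weakest hyperedges with freshly sampled ones. It is a minimal sketch under stated assumptions and not the method of the paper: the class name `HypernetworkSketch`, the parameters `order`, `edges_per_sentence`, and `replace_fraction`, the +1/-1 weight update, and the toy sentences in `TOY_DATA` are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def sample_hyperedges(tokens, label, order, count, rng):
    """Draw `count` random word subsets of size `order` from one sentence."""
    vocab = sorted(set(tokens))
    if len(vocab) < order:
        return []
    return [(frozenset(rng.sample(vocab, order)), label) for _ in range(count)]


class HypernetworkSketch:
    """Toy weighted-hyperedge classifier (illustrative assumption only)."""

    def __init__(self, order=2, edges_per_sentence=5, seed=0):
        self.order = order
        self.edges_per_sentence = edges_per_sentence
        self.rng = random.Random(seed)
        self.weights = defaultdict(float)  # (word set, label) -> weight

    def _add_edges(self, tokens, label):
        # Memorize fragments of a training sentence as hyperedges.
        for edge in sample_hyperedges(
            tokens, label, self.order, self.edges_per_sentence, self.rng
        ):
            self.weights[edge] += 1.0

    def scores(self, tokens):
        # Sum the weights of all hyperedges fully contained in the sentence.
        words = set(tokens)
        totals = defaultdict(float)
        for (members, label), weight in self.weights.items():
            if members <= words:
                totals[label] += weight
        return totals

    def predict(self, tokens):
        totals = self.scores(tokens)
        if not totals:
            return None  # no stored hyperedge matches this sentence
        return max(sorted(totals), key=totals.get)

    def fit(self, data, epochs=5, replace_fraction=0.2):
        for tokens, label in data:
            self._add_edges(tokens, label)
        for _ in range(epochs):
            # Reward matching hyperedges that carry the right label,
            # penalize those that carry the wrong one.
            for tokens, label in data:
                words = set(tokens)
                for edge in list(self.weights):
                    members, edge_label = edge
                    if members <= words:
                        self.weights[edge] += 1.0 if edge_label == label else -1.0
            # Drop the weakest hyperedges, then sample replacements.
            ranked = sorted(
                self.weights,
                key=lambda e: (self.weights[e], sorted(e[0]), e[1]),
            )
            for edge in ranked[: int(len(ranked) * replace_fraction)]:
                del self.weights[edge]
            for tokens, label in data:
                self._add_edges(tokens, label)
        return self

    def top_patterns(self, n=5):
        """Return the n heaviest hyperedges as (words, label, weight)."""
        ranked = sorted(self.weights.items(), key=lambda kv: -kv[1])
        return [(tuple(sorted(m)), lab, w) for (m, lab), w in ranked[:n]]


# Hypothetical toy sentences: label 1 = interaction, 0 = no interaction.
TOY_DATA = [
    (["PROT1", "binds", "PROT2"], 1),
    (["PROT1", "interacts", "with", "PROT2"], 1),
    (["PROT1", "expression", "was", "measured"], 0),
    (["PROT2", "levels", "were", "unchanged"], 0),
]

if __name__ == "__main__":
    model = HypernetworkSketch(order=2, seed=0).fit(TOY_DATA)
    print(model.predict(["PROT1", "binds", "to", "PROT2"]))
    print(model.top_patterns(3))
```

Because each hyperedge is just a small set of words with a label and a weight, listing the heaviest ones with `top_patterns` gives a direct, human-readable view of what the sketch has learned, which is the kind of interpretability the abstract emphasizes.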