| Finding optimal parameters for edit distance based sequence classification is NP-hard |
| Full text |
Pdf
(495 KB)
|
| Source
|
International Conference on Knowledge Discovery and Data Mining
archive
Proceedings of the KDD-09 Workshop on Statistical and Relational Learning in Bioinformatics
table of contents
Paris, France
Pages 17-21
Year of Publication: 2009
ISBN:978-1-60558-667-0
|
|
Authors
|
|
Vlado Kešelj
|
Dalhousie University, Halifax, Canada
|
|
Haibin Liu
|
Dalhousie University, Halifax, Canada
|
|
Norbert Zeh
|
Dalhousie University, Halifax, Canada
|
|
Christian Blouin
|
Dalhousie University, Halifax, Canada
|
|
Chris Whidden
|
Dalhousie University, Halifax, Canada
|
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 7, Downloads (12 Months): 17, Citation Count: 0
|
|
|
ABSTRACT
Parametric edit distance based classification has been applied to two significant problems in the bioinformatics area: biological sequence analysis (DNA, RNA, protein), and semantic relationship extraction from biomedical scientific literature. This method is based on the edit distance measure on sequences, with parametric costs for matching, mismatching, inserts, and deletes of letters. We present a proof that finding optimal parameter values for such classification based on training data is an NP-hard problem, which is an important claim to justify the use of heuristic methods for determining the best parameter values.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
BioCreative challenge. Accessed in August 2008, http://biocreative.sourceforge.net/.
|
| |
3
|
|
| |
4
|
J. Hakenberg, C. Plake, U. Leser, H. Kirsch, and D. Rebholz-Schuhmann. Lll'05 challenge: Genic interaction extraction with alignments and finite state automata. In Proceedings of Learning Language in Logic Workshop (LLL'05) at ICML, page 38ÍC45, Bonn, Germany, 2005.
|
| |
5
|
|
| |
6
|
Minlie Huang , Xiaoyan Zhu , Yu Hao , Donald G. Payan , Kunbin Qu , Ming Li, Discovering patterns to extract protein--protein interactions from full texts, Bioinformatics, v.20 n.18, p.3604-3612, December 2004
[doi> 10.1093/bioinformatics/bth451]
|
| |
7
|
Hyunchul Jang , Jaesoo Lim , Joon-Ho Lim , Soo-Jun Park , Kyu-Chul Lee , Seon-Hee Park, Finding the evidence for protein-protein interactions from PubMed abstracts, Bioinformatics, v.22 n.14, p.e220-e226, July 2006
[doi> 10.1093/bioinformatics/btl203]
|
| |
8
|
J. Li, X. Zhang, Y. Hao, M. Huang, and X. Zhu. Learning domain-specific knowledge from context --- THUIR at TREC'2005 genomics track. In Proceedings of 14th Text Retrireval Conference (TREC2005), Gaithersburg, USA, 2005.
|
| |
9
|
|
| |
10
|
C. Plake, J. Hakenberg, and U. Leser. Learning patterns for information extraction from free text. In Proceedings of AKKD 2005, Karlsruhe, Germany, 2005.
|
| |
11
|
A. Skusa, A. Ruegg, and J. Kohler. Extraction of biological interaction networks from scientific literature. Brief Bioinform, 6(3):263--276, September 2005.
|
|