ACM Home Page
Please provide us with feedback. Feedback
Finding optimal parameters for edit distance based sequence classification is NP-hard
Full text PdfPdf (495 KB)
Source International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the KDD-09 Workshop on Statistical and Relational Learning in Bioinformatics table of contents
Paris, France
Pages 17-21  
Year of Publication: 2009
ISBN:978-1-60558-667-0
Authors
Vlado Kešelj  Dalhousie University, Halifax, Canada
Haibin Liu  Dalhousie University, Halifax, Canada
Norbert Zeh  Dalhousie University, Halifax, Canada
Christian Blouin  Dalhousie University, Halifax, Canada
Chris Whidden  Dalhousie University, Halifax, Canada
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 17,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1562090.1562094
What is a DOI?

ABSTRACT

Parametric edit distance based classification has been applied to two significant problems in the bioinformatics area: biological sequence analysis (DNA, RNA, protein), and semantic relationship extraction from biomedical scientific literature. This method is based on the edit distance measure on sequences, with parametric costs for matching, mismatching, inserts, and deletes of letters. We present a proof that finding optimal parameter values for such classification based on training data is an NP-hard problem, which is an important claim to justify the use of heuristic methods for determining the best parameter values.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
BioCreative challenge. Accessed in August 2008, http://biocreative.sourceforge.net/.
 
3
 
4
J. Hakenberg, C. Plake, U. Leser, H. Kirsch, and D. Rebholz-Schuhmann. Lll'05 challenge: Genic interaction extraction with alignments and finite state automata. In Proceedings of Learning Language in Logic Workshop (LLL'05) at ICML, page 38ÍC45, Bonn, Germany, 2005.
 
5
 
6
 
7
 
8
J. Li, X. Zhang, Y. Hao, M. Huang, and X. Zhu. Learning domain-specific knowledge from context --- THUIR at TREC'2005 genomics track. In Proceedings of 14th Text Retrireval Conference (TREC2005), Gaithersburg, USA, 2005.
 
9
 
10
C. Plake, J. Hakenberg, and U. Leser. Learning patterns for information extraction from free text. In Proceedings of AKKD 2005, Karlsruhe, Germany, 2005.
 
11
A. Skusa, A. Ruegg, and J. Kohler. Extraction of biological interaction networks from scientific literature. Brief Bioinform, 6(3):263--276, September 2005.

Collaborative Colleagues:
Vlado Kešelj: colleagues
Haibin Liu: colleagues
Norbert Zeh: colleagues
Christian Blouin: colleagues
Chris Whidden: colleagues