Classifying biological articles using web resources
Source: Symposium on Applied Computing
Proceedings of the 2004 ACM symposium on Applied computing
Nicosia, Cyprus
Session: Bioinformatics (BIO)
Pages: 111-115
Year of Publication: 2004
ISBN: 1-58113-812-1
Authors
Francisco M. Couto  Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa, Portugal
Bruno Martins  Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa, Portugal
Mário J. Silva  Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa, Portugal
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/967900.967925

ABSTRACT

Text classification systems for biomedical literature aim to select, from large corpora, the articles relevant to a specific issue. Most systems with acceptable accuracy are based on domain knowledge, which is very expensive and does not provide a general solution. This paper presents a novel approach to text classification of biomedical literature that uses information extracted from related web resources. We validated this approach by implementing the proposed method and testing it on the KDD2002 Cup challenge: bio-text task. Results show that our approach can effectively improve the efficiency of text classification systems for biomedical literature.



