A system for finding biological entities that satisfy certain conditions from texts
Source
Conference on Information and Knowledge Management
Proceeding of the 17th ACM conference on Information and knowledge management
Napa Valley, California, USA
SESSION: IR: QA
Pages 1281-1290
Year of Publication: 2008
ISBN: 978-1-59593-991-3
Authors
Wei Zhou  University of Illinois at Chicago, Chicago, IL, USA
Clement Yu  University of Illinois at Chicago, Chicago, IL, USA
Weiyi Meng  Binghamton University, Binghamton, NY, USA
Sponsors
ACM: Association for Computing Machinery
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1458082.1458251

ABSTRACT

Finding biological entities (such as genes or proteins) that satisfy certain conditions in texts is an important and challenging task in biomedical information retrieval and text mining. It is essential for many biomedical applications, such as drug discovery, which normally requires collecting existing scientific facts from documents. This paper presents an effective IR system for this task, in which 1) domain knowledge is incorporated to improve retrieval effectiveness; 2) query expansion with related concepts on multiple semantic levels is employed; and 3) a gene symbol disambiguation technique is implemented. We evaluated these techniques and examined two different concept-based IR models. Experiments based on the proposed framework yield significant improvements (22% for automatic and 16.7% for non-automatic runs) over the best reported results for passage retrieval in the Genomics track of TREC 2007.
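
To make the idea of query expansion on multiple semantic levels concrete, here is a minimal Python sketch. It is not the authors' implementation: the toy thesaurus, the level names ("synonym", "hypernym"), the weights, and the functions `expand_query` and `score` are all hypothetical, chosen only to illustrate how terms from different semantic levels might contribute differently to a passage score.

```python
from collections import defaultdict

# Hypothetical toy thesaurus: concept -> related terms grouped by semantic level.
# Level names and entries are illustrative assumptions, not data from the paper.
THESAURUS = {
    "prnp": {
        "synonym": ["prion protein", "prp"],
        "hypernym": ["membrane glycoprotein"],
    },
    "mad cow disease": {
        "synonym": ["bse", "bovine spongiform encephalopathy"],
    },
}

# Assumed weights: terms further from the original concept count for less.
LEVEL_WEIGHTS = {"original": 1.0, "synonym": 0.8, "hypernym": 0.4}


def expand_query(concepts):
    """Return {term: weight}, keeping the highest weight seen for each term."""
    weighted = defaultdict(float)
    for concept in concepts:
        key = concept.lower()
        weighted[key] = max(weighted[key], LEVEL_WEIGHTS["original"])
        for level, terms in THESAURUS.get(key, {}).items():
            for term in terms:
                weighted[term] = max(weighted[term], LEVEL_WEIGHTS[level])
    return dict(weighted)


def score(passage, weighted_terms):
    """Sum the weights of expanded terms found in the passage (substring match)."""
    text = passage.lower()
    return sum(w for term, w in weighted_terms.items() if term in text)
```

For example, `expand_query(["PRNP", "mad cow disease"])` yields a weighted term set that lets a passage mentioning only "prion protein" and "BSE" still receive a nonzero score, even though neither original query string appears in it. A real system would draw related concepts from curated resources and would need proper tokenization and gene symbol disambiguation rather than substring matching.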


