Passage relevance models for genomics search
Source
Conference on Information and Knowledge Management
Proceeding of the 2nd International Workshop on Data and Text Mining in Bioinformatics
Napa Valley, California, USA
Session: Bio-text mining
Pages 45-52
Year of Publication: 2008
ISBN: 978-1-60558-251-1
Authors
Jay Urbain  Milwaukee School of Engineering, Milwaukee, WI, USA
Ophir Frieder  Illinois Institute of Technology, Chicago, IL, USA
Nazli Goharian  Illinois Institute of Technology, Chicago, IL, USA
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1458449.1458461

ABSTRACT

We present a passage relevance model that integrates semantic and statistical evidence of biomedical concepts and topics using a probabilistic graphical model. Component models of topics, concepts, terms, and documents are represented as potential functions within a Markov Random Field. The probability that a passage is relevant to a biologist's information need is represented as the joint distribution across all potential functions. Relevance model feedback from top-ranked passages is used to improve distributional estimates of concepts and topics in context, and a dimensional indexing strategy is used for efficient aggregation of concept and term statistics. By integrating multiple sources of evidence, including dependencies between topics, concepts, and terms, we seek to improve the precision of passage retrieval from the genomics literature. Using this model, we demonstrate statistically significant improvements in retrieval precision on a large genomics literature corpus.
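
The abstract does not give the form of the potential functions, so the following is only a minimal sketch of how a joint distribution over several potentials is commonly scored in a log-linear Markov Random Field. It is not the authors' implementation. Every name here (`smoothed_prob`, `log_potential`, `passage_score`), the smoothing scheme, the value of `mu`, the weights, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def smoothed_prob(feature, passage_counts, collection_counts, mu=10.0):
    """Dirichlet-smoothed estimate of P(feature | passage).

    The add-one background estimate and mu=10.0 are assumed values,
    chosen only so that the estimate is never zero.
    """
    passage_len = sum(passage_counts.values())
    collection_len = sum(collection_counts.values())
    vocab_size = len(collection_counts) + 1  # +1 leaves mass for unseen features
    background = (collection_counts.get(feature, 0) + 1) / (collection_len + vocab_size)
    return (passage_counts.get(feature, 0) + mu * background) / (passage_len + mu)


def log_potential(query_features, passage_counts, collection_counts):
    """Log of one potential: product of smoothed feature probabilities."""
    return sum(
        math.log(smoothed_prob(f, passage_counts, collection_counts))
        for f in query_features
    )


def passage_score(query, passage, collection, weights):
    """Unnormalised log joint: weighted sum of log potentials per evidence type."""
    return sum(
        weights[kind] * log_potential(query[kind], passage[kind], collection[kind])
        for kind in weights
    )


# Toy data (hypothetical): two evidence types, terms and concepts.
query = {"terms": ["brca1", "mutation"], "concepts": ["GENE:BRCA1"]}
p1 = {
    "terms": Counter({"brca1": 2, "mutation": 1, "breast": 1}),
    "concepts": Counter({"GENE:BRCA1": 1}),
}
p2 = {
    "terms": Counter({"protein": 2, "kinase": 2}),
    "concepts": Counter({"GENE:TP53": 1}),
}
collection = {
    "terms": p1["terms"] + p2["terms"],
    "concepts": p1["concepts"] + p2["concepts"],
}
weights = {"terms": 0.6, "concepts": 0.4}  # assumed mixing weights

score_p1 = passage_score(query, p1, collection, weights)
score_p2 = passage_score(query, p2, collection, weights)
```

In this sketch the normalising constant of the joint distribution is dropped, since it is the same for every passage and does not affect ranking. A topic potential, or potentials over dependencies between terms and concepts, could be added as further entries in `weights` under the same assumptions.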


