Anticipating annotations and emerging trends in biomedical literature
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Las Vegas, Nevada, USA
Session: Industrial papers
Pages 954-962
Year of Publication: 2008
ISBN: 978-1-60558-193-4
Authors
Fabian Mörchen, Siemens Corporate Research, Princeton, NJ, USA
Mathäus Dejori, Siemens Corporate Research, Princeton, NJ, USA
Dmitriy Fradkin, Siemens Corporate Research, Princeton, NJ, USA
Julien Etienne, Siemens Corporate Research, Princeton, NJ, USA
Bernd Wachmann, Siemens Corporate Research, Princeton, NJ, USA
Markus Bundschus, Ludwig-Maximilians-University, Munich, Germany
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1401890.1402004

ABSTRACT

The BioJournalMonitor is a decision support system for the analysis of trends and topics in the biomedical literature. Its main goal is to identify potential diagnostic and therapeutic biomarkers for specific diseases. Several data sources are continuously integrated to provide the user with up-to-date information on current research in this field. State-of-the-art text mining technologies are deployed to add value on top of the original content, including named entity detection, relation extraction, classification, clustering, ranking, summarization, and visualization. We present two novel technologies for analyzing the temporal dynamics of text archives and their associated ontologies. Currently, the MeSH ontology is used to annotate the scientific articles entering the PubMed database with medical terms. Both the maintenance of the ontology and the annotation of new articles are performed largely manually. We describe how probabilistic topic models can be used to annotate recent articles with the most likely MeSH terms. This gives our users a competitive advantage: a search for a MeSH term finds articles long before they are manually annotated. We further present a study on how to predict the inclusion of new terms in the MeSH ontology. The results suggest that early prediction of emerging trends is possible. The trend ranking functions are deployed in our system to enable interactive searches for the hottest new trends relating to a disease.
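The abstract does not specify the topic model used, so the following is only a minimal sketch of the general idea of ranking annotation terms with a topic model, not the authors' method. It assumes a model has already produced a topic mixture p(topic | article) for an unannotated article and a term distribution p(term | topic) for each topic, and it scores each term by summing p(term | topic) * p(topic | article) over topics. The function name `rank_terms`, the topic names, the term labels, and all probabilities are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple


def rank_terms(
    doc_topics: Dict[str, float],
    topic_terms: Dict[str, Dict[str, float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each term by sum over topics of p(term | topic) * p(topic | doc).

    Returns the top_k (term, score) pairs, highest score first;
    ties are broken alphabetically so the output is deterministic.
    """
    scores: Dict[str, float] = {}
    for topic, p_topic in doc_topics.items():
        for term, p_term in topic_terms.get(topic, {}).items():
            scores[term] = scores.get(term, 0.0) + p_topic * p_term
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical, hand-made distributions (assumptions, not data from the paper).
topic_terms = {
    "oncology": {"Neoplasms": 0.6, "Biological Markers": 0.4},
    "cardiology": {"Heart Diseases": 0.7, "Biological Markers": 0.3},
}
# Assumed topic mixture inferred for one not-yet-annotated article.
doc_topics = {"oncology": 0.8, "cardiology": 0.2}

ranked = rank_terms(doc_topics, topic_terms)
for term, score in ranked:
    print(f"{term}: {score:.2f}")
```

In a setting like the one the abstract describes, the top-ranked terms could serve as provisional annotations, so that a term search also reaches articles that have not yet been indexed manually.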


