An adaptive threshold framework for event detection using HMM-based life profiles
Source
ACM Transactions on Information Systems (TOIS)
Volume 27, Issue 2 (February 2009)
Article No. 9  
Year of Publication: 2009
ISSN:1046-8188
Authors
Chien Chin Chen  National Taiwan University, Taipei City, Taiwan
Meng Chang Chen  Academia Sinica, Nankang, Taiwan
Ming-Syan Chen  National Taiwan University, Taipei City, Taiwan
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 25,   Downloads (12 Months): 262,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1462198.1462201

ABSTRACT

When an event occurs, information sources respond by publishing related documents throughout its lifespan. The task of event detection is to automatically identify events and their related documents in a document stream, that is, a set of chronologically ordered documents collected from various information sources. Each event generally follows a distinct pattern of activeness, so its status changes continuously during its lifespan. When an event is active, various information sources publish many related documents. In contrast, when it is inactive, there are very few documents, but they are focused. Previous work on event detection did not consider the characteristics of an event's activeness and used rigid detection thresholds. We propose a concept called the life profile, represented by a hidden Markov model, to capture the activeness trends of events. In addition, we present LIPED, a general event detection framework that uses the learned life profiles and the burst-and-diverse characteristic to adjust detection thresholds adaptively and that can be incorporated into existing event detection methods. Evaluation on the official TDT corpus under the contest rules shows that existing detection methods achieve better cost and F1 scores when they incorporate LIPED than when they do not.
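
The general idea of tying a detection threshold to an HMM-decoded activeness state can be illustrated with a small sketch. This is not the LIPED framework itself: the two states, the "few"/"many" observation buckets, every probability, and the threshold values below are hypothetical choices made only for illustration, and the mapping (a looser similarity threshold while an event is active, a stricter one while it is inactive) is an assumption drawn from the abstract's description of active periods as diverse and inactive periods as focused.

```python
import math

# All names and numbers below are illustrative assumptions, not values from the paper.
STATES = ("inactive", "active")
START = {"inactive": 0.8, "active": 0.2}
TRANS = {
    "inactive": {"inactive": 0.7, "active": 0.3},
    "active": {"inactive": 0.2, "active": 0.8},
}
# Emission probabilities over a bucketed count of related documents per time window.
EMIT = {
    "inactive": {"few": 0.9, "many": 0.1},
    "active": {"few": 0.2, "many": 0.8},
}
# Assumed mapping from activeness state to a document-event similarity threshold.
THRESHOLD = {"inactive": 0.6, "active": 0.3}


def viterbi(observations):
    """Return the most likely state sequence for the observations (log-space Viterbi)."""
    if not observations:
        return []
    first = observations[0]
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][first]) for s in STATES}]
    back = []  # back[t][s] is the best predecessor of state s at step t + 1
    for obs in observations[1:]:
        prev = scores[-1]
        row, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            row[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][obs])
            ptr[s] = best
        scores.append(row)
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def adaptive_thresholds(observations):
    """Map each time window to the threshold of its decoded activeness state."""
    return [THRESHOLD[s] for s in viterbi(observations)]
```

Calling `adaptive_thresholds(["few", "few", "few"])` decodes every window as inactive and so returns the stricter threshold for each. In a real system the transition and emission parameters would be learned from training events rather than fixed by hand.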


