Analyzing feature trajectories for event detection
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Amsterdam, The Netherlands
SESSION: Topic detection and tracking
Pages: 207-214
Year of Publication: 2007
ISBN: 978-1-59593-597-7
Authors
Qi He, Nanyang Technological University
Kuiyu Chang, Nanyang Technological University
Ee-Peng Lim, Nanyang Technological University
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1277741.1277779

ABSTRACT

We consider the problem of analyzing word trajectories in both time and frequency domains, with the specific goal of identifying important and less-reported, periodic and aperiodic words. A set of words with identical trends can be grouped together to reconstruct an event in a completely unsupervised manner. The document frequency of each word across time is treated as a time series, where each element is the document frequency - inverse document frequency (DFIDF) score at one time point. In this paper, we 1) first applied spectral analysis to categorize features by event characteristics: important and less-reported, periodic and aperiodic; 2) modeled aperiodic features with a Gaussian density and periodic features with Gaussian mixture densities, and subsequently detected each feature's burst with the truncated Gaussian approach; 3) proposed an unsupervised greedy event detection algorithm to detect both aperiodic and periodic events. All of the above methods can be applied to time series data in general. We extensively evaluated our methods on the 1-year Reuters News Corpus [3] and showed that they were able to uncover meaningful aperiodic and periodic events.
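
The first step of the abstract (treating a word's DFIDF scores as a time series and using spectral analysis to separate periodic from aperiodic features) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the DFIDF formula or the classification rule, so the form `(df(t) / N(t)) * log(N / DF)`, the function names, and the "dominant period at most half the window" threshold are all assumptions made for illustration.

```python
import cmath
import math


def dfidf_series(df_per_day, docs_per_day):
    """Turn one word's daily document counts into a DFIDF-style time series.

    ASSUMED form: (df(t) / N(t)) * log(N / DF), where N and DF are totals
    over the whole window. The abstract does not state the exact formula.
    """
    total_docs = sum(docs_per_day)
    total_df = sum(df_per_day)
    if total_df == 0:
        return [0.0] * len(df_per_day)
    idf = math.log(total_docs / total_df)
    return [(df / n) * idf if n else 0.0
            for df, n in zip(df_per_day, docs_per_day)]


def periodogram(series):
    """Power at frequencies k/T for k = 1 .. T//2, via a naive O(T^2) DFT."""
    T = len(series)
    power = {}
    for k in range(1, T // 2 + 1):
        coeff = sum(y * cmath.exp(-2j * math.pi * k * t / T)
                    for t, y in enumerate(series))
        power[k] = abs(coeff) ** 2 / T
    return power


def dominant_period(series):
    """Return (period, power) of the strongest frequency component."""
    power = periodogram(series)
    k_best = max(power, key=power.get)
    return len(series) / k_best, power[k_best]


def is_periodic(series):
    """HYPOTHETICAL rule: periodic if the dominant period fits at least
    twice into the observation window."""
    period, _ = dominant_period(series)
    return period <= len(series) / 2


# A word that recurs weekly over four weeks, and a word with a single burst.
docs = [100] * 28
weekly = dfidf_series([20, 15, 8, 3, 3, 8, 15] * 4, docs)
one_off = dfidf_series([0] * 12 + [1, 4, 9, 4, 1] + [0] * 11, docs)

print(dominant_period(weekly)[0], is_periodic(weekly))
print(dominant_period(one_off)[0], is_periodic(one_off))
```

The dominant power returned alongside the period could serve as a rough importance score for separating heavily reported from less-reported features, although the cut-off for that split is likewise not specified in the abstract.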


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Apache lucene-core 2.0.0, http://lucene.apache.org.
 
2
Google news alerts, http://www.google.com/alerts.
 
3
Reuters corpus, http://www.reuters.com/researchandstandards/corpus/.
 
4
5
6
7
 
8
A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1):1--38, 1977.
 
9
 
10
Q. He, K. Chang, and E.-P. Lim. A model for anticipatory event detection. In ER, pages 168--181, 2006.
 
11
Q. He, K. Chang, E.-P. Lim, and J. Zhang. Bursty feature representation for clustering text streams. In SDM, accepted, 2007.
12
 
13
14
15
 
16
W. D. Penny. Kullback-Leibler divergences of normal, gamma, Dirichlet and Wishart densities. Technical report, 2001.
17
18
19
20
21

