A sentence level probabilistic model for evolutionary theme pattern mining from news corpora
Source
Symposium on Applied Computing
Proceedings of the 2009 ACM symposium on Applied Computing
Honolulu, Hawaii
SESSION: Information access and retrieval track
Pages 1742-1747
Year of Publication: 2009
ISBN: 978-1-60558-166-8
Authors
Shizhu Liu, Illinois Institute of Technology, Chicago, IL
Yuval Merhav, Illinois Institute of Technology, Chicago, IL
Wai Gen Yee, Illinois Institute of Technology, Chicago, IL
Nazli Goharian, Illinois Institute of Technology, Chicago, IL
Ophir Frieder, Illinois Institute of Technology, Chicago, IL
ABSTRACT
Several recent topic-model-based methods have been proposed to discover and summarize the evolutionary patterns of themes in temporal text collections. However, the theme patterns these methods extract are hard to interpret and evaluate. To produce a more descriptive representation of theme patterns, we introduce new representations of sentences and themes based on named entities, and we propose a sentence-level probabilistic model built on these representations. Compared with other topic-model methods, our approach yields not only each topic's distribution over terms but also candidate summary sentences for each theme. Consequently, the results are easier to understand and can be evaluated using the top sentences produced by the model. Experiments on the Tsunami dataset show that the proposed methods are useful for discovering evolutionary theme patterns.
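The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of ranking candidate summary sentences under a theme's term distribution. It assumes each sentence is already reduced to a list of named entities and that a theme is a probability table over entities. The function names, the smoothing floor, and the toy data are all hypothetical and are not taken from the paper.

```python
import math

def score_sentence(entities, theme_dist, floor=1e-6):
    """Average log-probability of a sentence's entities under one theme.

    `floor` is an assumed smoothing value for entities the theme has not seen.
    """
    if not entities:
        return float("-inf")
    total = sum(math.log(theme_dist.get(e, floor)) for e in entities)
    return total / len(entities)

def top_sentences(sentences, theme_dist, k=1):
    """Return the text of the k highest-scoring sentences for a theme."""
    ranked = sorted(
        sentences,
        key=lambda s: score_sentence(s["entities"], theme_dist),
        reverse=True,
    )
    return [s["text"] for s in ranked[:k]]

# Toy theme: a hypothetical distribution over named entities.
theme = {"Indonesia": 0.4, "Aceh": 0.3, "Red Cross": 0.2, "UN": 0.1}

# Toy sentences with their (assumed, pre-extracted) named entities.
sentences = [
    {"text": "Aid reached Aceh, Indonesia.", "entities": ["Aceh", "Indonesia"]},
    {"text": "The UN met in Geneva.", "entities": ["UN", "Geneva"]},
]

best = top_sentences(sentences, theme, k=1)
print(best)
```

Averaging the log-probabilities keeps long sentences from being penalized purely for their length; a real system would estimate the theme distributions from data rather than fix them by hand.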