| On burstiness-aware search for document sequences |
Source
|
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Paris, France
SESSION: Research track papers
Pages 477-486
Year of Publication: 2009
ISBN: 978-1-60558-495-9
|
|
Authors
|
|
Theodoros Lappas
|
University of California, Riverside, Riverside, CA, USA
|
|
Benjamin Arai
|
University of California, Riverside, Riverside, CA, USA
|
|
Manolis Platakis
|
University of Athens, Athens, Greece
|
|
Dimitrios Kotsakos
|
University of Athens, Athens, Greece
|
|
Dimitrios Gunopulos
|
University of California, Riverside, Riverside, CA, USA
|
|
| Bibliometrics |
Downloads (6 Weeks): 40, Downloads (12 Months): 115, Citation Count: 0
|
|
|
ABSTRACT
As the number and size of large timestamped collections (e.g. sequences of digitized newspapers, periodicals, blogs) increase, the problem of efficiently indexing and searching such data becomes more important. Term burstiness has been extensively researched as a mechanism to address event detection in the context of such collections. In this paper, we explore how burstiness information can be further utilized to enhance the search process. We present a novel approach to model the burstiness of a term, using discrepancy theory concepts. This allows us to build a parameter-free, linear-time approach to identify the time intervals of maximum burstiness for a given term. Finally, we describe the first burstiness-driven search framework and thoroughly evaluate our approach in the context of different scenarios.
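The abstract does not spell out the burstiness model, so the following is only a minimal sketch under a stated assumption: an interval's burstiness is scored as a discrepancy, namely the fraction of the term's occurrences that fall inside the interval minus the fraction of the timeline the interval covers. Under that assumption, the interval of maximum burstiness is a maximum-sum segment, which a single pass (Kadane's algorithm) finds in linear time with no tuning parameters. The function name `max_burst_interval` and the scoring formula are illustrative assumptions, not the authors' published method.

```python
from typing import List, Tuple


def max_burst_interval(freqs: List[float]) -> Tuple[int, int, float]:
    """Return (start, end, score) for the highest-scoring contiguous interval.

    freqs[i] is the term's frequency at timestamp i (hypothetical input format).
    Indices are 0-based and `end` is inclusive.
    Assumed score of an interval: sum(freqs[start..end]) / sum(freqs)
                                  - (end - start + 1) / len(freqs)
    """
    n = len(freqs)
    total = sum(freqs)
    if n == 0 or total <= 0:
        raise ValueError("need a non-empty sequence with a positive total frequency")

    uniform_share = 1.0 / n  # share each timestamp would get if the term were spread evenly
    best_score = float("-inf")
    best_start = best_end = 0
    run_score = 0.0
    run_start = 0

    for i, f in enumerate(freqs):
        gain = f / total - uniform_share  # per-timestamp discrepancy
        if run_score <= 0:
            # A non-positive running sum cannot help; start a new candidate interval here.
            run_score = gain
            run_start = i
        else:
            run_score += gain
        if run_score > best_score:
            best_score = run_score
            best_start, best_end = run_start, i

    return best_start, best_end, best_score


if __name__ == "__main__":
    weekly_counts = [1, 1, 1, 10, 12, 1, 1, 1]
    print(max_burst_interval(weekly_counts))
```

For the sample counts above, the sketch selects the interval covering indices 3 to 4, where 22 of the 28 occurrences fall within 2 of the 8 timestamps. A search framework could then use such intervals to restrict or rank documents by timestamp, though how the paper does so is not stated in the abstract.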