| Stream prediction using a generative model based on frequent episodes in event sequences |
| Full text |
Pdf
(271 KB)
|
Source
|
International Conference on Knowledge Discovery and Data Mining
archive
Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
table of contents
Las Vegas, Nevada, USA
SESSION: Research papers
table of contents
Pages 453-461
Year of Publication: 2008
ISBN:978-1-60558-193-4
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): n/a, Downloads (12 Months): n/a, Citation Count: 1
|
|
|
ABSTRACT
This paper presents a new algorithm for sequence prediction over long categorical event streams. The input to the algorithm is a set of target event types whose occurrences we wish to predict. The algorithm examines windows of events that precede occurrences of the target event types in historical data. The set of significant frequent episodes associated with each target event type is obtained based on formal connections between frequent episodes and Hidden Markov Models (HMMs). Each significant episode is associated with a specialized HMM, and a mixture of such HMMs is estimated for every target event type. The likelihoods of the current window of events, under these mixture models, are used to predict future occurrences of target events in the data. The only user-defined model parameter in the algorithm is the length of the windows of events used during model estimation. We first evaluate the algorithm on synthetic data that was generated by embedding (in varying levels of noise) patterns which are preselected to characterize occurrences of target events. We then present an application of the algorithm for predicting targeted user-behaviors from large volumes of anonymous search session interaction logs from a commercially-deployed web browser tool-bar.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
J. Bilmes. A gentle tutorial on the EM algorithm and its application to parameter estimation for gaussian mixture and hidden markov models. Technical Report TR-97-021, International Computer Science Institute, Berkeley, California, Apr. 1997.
|
| |
2
|
|
| |
3
|
D. Downey, S. T. Dumais, and E. Horvitz. Models of searching and browsing: Languages, studies and applications. In Proceedings of International Joint Conference on Artificial Intelligence, pages 2740--2747, 2007.
|
 |
4
|
|
 |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
 |
9
|
|
| |
10
|
|
| |
11
|
|
| |
12
|
|
| |
13
|
G. M. Weiss and H. Hirsh. Learning to predict rare events in event sequences. In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD 98), pages 359--363, New York City, NY, USA, Aug. 27-31 1998.
|
| |
14
|
A. Ypma and T. Heskes. Automatic categorization of web pages and user clustering with mixtures of Hidden Markov Models. In Lecture Notes in Computer Science, Proceedings of WEBKDD 2002 - Mining Web Data for Discovering Usage Patterns and Profiles, volume 2703, pages 35--49, 2003.
|
|