| A unifying framework for detecting outliers and change points from non-stationary time series data |
| Source
|
International Conference on Knowledge Discovery and Data Mining
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Edmonton, Alberta, Canada
POSTER SESSION: Poster papers
Pages: 676 - 681
Year of Publication: 2002
ISBN:1-58113-567-X
|
|
Authors
|
|
Kenji Yamanishi
|
NEC Corporation, 4-1-1, Miyazaki, Miyamae, Kawasaki, Kanagawa 216-8555, JAPAN
|
|
Jun-ichi Takeuchi
|
NEC Corporation, 4-1-1, Miyazaki, Miyamae, Kawasaki, Kanagawa 216-8555, JAPAN
|
|
| Bibliometrics |
Downloads (6 Weeks): 21, Downloads (12 Months): 144, Citation Count: 11
|
|
|
ABSTRACT
We are concerned with the issues of outlier detection and change point detection from a data stream. In the area of data mining, there has been increased interest in these issues, since the former is related to fraud detection, rare event discovery, etc., while the latter is related to event/trend change detection, activity monitoring, etc. Specifically, it is important to consider the situation where the data source is non-stationary, since the nature of the data source may change over time in real applications. Although in most previous work outlier detection and change point detection have not been related explicitly, this paper presents a unifying framework for dealing with both of them on the basis of the theory of on-line learning of non-stationary time series. In this framework a probabilistic model of the data source is incrementally learned using an on-line discounting learning algorithm, which can track the changing data source adaptively by gradually forgetting the effect of past data. Then the score for any given data point is calculated to measure its deviation from the learned model, with a higher score indicating a higher possibility of being an outlier. Further, change points in a data stream are detected by applying this scoring method to a time series of moving-averaged losses for prediction using the learned model. Specifically, we develop efficient algorithms for on-line discounting learning of auto-regression models from time series data, and demonstrate the validity of our framework through simulation and experimental applications to stock market data analysis.
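The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a first-order auto-regression model with Gaussian noise, uses the negative log-likelihood (log loss) as the score, and updates the model statistics with a fixed discounting rate `r`. The class name `DiscountedAR1`, the function `two_stage_scores`, and the default values of `r` and `window` are all hypothetical choices made for illustration.

```python
import math
import random


class DiscountedAR1:
    """Online AR(1) model with exponential forgetting (illustrative assumption,
    not the paper's exact model or update rule)."""

    def __init__(self, r=0.05):
        self.r = r        # discounting rate: larger r forgets past data faster
        self.mu = 0.0     # discounted mean
        self.c0 = 1.0     # discounted lag-0 autocovariance
        self.c1 = 0.0     # discounted lag-1 autocovariance
        self.var = 1.0    # discounted variance of the prediction residual
        self.prev = None  # previous observation

    def update(self, x):
        """Return the log-loss score of x under the current model, then learn from x."""
        if self.prev is None:
            # The first observation cannot be predicted; give it a neutral score.
            self.prev = x
            self.mu = x
            return 0.0
        a = self.c1 / self.c0 if self.c0 > 1e-12 else 0.0
        pred = self.mu + a * (self.prev - self.mu)
        err = x - pred
        # Score before updating: Gaussian negative log-likelihood of the residual.
        score = 0.5 * math.log(2.0 * math.pi * self.var) + err * err / (2.0 * self.var)
        # Discounted updates: old statistics are weighted by (1 - r).
        r = self.r
        self.mu = (1.0 - r) * self.mu + r * x
        self.c0 = (1.0 - r) * self.c0 + r * (x - self.mu) ** 2
        self.c1 = (1.0 - r) * self.c1 + r * (x - self.mu) * (self.prev - self.mu)
        self.var = max((1.0 - r) * self.var + r * err * err, 1e-8)
        self.prev = x
        return score


def two_stage_scores(xs, r=0.05, window=5):
    """Stage 1 scores each point (outlier score). Stage 2 learns a second model
    on the moving average of stage-1 scores and scores that series (change score)."""
    first, second = DiscountedAR1(r), DiscountedAR1(r)
    outlier, change, recent = [], [], []
    for x in xs:
        s = first.update(x)
        outlier.append(s)
        recent.append(s)
        if len(recent) > window:
            recent.pop(0)
        smoothed = sum(recent) / len(recent)
        change.append(second.update(smoothed))
    return outlier, change


if __name__ == "__main__":
    random.seed(0)
    series = [random.gauss(0.0, 1.0) for _ in range(300)]
    series[150] += 20.0  # inject a single large spike
    outlier, change = two_stage_scores(series)
    print("index of largest outlier score:", max(range(len(outlier)), key=outlier.__getitem__))
```

Scoring each point before the model is updated matters here: the score then reflects how surprising the point is given only the past. Smoothing the first-stage scores before the second stage is meant to damp isolated spikes, so that sustained shifts in predictive loss stand out more than single outliers.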
CITED BY 11
|
|
|
|
Xinjie Lu, Tian Yang, Zaifei Liao, Manzoor Elahi, Wei Liu, Hongan Wang, Incremental outlier detection in data streams using local correlation integral, Proceedings of the 2009 ACM symposium on Applied Computing, March 08-12, 2009, Honolulu, Hawaii
|
|
Marcel Karnstedt, Daniel Klan, Christian Pölitz, Kai-Uwe Sattler, Conny Franke, Adaptive burst detection in a stream engine, Proceedings of the 2009 ACM symposium on Applied Computing, March 08-12, 2009, Honolulu, Hawaii
|
|
Shyam Boriah, Vipin Kumar, Michael Steinbach, Christopher Potter, Steven Klooster, Land cover change detection: a case study, Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
|