ACM Home Page
Please provide us with feedback. Feedback
A frequent pattern based framework for event detection in sensor network stream data
Full text PdfPdf (278 KB)
Source
International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the Third International Workshop on Knowledge Discovery from Sensor Data table of contents
Paris, France
SESSION: Short research papers table of contents
Pages 87-96  
Year of Publication: 2009
ISBN:978-1-60558-668-7
Authors
Li Wan  Beijing University of Posts and Telecommunications, Beijing, China and EBUPT Information Technology Co., Ltd, Beijing, China
Jianxin Liao  Beijing University of Posts and Telecommunications, Beijing, China and EBUPT Information Technology Co., Ltd, Beijing, China
Xiaomin Zhu  Beijing University of Posts and Telecommunications, Beijing, China and EBUPT Information Technology Co., Ltd, Beijing, China
Sponsors
: Cooperating Objects Network of Excellence (CONET)
: Geographic Information Science and Technology (GIST) Group at Oak Ridge National Laboratory
: Computational Sciences and Engineering (CSE) Division at the Oak Ridge National Laboratory
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 29,   Downloads (12 Months): 29,   Citation Count: 0
Additional Information:

abstract   references   index terms  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1601966.1601982
What is a DOI?

ABSTRACT

In this paper, we presented a frequent pattern based framework for event detection in stream data, it consists of frequent pattern discovery, frequent pattern selection and modeling three phases: In the first phase, a MNOE (Mining Non-Overlapping Episode) algorithm is proposed to find the non-overlapping frequent pattern in time series. In the frequent pattern selection phase, we proposed an EGMAMC (Episode Generated Memory Aggregation Markov Chain) model to help us selecting episodes which can describe stream data significantly. Then we defined feature flows to represent the instances of discovered frequent patterns and categorized the distribution of frequent pattern instances into three categories according to the spectrum of their feature flows. At last, we proposed a clustering algorithm EDPA (Event Detection by Pattern Aggregation) to aggregate strongly correlated frequent patterns together. We argue that strongly correlated frequent patterns form events and frequent patterns in different categories can be aggregated to form different kinds of events. Experiments on real-world sensor network datasets demonstrate that the proposed MNOE algorithm is more efficient than the existing non-overlapping episode mining algorithm and EDPA performs better when the input frequent patterns are maximal, significant and non-overlapping.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Li Wan, Jianxin Liao, Xiaomin Zhu, "Mining Sequential Episodes from Segmentation of Multivariate Time Series", the proceedings of DMIN 2009 (to be appear).
 
2
J. Pei, J. Han, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu, "PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth," Proc. 17th Int'l Conf. Data Eng., pp. 215--224, 2001.
 
3
N. Meger and C. Rigotti. Constraint-based mining of episode rules and optimal window sizes. In J.-F. Boulicaut, F. Esposito, F. Giannotti, and D. Pedreschi, editors, Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'04), pages 313--324. Springer, 2004.
 
4
G. Cormode, S. Muthukrishnan What's hot and what's not: tracking most frequent items dynamically. PODS 2003: 296--306
 
5
C. Giannella, J. Han, J. Pei, X. Yan, and P. S. Yu, Mining Frequent Patterns in Data Streams at Multiple Time Granularities, in H. Kargupta, A. Joshi, K. Sivakumar, and Y. Yesha (eds.), Next Generation Data Mining, AAAI/MIT, 2003.
 
6
Gaber, M, M., Krishnaswamy, S., and Zaslavsky, A., On-board Mining of Data Streams in Sensor Networks, Accepted as a chapter in the forthcoming book Advanced Methods of Knowledge Discovery from Complex Data, (Eds.) Sanghamitra Badhyopadhyay, Ujjwal Maulik, Lawrence Holder and Diane Cook.
 
7
G. S. Manku and R. Motwani. Approximate frequency counts over data streams. In Proceedings of the 28th International Conference on Very Large Data Bases, Hong Kong, China, August 2002.
 
8
Joao Gama, Pedro Pereira Rodrigues, Raquel Sebastiao, Evaluating Algorithms that Learn from Data Streams. The proceeding of SAC'09, 2009
 
9
Eduardo J. Spinosa, Joao Gama, Cluster-based novel concept detection in data streams applied to intrusion detection in computer networks, the proceeding of SAC'08, 2008.
 
10
Yixin Chen, Li Tu, Density-Based Clustering for Real-Time Stream Data, the proceedings of kdd'07, 2007.
 
11
Joao Gama, Mohamed Medhat Gaber, Learning from data streams: processing techniques in sensor networks. Springer 2007.
 
12
Auroop R. Ganguly, Joao Gama, Olufemi A. Omitaomu, Mohamed Medhat Gaber, Ranga Raju Vatsavai (Editor), Knowledge Discovery from Sensor Data (Industrial Innovation), CRC, 2008
 
13
Mohamed Medhat Gaber, Arkady Zaslavsky, Shonali Krishnaswamy, Mining Data Streams: A Review, SIGMOD Record, Vol. 34, No. 2, Page:18--26, 2005
 
14
Han J, Kamber M 2001 Data mining: Concepts and techniques (San Fransisco, CA: Morgan Kauffmann)
 
15
S. Laxman, P. S. Sastry, and K. P. Unnikrishnan. Discovering frequent episodes and learning Hidden Markov Models: A formal connection. IEEE Transactions on Knowledge and Data Engineering, 17(11):1505--1517, Nov. 2005.
 
16
S. Laxman, Stream Prediction Using A Generative Model Based On Frequent Episodes In Event Sequences Proceeding of KDD 2008
 
17
G. M. Weiss and H. Hirsh. Learning to predict rare events in event sequences. In proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD 98), pages 359--363, New York City, NY, USA, Aug. 27--31 1998.
 
18
Wren, C. R.; Ivanov, Y. A.; Leigh, D.; Westhues, J., "The MERL Motion Detector Dataset", Workshop on Massive Datasets (MD), ISBN: 978-1-50593-981-8, pp. 10--14, November 2007
 
19
db.csail.mit.edu/labdata/labdata.html.
 
20
Jessica Lin, Eamonn Keogh, Stefano Lonardi, Bill Chiu. A Symbolic Representation of Time Series, with Implications for Streaming Algorithms, In Proceeding of DMKD, 2003
 
21
Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, etc TelegraphCQ: Continuous Dataflow Processing for an Uncertain World. Proceedings of CIDR 2003 Conference
 
22
D. Abadi, D. Carney, U. Cetintemel, etc. Aurora: A New Model and Architecture for Data Stream Management. In VLDB Journal (12)2: 120--139, 2003
 
23
D. Abadi, D. Carney, U. Cetintemel, etc. Aurora: A Data Stream Management System (Demonstration). In proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'03), San Diego, CA, June 2003.
 
24
Daniel J. Abadi, Yanif Ahmad, Magdalena Balazinska, etc, The Design of the Borealis Stream Processing Engine, 2nd Biennial Conference on Innovative Data Systems Research (CIDR'05), Asilomar, CA, January 2005.
 
25
R. Motwani, J. Widom, A. Arasu, etc. Query Processing, Resource Management, and Approximation in a Data Stream Management System. In Proc. of CIDR 2003, Jan. 2003