|
ABSTRACT
Clustering is an established data mining technique for grouping objects based on similarity. For sensor networks one aims at grouping sensor measurements in groups of similar measurements. As sensor networks have limited resources in terms of available memory and energy, a major task sensor clustering is efficient computation on sensor nodes. As a dominating energy consuming task, communication has to be reduced for a better energy efficiency. Considering memory, one has to reduce the amount of stored information on each sensor node. For in-network clustering, k-center based approaches provide k representatives out of the collected sensor measurements. We propose EDISKCO, an outlier aware incremental method for efficient detection of k-center clusters. Our novel approach is energy aware and reduces amount of required transmissions while producing high quality clustering results. In thorough experiments on synthetic and real world data sets, we show that our approach outperforms a competing technique in both clustering quality and energy efficiency. Thus, we achieve overall significantly better life times of our sensor networks.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
I. Assent, R. Krieger, E. Müller, and T. Seidl. INSCY: Indexing subspace clusters with in-process-removal of redundancy. In Proc. IEEE ICDM, pages 719--724, 2008.
|
| |
2
|
K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft. When is nearest neighbors meaningful. In Proc. IDBT, pages 217--235, 1999.
|
| |
3
|
M. Charikar, C. Chekuri, T. Feder, and R. Motwani. Incremental clustering and dynamic information retrieval. In Proc. ACM STOC:, pages 626--635, 1997.
|
| |
4
|
M. Charikar, S. Khuller, D. M. Mount, and G. Narasimhan. Algorithms for facility location problems with outliers. In Proc. SODA, pages 642--651, 2001.
|
| |
5
|
M. Charikar, L. O'Callaghan, and R. Panigrahy. Better streaming algorithms for clustering problems. In Proc. ACM STOC, pages 30--39, 2003.
|
| |
6
|
G. Cormode, S. Muthukrishnan, and W. Zhuang. Conquering the divide: Continuous clustering of distributed data streams. In Proc. IEEE ICDE, pages 1036--1045, 2007.
|
| |
7
|
T. Feder and D. Greene. Optimal algorithms for approximate clustering. In Proc. ACM STOC, pages 434--444, 1988.
|
| |
8
|
A. R. Ganguly, J. Gama, O. A. Omitaomu, M. M. Gaber, and R. R. Vatsavai. Knowledge Discovery from Sensor Data. 2008.
|
| |
9
|
T. F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38(2--3):293--306, 1985.
|
| |
10
|
S. Guha. Tight results for clustering and summarizing data streams. In Proc. ICDT, pages 268--275, 2009.
|
| |
11
|
W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan. Energy-efficient communication protocol for wireless microsensor networks. In Proc. HICSS, 2000.
|
| |
12
|
D. Hochbaum and D. Shmoys. A best possible approximation algorithm for the k-centre problem. Math. of Operations Research, 10:180--184, 1985.
|
| |
13
|
R. Matthew Mccutchen and S. Khuller. Streaming algorithms for k-center clustering with outliers and with anonymity. In Proc. Workshop APPROX / RANDOM, pages 165--178, 2008.
|
| |
14
|
E. Müller, I. Assent, R. Krieger, S. Günnemann, and T. Seidl. DensEst: Density estimation for data mining in high dimensional spaces. In Proc. SIAM SDM, pages 173--184, 2009.
|
| |
15
|
E. Müller, I. Assent, U. Steinhausen, and T. Seidl. OutRank: ranking outliers in high dimensional data. In Proc. DBRank Workshop at IEEE ICDE, pages 600--603, 2008.
|
| |
16
|
S. F. Ossama Younis. Heed: A hybrid, energy-efficient, distributed clustering approach for ad hoc sensor networks. IEEE Transactions on Mobile Computing, 3(4):366--379, 10 2004.
|
| |
17
|
A. S. Tanenbaum, C. Gamage, and B. Crispo. Taking sensor networks from the lab to the jungle. Computer, 39(8):98--100, 2006.
|
| |
18
|
R. Tibshirani and G. Walther. Cluster validation by prediction strength. Journal of Computational & Graphical Statistics, 14(3):511--528, September 2005.
|
|