| GAMPS: compressing multi sensor data by grouping and amplitude scaling |
| Full text |
Pdf
(2.55 MB)
|
Source
|
International Conference on Management of Data
archive
Proceedings of the 35th SIGMOD international conference on Management of data
table of contents
Providence, Rhode Island, USA
SESSION: Research session 20: data management pearls
table of contents
Pages 771-784
Year of Publication: 2009
ISBN:978-1-60558-551-2
|
|
Authors
|
|
Sorabh Gandhi
|
University of California, Santa Barbara, Santa Barbara, CA, USA
|
|
Suman Nath
|
Microsoft Research, Redmond, WA, USA
|
|
Subhash Suri
|
University of California, Santa Barbara, Santa Barbara, CA, USA
|
|
Jie Liu
|
Microsoft Research, Redmond, WA, USA
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 28, Downloads (12 Months): 128, Citation Count: 0
|
|
|
ABSTRACT
We consider the problem of collectively approximating a set of sensor signals using the least amount of space so that any individual signal can be efficiently reconstructed within a given maximum (L∞) error ε. The problem arises naturally in applications that need to collect large amounts of data from multiple concurrent sources, such as sensors, servers and network routers, and archive them over a long period of time for offline data mining. We present GAMPS, a general framework that addresses this problem by combining several novel techniques. First, it dynamically groups multiple signals together so that signals within each group are correlated and can be maximally compressed jointly. Second, it appropriately scales the amplitudes of different signals within a group and compresses them within the maximum allowed reconstruction error bound. Our schemes are polynomial time O(α, β approximation schemes, meaning that the maximum (L∞) error is at most α ε and it uses at most β times the optimal memory. Finally, GAMPS maintains an index so that various queries can be issued directly on compressed data. Our experiments on several real-world sensor datasets show that GAMPS significantly reduces space without compromising the quality of search and query.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
Balinksi, M. L. On finding integer solutions to linear programs.In IBM Scientific Computing Symposium on Combinatorial Problems (1966), IBM, pp.225--248.
|
| |
3
|
Buragohain, C., Shrivastava, N., and Suri, S. Space efficient streaming algorithms for the maximum error histogram.In ICDE (2007).
|
| |
4
|
Chan, K., and Fu, A. efficient time series matching by wavelets. In ICDE (1999).
|
| |
5
|
Qiuxia Chen , Lei Chen , Xiang Lian , Yunhao Liu , Jeffrey Xu Yu, Indexable PLA for efficient similarity search, Proceedings of the 33rd international conference on Very large data bases, September 23-27, 2007, Vienna, Austria
|
 |
6
|
|
| |
7
|
Dang, T., Bulusu, N., and chi Feng, W. RIDA:A robust information-driven data compression architecture for irregular wireless sensor networks. In EWSN (2007).
|
| |
8
|
|
 |
9
|
|
 |
10
|
Christos Faloutsos , M. Ranganathan , Yannis Manolopoulos, Fast subsequence matching in time-series databases, Proceedings of the 1994 ACM SIGMOD international conference on Management of data, p.419-429, May 24-27, 1994, Minneapolis, Minnesota, United States
|
 |
11
|
|
 |
12
|
Carlos Guestrin , Peter Bodik , Romain Thibaux , Mark Paskin , Samuel Madden, Distributed regression: an efficient framework for modeling sensor network data, Proceedings of the 3rd international symposium on Information processing in sensor networks, April 26-27, 2004, Berkeley, California, USA
[doi> 10.1145/984622.984624]
|
 |
13
|
|
 |
14
|
Eamonn Keogh , Kaushik Chakrabarti , Michael Pazzani , Sharad Mehrotra, Locally adaptive dimensionality reduction for indexing large time series databases, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.151-162, May 21-24, 2001, Santa Barbara, California, United States
|
| |
15
|
|
| |
16
|
|
 |
17
|
Flip Korn , H. V. Jagadish , Christos Faloutsos, Efficiently supporting ad hoc queries in large datasets of time sequences, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.289-300, May 11-15, 1997, Tucson, Arizona, United States
|
| |
18
|
Kuehn, A., and Hamburger, M. Aheuristic program for locating warehouses. Management Science 9 (1963), 643--666.
|
| |
19
|
Lazaridis, I., and Mehrotra, S. Capturing sensor-generated time series with quality guarantees. In ICDE (2003).
|
| |
20
|
|
 |
21
|
Jessica Lin , Eamonn Keogh , Stefano Lonardi , Bill Chiu, A symbolic representation of time series, with implications for streaming algorithms, Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, June 13-13, 2003, San Diego, California
[doi> 10.1145/882082.882086]
|
| |
22
|
Lin, J., Vlachos, M., Keogh, E., and Gunopulos, D. Iterative incremental clustering of time series. In EDBT (2004).
|
| |
23
|
Liu, J., Priyantha, B., Zhao, F., Liang, C. M., Wang, Q., and James, S. Towards discovering data center genome using sensor nets. In HotEmNets (2008).
|
| |
24
|
|
| |
25
|
|
 |
26
|
|
 |
27
|
|
| |
28
|
|
| |
29
|
|
| |
30
|
Rugh, W. J. Linear System Theory Prentice Hall, 1995.
|
| |
31
|
Slepian, D., and Wolf, J. Noiseless coding of correlated information sources. IEEE Trans. on Info. Theory 19 (1973).
|
| |
32
|
|
| |
33
|
Xiong, Z., Liveris, A. D., and Cheng, S. Distributed source coding for sensor networks. IEEE Signal Processing Magazine 21 (2004).
|
|