ABSTRACT
Webcams, microphones, pressure gauges, and other sensors provide exciting new opportunities for querying and monitoring the physical world. In this paper, we focus on querying wide-area sensor databases, which contain XML data derived from sensors spread over tens to thousands of miles. We present the first scalable system for executing XPATH queries on such databases. The system maintains the logical view of the data as a single XML document, while physically the data is fragmented across any number of host nodes. For scalability, sensor data is stored close to the sensors, but can be cached elsewhere as dictated by the queries. Our design enables self-starting distributed queries that jump directly to the lowest common ancestor of the query result, dramatically reducing query response times. We present a novel query-evaluate-gather technique (using XSLT) for detecting (1) which data in a local database fragment is part of the query result, and (2) how to gather the missing parts. We define partitioning and cache invariants that ensure that even partial matches on cached data are exploited and that correct answers are returned, despite our dynamic, query-driven caching. Experimental results demonstrate that our techniques dramatically increase query throughput and decrease query response times in wide-area sensor databases.
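As a rough illustration of what "jumping directly to the lowest common ancestor" could mean, the sketch below finds the longest leading part of a query in which every step pins down exactly one element. It assumes, hypothetically, that elements in the hierarchy are identified by an `id` attribute; the element names, the sample query, and the helper `lca_prefix` are invented for illustration and are not taken from the system described above.

```python
import re

# A step that names exactly one element, assuming a hypothetical
# convention that elements are keyed by an id attribute: tag[@id='value'].
_ID_STEP = re.compile(r"^(\w+)\[@id='([^']+)'\]$")


def lca_prefix(query):
    """Return the leading (tag, id) steps that each select a single element.

    The element reached by this prefix is an ancestor of every possible
    result, so a query could be sent there first instead of to the root.
    """
    prefix = []
    for step in query.strip("/").split("/"):
        match = _ID_STEP.match(step)
        if match is None:
            break  # this step may match several elements, so the prefix ends
        prefix.append((match.group(1), match.group(2)))
    return prefix


# Hypothetical query over an invented hierarchy of regions and sensors.
query = (
    "/region[@id='NE']/city[@id='Pittsburgh']"
    "/neighborhood[@id='Oakland' or @id='Shadyside']/block/sensor"
)
print(lca_prefix(query))
```

The prefix stops at the `neighborhood` step because its predicate can match two elements. This is only a string-level approximation: a real implementation would need a proper XPath parser to handle predicates that contain slashes or nested expressions.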