Ranking distributed probabilistic data
Source
Proceedings of the 35th SIGMOD International Conference on Management of Data
Providence, Rhode Island, USA
Session: Research session 10: probabilistic databases I
Pages 361-374
Year of Publication: 2009
ISBN: 978-1-60558-551-2
Authors
Feifei Li, Florida State University, Tallahassee, FL, USA
Ke Yi, Hong Kong University of Science and Technology, Hong Kong, China
Jeffrey Jestes, Florida State University, Tallahassee, FL, USA
ABSTRACT
Ranking queries are essential tools for processing large amounts of probabilistic data, which encode exponentially many possible deterministic instances. In many applications where uncertainty and fuzzy information arise, data are collected from multiple sources at distributed, networked locations, e.g., distributed sensor fields with imprecise measurements or multiple scientific institutes whose data are mutually inconsistent. Because of network delay and the economic cost of communicating large amounts of data over a network, a fundamental problem in these scenarios is to retrieve the global top-k tuples from all distributed sites with minimum communication cost. Using the well-founded notion of the expected rank of each tuple across all possible worlds as the basis of ranking, this work designs algorithms that are both communication- and computation-efficient for retrieving the top-k tuples with the smallest expected ranks from distributed sites. Extensive experiments on both synthetic and real data sets confirm the efficiency of our algorithms and their superiority over the straightforward approach of forwarding all data to the server.
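As a rough illustration of the ranking notion and of the baseline mentioned above, the following Python sketch computes expected ranks by enumerating every possible world and then answers a top-k query the straightforward way, by forwarding all data to one place. It is not the paper's algorithm. The uncertainty model (each tuple has an independent discrete score distribution), the rank definition (number of tuples with a strictly higher score in a world), the tie-breaking rule, and the names `expected_ranks`, `naive_top_k`, and `sites` are all assumptions made for this example.

```python
from itertools import product


def expected_ranks(tuples):
    """Expected rank of each tuple over all possible worlds.

    `tuples` maps a tuple id to a list of (score, probability) pairs.
    Assumption: tuples are independent, and the rank of a tuple in one
    world is the number of tuples with a strictly higher score.
    """
    ids = list(tuples)
    ranks = {tid: 0.0 for tid in ids}
    # A possible world fixes one (score, probability) choice per tuple,
    # so the number of worlds grows exponentially with the number of tuples.
    for world in product(*(tuples[tid] for tid in ids)):
        world_prob = 1.0
        for _, prob in world:
            world_prob *= prob
        scores = [score for score, _ in world]
        for i, tid in enumerate(ids):
            higher = sum(1 for s in scores if s > scores[i])
            ranks[tid] += world_prob * higher
    return ranks


def naive_top_k(sites, k):
    """Baseline: ship every tuple from every site to the server, then rank.

    Assumption: tuple ids are unique across sites.
    """
    merged = {}
    for local_tuples in sites.values():
        merged.update(local_tuples)
    ranks = expected_ranks(merged)
    # Smaller expected rank is better; ties are broken by tuple id.
    return sorted(ranks, key=lambda tid: (ranks[tid], tid))[:k]


# Hypothetical toy data spread over two sites.
sites = {
    "site_a": {"t1": [(100, 0.5), (70, 0.5)]},
    "site_b": {"t2": [(90, 1.0)], "t3": [(80, 0.5), (60, 0.5)]},
}

print(naive_top_k(sites, 2))  # ['t2', 't1']
```

In this toy input the expected ranks are 0.75 for t1, 0.5 for t2, and 1.75 for t3. The enumeration is exponential in the number of tuples and the baseline transmits every tuple, which is the kind of cost the abstract's communication- and computation-efficient algorithms aim to avoid.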