|
ABSTRACT
In emerging applications such as location-based services, sensor monitoring and biological management systems, the values of the database items are naturally imprecise. For these uncertain databases, an important query is the Probabilistic k-Nearest-Neighbor Query (k-PNN), which computes the probabilities of sets of k objects for being the closest to a given query point. The evaluation of this query can be both computationally- and I/O-expensive, since there is an exponentially large number of k object-sets, and numerical integration is required. Often a user may not be concerned about the exact probability values. For example, he may only need answers that have sufficiently high confidence. We thus propose the Probabilistic Threshold k-Nearest-Neighbor Query (T-k-PNN), which returns sets of k objects that satisfy the query with probabilities higher than some threshold T. Three steps are proposed to handle this query efficiently. In the first stage, objects that cannot constitute an answer are filtered with the aid of a spatial index. The second step, called probabilistic candidate selection, significantly prunes a number of candidate sets to be examined. The remaining sets are sent for verification, which derives the lower and upper bounds of answer probabilities, so that a candidate set can be quickly decided on whether it should be included in the answer. We also examine spatially-efficient data structures that support these methods. Our solution can be applied to uncertain data with arbitrary probability density functions. We have also performed extensive experiments to examine the effectiveness of our methods.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
P. A. Sistla, O. Wolfson, S. Chamberlain, and S. Dao, "Querying the uncertain position of moving objects," in Temporal Databases: Research and Practice, 1998.
|
| |
2
|
D. Pfoser and C. Jensen, "Capturing the uncertainty of moving-objects representations," in Proc. SSDBM, 1999.
|
| |
3
|
Amol Deshpande , Carlos Guestrin , Samuel R. Madden , Joseph M. Hellerstein , Wei Hong, Model-driven data acquisition in sensor networks, Proceedings of the Thirtieth international conference on Very large data bases, p.588-599, August 31-September 03, 2004, Toronto, Canada
|
| |
4
|
|
| |
5
|
V. Ljosa and A. K. Singh, "APLA: Indexing arbitrary probability distributions," in Proc. ICDE, 2007.
|
 |
6
|
|
| |
7
|
J. Chen and R. Cheng, "Efficient evaluation of imprecise location-dependent queries," in Proc. ICDE, 2007.
|
| |
8
|
|
| |
9
|
|
| |
10
|
|
| |
11
|
Parag Agrawal , Omar Benjelloun , Anish Das Sarma , Chris Hayworth , Shubha Nabar , Tomoe Sugihara , Jennifer Widom, Trio: a system for data, uncertainty, and lineage, Proceedings of the 32nd international conference on Very large data bases, September 12-15, 2006, Seoul, Korea
|
| |
12
|
|
| |
13
|
|
| |
14
|
|
| |
15
|
Nick Koudas , Beng Chin Ooi , Kian-Lee Tan , Rui Zhang, Approximate NN queries on streams with guaranteed error/performance bounds, Proceedings of the Thirtieth international conference on Very large data bases, p.804-815, August 31-September 03, 2004, Toronto, Canada
|
| |
16
|
|
| |
17
|
|
| |
18
|
|
| |
19
|
|
 |
20
|
Sarvjeet Singh , Chris Mayfield , Sagar Mittal , Sunil Prabhakar , Susanne Hambrusch , Rahul Shah, Orion 2.0: native support for uncertain data, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, June 09-12, 2008, Vancouver, Canada
[doi> 10.1145/1376616.1376744]
|
| |
21
|
Singh et al, "Database support for pdf attributes," in Proc. ICDE, 2008.
|
| |
22
|
|
| |
23
|
H. Kriegel, P. Kunath, and M. Renz, "Probabilistic nearest-neighbor query on uncertain objects," in DASFAA, 2007.
|
| |
24
|
Y. Qi, S. Singh, R. Shah, and S. Prabhakar, "Indexing probabilistic nearest-neighbor threshold queries," in Proc. Workshop on Management of Uncertain Data, 2008.
|
| |
25
|
|
| |
26
|
R. Cheng, J. Chen, M. Mokbel, and C. Chow, "Probabilistic verifiers: Evaluating constrained nearest-neighbor queries over uncertain data," in Proc. ICDE, 2008.
|
| |
27
|
Yufei Tao , Reynold Cheng , Xiaokui Xiao , Wang Kay Ngai , Ben Kao , Sunil Prabhakar, Indexing multi-dimensional uncertain data with arbitrary probability density functions, Proceedings of the 31st international conference on Very large data bases, August 30-September 02, 2005, Trondheim, Norway
|
| |
28
|
|
 |
29
|
|
| |
30
|
M. Soliman, I. Ilyas, and K. Chang, "Top-k query processing in uncertain databases," in Proc. ICDE, 2007.
|
 |
31
|
|
| |
32
|
|
| |
33
|
|
| |
34
|
|
 |
35
|
|
|