ACM Home Page
Please provide us with feedback. Feedback
Evaluating probability threshold k-nearest-neighbor queries over uncertain data
Full text PdfPdf (906 KB)
Source Extending Database Technology; Vol. 360 archive
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology table of contents
Saint Petersburg, Russia
SESSION: Research sessions: Uncertainty table of contents
Pages 672-683  
Year of Publication: 2009
ISBN:978-1-60558-422-5
Authors
Reynold Cheng  The University of Hong Kong
Lei Chen  Hong Kong Univ. of Sci. & Tech.
Jinchuan Chen  Hong Kong Polytechnic University
Xike Xie  The University of Hong Kong
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 23,   Downloads (12 Months): 166,   Citation Count: 0
Additional Information:

abstract   references   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1516360.1516438
What is a DOI?

ABSTRACT

In emerging applications such as location-based services, sensor monitoring and biological management systems, the values of the database items are naturally imprecise. For these uncertain databases, an important query is the Probabilistic k-Nearest-Neighbor Query (k-PNN), which computes the probabilities of sets of k objects for being the closest to a given query point. The evaluation of this query can be both computationally- and I/O-expensive, since there is an exponentially large number of k object-sets, and numerical integration is required. Often a user may not be concerned about the exact probability values. For example, he may only need answers that have sufficiently high confidence. We thus propose the Probabilistic Threshold k-Nearest-Neighbor Query (T-k-PNN), which returns sets of k objects that satisfy the query with probabilities higher than some threshold T. Three steps are proposed to handle this query efficiently. In the first stage, objects that cannot constitute an answer are filtered with the aid of a spatial index. The second step, called probabilistic candidate selection, significantly prunes a number of candidate sets to be examined. The remaining sets are sent for verification, which derives the lower and upper bounds of answer probabilities, so that a candidate set can be quickly decided on whether it should be included in the answer. We also examine spatially-efficient data structures that support these methods. Our solution can be applied to uncertain data with arbitrary probability density functions. We have also performed extensive experiments to examine the effectiveness of our methods.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
P. A. Sistla, O. Wolfson, S. Chamberlain, and S. Dao, "Querying the uncertain position of moving objects," in Temporal Databases: Research and Practice, 1998.
 
2
D. Pfoser and C. Jensen, "Capturing the uncertainty of moving-objects representations," in Proc. SSDBM, 1999.
 
3
 
4
 
5
V. Ljosa and A. K. Singh, "APLA: Indexing arbitrary probability distributions," in Proc. ICDE, 2007.
6
 
7
J. Chen and R. Cheng, "Efficient evaluation of imprecise location-dependent queries," in Proc. ICDE, 2007.
 
8
 
9
 
10
 
11
 
12
 
13
 
14
 
15
 
16
 
17
 
18
 
19
20
 
21
Singh et al, "Database support for pdf attributes," in Proc. ICDE, 2008.
 
22
 
23
H. Kriegel, P. Kunath, and M. Renz, "Probabilistic nearest-neighbor query on uncertain objects," in DASFAA, 2007.
 
24
Y. Qi, S. Singh, R. Shah, and S. Prabhakar, "Indexing probabilistic nearest-neighbor threshold queries," in Proc. Workshop on Management of Uncertain Data, 2008.
 
25
 
26
R. Cheng, J. Chen, M. Mokbel, and C. Chow, "Probabilistic verifiers: Evaluating constrained nearest-neighbor queries over uncertain data," in Proc. ICDE, 2008.
 
27
 
28
29
 
30
M. Soliman, I. Ilyas, and K. Chang, "Top-k query processing in uncertain databases," in Proc. ICDE, 2007.
31
 
32
 
33
 
34
35
Collaborative Colleagues:
Reynold Cheng: colleagues
Lei Chen: colleagues
Jinchuan Chen: colleagues
Xike Xie: colleagues