Exploiting contexts to deal with uncertainty in classification
Source
International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the 1st ACM SIGKDD Workshop on Knowledge Discovery from Uncertain Data table of contents
Paris, France
Pages 19-22  
Year of Publication: 2009
ISBN:978-1-60558-675-5
Authors
Bianca Zadrozny  Fluminense Fed. Univ., Niterói, Brazil
Gisele L. Pappa  Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil
Wagner Meira, Jr.  Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil
Marcos André Gonçalves  Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil
Leonardo Rocha  Fed. Univ. São João Del Rei, São João Del Rei, Brazil
Thiago Salles  Fed. Univ. of Minas Gerais, Belo Horizonte, Brazil
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1610555.1610558

ABSTRACT

Uncertainty is often inherent in data, yet only a few data mining algorithms handle it. In this paper we focus on how to account for uncertainty in classification algorithms, in particular when the attributes of the data should not be considered completely truthful for classifying a given sample. Our starting point is that each piece of data comes from a potentially different context; by estimating the context probabilities of an unknown sample, we may derive a weight that quantifies the influence of each context. We propose a lazy classification strategy that incorporates this uncertainty into both the training and the use of classifiers. We also propose uK-NN, an extension of traditional K-NN that implements our approach. Finally, we illustrate uK-NN, which is currently being evaluated experimentally, with a toy document classification example.
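
The abstract does not give the uK-NN formulation, so the following Python sketch is only an illustration of the general idea of weighting neighbor votes by context probability; it is not the authors' algorithm. Everything in it is an assumption made for the example: the function names `cosine` and `context_weighted_knn`, the context labels `period_1` and `period_2`, the toy documents, and the choice to use the context probability directly as the vote weight.

```python
from collections import Counter, defaultdict
import math


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def context_weighted_knn(train, query, context_probs, k=3):
    """Classify `query` from its k most similar training documents.

    train: list of (Counter, label, context) tuples.
    context_probs: dict mapping context -> estimated probability that the
        query belongs to that context (assumed to be given here).
    Each neighbor's vote is weighted by the probability of its context
    (an assumption of this sketch, not a formula from the paper).
    """
    neighbors = sorted(train, key=lambda ex: cosine(ex[0], query), reverse=True)[:k]
    votes = defaultdict(float)
    for _terms, label, context in neighbors:
        votes[label] += context_probs.get(context, 0.0)
    return max(votes, key=votes.get), dict(votes)


# Hypothetical toy corpus: (text, class label, context).
raw = [
    ("apple pie recipe", "food", "period_1"),
    ("apple fruit harvest", "food", "period_1"),
    ("apple iphone launch", "tech", "period_2"),
    ("macbook laptop review", "tech", "period_2"),
]
train = [(Counter(text.split()), label, ctx) for text, label, ctx in raw]
query = Counter("apple harvest launch".split())

# Query judged more likely to come from period_2.
weighted_label, weighted_votes = context_weighted_knn(
    train, query, {"period_1": 0.2, "period_2": 0.8}
)

# Equal context probabilities reduce to plain majority voting.
uniform_label, uniform_votes = context_weighted_knn(
    train, query, {"period_1": 0.5, "period_2": 0.5}
)
```

In this toy setup two of the three nearest neighbors are labeled `food`, so equal context probabilities yield `food`, while assigning most of the probability to `period_2` shifts the decision to `tech`. How the context probabilities are estimated, and whether they also affect neighbor selection, is left open here.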

