ACM Home Page
Please provide us with feedback. Feedback
Adversarial classification
Full text PdfPdf (215 KB)
Source International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining table of contents
Seattle, WA, USA
SESSION: Research track papers table of contents
Pages: 99 - 108  
Year of Publication: 2004
ISBN:1-58113-888-1
Authors
Nilesh Dalvi  University of Washington - Seattle, Seattle, WA
Pedro Domingos  University of Washington - Seattle, Seattle, WA
Mausam  University of Washington - Seattle, Seattle, WA
Sumit Sanghai  University of Washington - Seattle, Seattle, WA
Deepak Verma  University of Washington - Seattle, Seattle, WA
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 13,   Downloads (12 Months): 135,   Citation Count: 20
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1014052.1014066
What is a DOI?

ABSTRACT

Essentially all data mining algorithms assume that the data-generating process is independent of the data miner's activities. However, in many domains, including spam detection, intrusion detection, fraud detection, surveillance and counter-terrorism, this is far from the case: the data is actively manipulated by an adversary seeking to make the classifier produce false negatives. In these domains, the performance of a classifier can degrade rapidly after it is deployed, as the adversary learns to defeat it. Currently the only solution to this is repeated, manual, ad hoc reconstruction of the classifier. In this paper we develop a formal framework and algorithms for this problem. We view classification as a game between the classifier and the adversary, and produce a classifier that is optimal given the adversary's optimal strategy. Experiments in a spam detection domain show that this approach can greatly outperform a classifier learned in the standard way, and (within the parameters of the problem) automatically adapt the classifier to the adversary's evolving manipulations.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
3
4
 
5
 
6
D. Fudenberg and D. Levine. The Theory of Learning in Games. MIT Press, Cambridge, MA, 1999.
 
7
D. Fudenberg and J. Tirole. Game Theory. MIT Press, Cambridge, MA, 1991.
 
8
 
9
L. Guernsey. Retailers rise in Google rankings as rivals cry foul. New York Times, November 20, 2003.
10
11
 
12
M. Kearns. Computational game theory. Tutorial, Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, PA, 2002. http://www.cis.upenn.edu/~mkearns/nips02tutorial/.
 
13
 
14
B. Krebs. Online piracy spurs high-tech arms race. Washington Post, June 26, 2003.
 
15
M. L. Littman. Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the Eleventh International Conference on Machine Learning, pages 157--163, New Brunswick, NJ, 1994. Morgan Kaufmann.
 
16
B. Lloyd. Been gazumped by Google? Trying to make sense of the "Florida" update. Search Engine Guide, November 25, 2003.
17
 
18
A. McCallum and K. Nigam. A comparison of event models for Naive Bayes text classification. In Proceedings of the AAAI-98 Workshop on Learning for Text Categorization, Madison, WI, 1998. AAAI Press.
 
19
F. Nielsen. Email data, 2003. http://www.imm.dtu.dk/~rem/datasets/.
 
20
 
21
J. Rennie. Ifile spam classifier, 2003. http://www.nongnu.org/ifile/.
 
22
 
23
M. Sahami, S. Dumais, D. Heckerman, and E. Horvitz. A Bayesian approach to filtering junk e-mail. In Proceedings of the AAAI-98 Workshop on Learning for Text Categorization, Madison, WI, 1998. AAAI Press.
 
24
25
 
26
J. M. Smith. Evolution and the Theory of Games. Cambridge University Press, Cambridge, UK, 1982.
 
27
P. Turney. Cost-sensitive learning bibliography. Online bibliography, NRC Institute for Information Technology, Ottawa, Canada, 1998. http://members.rogers.com/peter.turney/bibliographies/cost-sensitive.html.
 
28
WordNet 2.0: A lexical database for the English language, 2003. http://www.cogsci.princeton.edu/~wn/.

CITED BY  20

Collaborative Colleagues:
Nilesh Dalvi: colleagues
Pedro Domingos: colleagues
Mausam: colleagues
Sumit Sanghai: colleagues
Deepak Verma: colleagues