ABSTRACT
Spam is a growing problem; it interferes with valid email and burdens both email users and service providers. In this work, we propose a reactive spam-filtering system based on reporter reputation for use in conjunction with existing spam-filtering techniques. The system has a trust-maintenance component for users, based on their spam-reporting behavior. The challenge that we consider is that of maintaining a reliable system, not vulnerable to malicious users, that will provide early spam-campaign detection to reduce the costs incurred by users and systems. We report on the utility of a reputation system for spam filtering that makes use of the feedback of trustworthy users. We evaluate our proposed framework using actual complaint feedback from a large population of users, and validate its spam-filtering performance on a collection of real email traffic over several weeks. To test the broader implications of the system, we create a model of the behavior of malicious reporters, and we simulate the system under various assumptions using a synthetic dataset.