Learning to detect phishing emails
Source
International World Wide Web Conference archive
Proceedings of the 16th international conference on World Wide Web table of contents
Banff, Alberta, Canada
SESSION: Passwords and phishing table of contents
Pages: 649 - 656  
Year of Publication: 2007
ISBN: 978-1-59593-654-7
Authors
Ian Fette  Carnegie Mellon University
Norman Sadeh  Carnegie Mellon University
Anthony Tomasic  Carnegie Mellon University
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 37,   Downloads (12 Months): 339,   Citation Count: 8
DOI: http://doi.acm.org/10.1145/1242572.1242660

ABSTRACT

Each month, more attacks are launched with the aim of making web users believe that they are communicating with a trusted entity, for the purpose of stealing account information, logon credentials, and identity information in general. This attack method, commonly known as "phishing," is most often initiated by sending emails with links to spoofed websites that harvest information. We present a method for detecting these attacks, which in its most general form is an application of machine learning to a feature set designed to highlight user-targeted deception in electronic communication. This method is applicable, with slight modification, to the detection of phishing websites or of the emails used to direct victims to these sites. We evaluate this method on a set of approximately 860 phishing emails and 6950 non-phishing emails, and correctly identify over 96% of the phishing emails while misclassifying only on the order of 0.1% of the legitimate emails. We conclude with thoughts on the future of such techniques for identifying deception, specifically with respect to the evolutionary nature of the attacks and the information available.
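
To make the reported figures concrete, the following minimal Python sketch shows how a detection rate and a false-positive rate are computed from raw counts. The corpus sizes (860 and 6950) come from the abstract and are themselves approximate; the counts `true_pos` and `false_pos` are assumptions chosen only to be consistent with "over 96%" and "on the order of 0.1%", and the function name `rates` is hypothetical. It is an illustration of the arithmetic, not the authors' evaluation code.

```python
def rates(n_phish, n_legit, true_pos, false_pos):
    """Return (detection rate, false-positive rate) as fractions.

    true_pos:  phishing emails correctly flagged as phishing
    false_pos: legitimate emails wrongly flagged as phishing
    """
    return true_pos / n_phish, false_pos / n_legit


# Corpus sizes as stated in the abstract (both approximate).
N_PHISH, N_LEGIT = 860, 6950

# Assumed counts, picked to match the rounded rates in the abstract;
# the abstract does not give the exact numbers.
true_pos = 826
false_pos = 7

tpr, fpr = rates(N_PHISH, N_LEGIT, true_pos, false_pos)
print(f"detection rate:      {tpr:.2%}")
print(f"false-positive rate: {fpr:.2%}")
```

Under these assumed counts, roughly 34 phishing emails would be missed and about 7 legitimate emails would be flagged, which shows why both rates matter when legitimate mail greatly outnumbers phishing mail.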

