Automatically classifying emails into activities
Source: International Conference on Intelligent User Interfaces
Proceedings of the 11th International Conference on Intelligent User Interfaces
Sydney, Australia
SESSION: Personal assistants I
Pages: 70-77
Year of Publication: 2006
ISBN: 1-59593-287-9
Authors
Mark Dredze, University of Pennsylvania, Philadelphia, PA
Tessa Lau, IBM Almaden Research Center, San Jose, CA
Nicholas Kushmerick, University College Dublin, Dublin, Ireland
Sponsors
SIGCHI: ACM Special Interest Group on Computer-Human Interaction
ACM: Association for Computing Machinery
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 12,   Downloads (12 Months): 155,   Citation Count: 15
DOI: http://doi.acm.org/10.1145/1111449.1111471

ABSTRACT

Email-based activity management systems promise to give users better tools for managing increasing volumes of email by organizing email according to a user's activities. Current activity management systems do not automatically classify incoming messages by the activity to which they belong; instead, they rely on simple heuristics (such as message threads) or ask the user to classify incoming messages manually. This paper presents several algorithms for automatically recognizing emails as part of an ongoing activity. Our baseline methods are the use of message reply-to threads to determine activity membership and a naïve Bayes classifier. Our SimSubset and SimOverlap algorithms compare the people involved in an activity against the recipients of each incoming message. Our SimContent algorithm uses IRR (a variant of latent semantic indexing) to classify emails into activities using similarity based on message contents. An empirical evaluation shows that each of these methods provides a significant improvement over the baseline methods. In addition, we show that a combined approach that votes on the predictions of the individual methods performs better than any individual method alone.
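
The abstract does not give the scoring formulas, so the following is only a rough sketch of the general idea behind people-based matching and voting, not the paper's SimSubset, SimOverlap, or combination method. The function names (subset_score, overlap_score, best_activity, vote), the scoring rules, and the sample activities are all assumptions made for illustration.

```python
from collections import Counter


def subset_score(activity_people, message_people):
    """Assumed rule: 1.0 if everyone on the message already belongs to the activity."""
    message_people = set(message_people)
    if not message_people:
        return 0.0
    return 1.0 if message_people <= set(activity_people) else 0.0


def overlap_score(activity_people, message_people):
    """Assumed rule: fraction of the message's people already in the activity."""
    message_people = set(message_people)
    if not message_people:
        return 0.0
    return len(message_people & set(activity_people)) / len(message_people)


def best_activity(score_fn, activities, message_people):
    """Return the highest-scoring activity name, or None if every score is zero."""
    best_name, best = None, 0.0
    for name, people in activities.items():
        score = score_fn(people, message_people)
        if score > best:
            best_name, best = name, score
    return best_name


def vote(predictions):
    """Plurality vote over per-method predictions; None entries abstain."""
    counts = Counter(p for p in predictions if p is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical activities, each described by the set of people involved so far.
activities = {
    "budget": {"alice", "bob", "carol"},
    "hiring": {"alice", "dave"},
}
message_people = {"alice", "dave"}

predictions = [
    best_activity(subset_score, activities, message_people),
    best_activity(overlap_score, activities, message_people),
]
print(vote(predictions))
```

In this sketch a method that finds no matching activity returns None and simply abstains from the vote, which is one plausible way to let a strict matcher and a more lenient one complement each other.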


