ABSTRACT
Email-based activity management systems promise to give users better tools for managing increasing volumes of email by organizing messages according to the user's activities. Current activity management systems do not automatically classify incoming messages by the activity to which they belong; instead, they rely on simple heuristics (such as message threads) or ask the user to classify each incoming message manually. This paper presents several algorithms for automatically recognizing emails as part of an ongoing activity. Our baseline methods are a naïve Bayes classifier and the use of message reply-to threads to determine activity membership. Our SimSubset and SimOverlap algorithms compare the people involved in an activity against the recipients of each incoming message. Our SimContent algorithm uses IRR (a variant of latent semantic indexing) to classify emails into activities using similarity based on message contents. An empirical evaluation shows that each of these methods provides a significant improvement over the baseline methods. In addition, we show that a combined approach that takes a vote over the predictions of the individual methods performs better than any individual method alone.
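To make the people-based comparison and the voting step concrete, the following is a minimal sketch in Python. The abstract does not define SimSubset, SimOverlap, or the voting rule, so every formula here is an assumption chosen for illustration: `overlap_score` uses a Jaccard-style ratio, `subset_score` uses the fraction of message participants already known to the activity, and `vote` is a simple plurality vote. All function names and the sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def overlap_score(activity_people, message_people):
    """Assumed overlap measure: |A & M| / |A | M| (Jaccard-style)."""
    a, m = set(activity_people), set(message_people)
    if not a or not m:
        return 0.0
    return len(a & m) / len(a | m)


def subset_score(activity_people, message_people):
    """Assumed subset measure: share of message participants in the activity."""
    a, m = set(activity_people), set(message_people)
    if not m:
        return 0.0
    return len(a & m) / len(m)


def classify(activities, message_people, score):
    """Return the activity with the highest positive score, or None."""
    best, best_score = None, 0.0
    for name, people in activities.items():
        s = score(people, message_people)
        if s > best_score:
            best, best_score = name, s
    return best


def vote(predictions):
    """Plurality vote over non-None predictions (hypothetical combiner)."""
    votes = Counter(p for p in predictions if p is not None)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical data: two activities and one incoming message.
activities = {
    "budget": {"ann", "bob", "cy"},
    "hiring": {"bob", "dee"},
}
message_people = {"ann", "bob"}

by_overlap = classify(activities, message_people, overlap_score)
by_subset = classify(activities, message_people, subset_score)
combined = vote([by_overlap, by_subset])
```

In this sketch a classifier that finds no match abstains by returning `None`, so it does not influence the vote. A content-based or thread-based predictor could be added to the list passed to `vote` in the same way.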