ABSTRACT
Recommender systems have been evaluated in many, often incomparable, ways. In this article, we review the key decisions in evaluating collaborative filtering recommender systems: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole. In addition to reviewing the evaluation strategies used by prior researchers, we present empirical results from the analysis of various accuracy metrics on one content domain where all the tested metrics collapsed roughly into three equivalence classes. Metrics within each equivalence class were strongly correlated, while metrics from different equivalence classes were uncorrelated.
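As a rough illustration of what it means for two accuracy metrics to be correlated, the sketch below scores several recommenders with two metrics and computes the Pearson correlation between the resulting score lists. It is a minimal sketch under stated assumptions, not the analysis performed in the article: the choice of mean absolute error (MAE) and root mean squared error (RMSE), the ratings, and the four systems are all hypothetical and chosen only for illustration.

```python
import math

def mae(pred, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)

def rmse(pred, actual):
    """Root mean squared error between predicted and actual ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical held-out ratings on a 1-5 scale (made up for this example).
actual = [4, 2, 5, 3, 1]

# Hypothetical predictions from four made-up recommenders.
systems = {
    "A": [3.5, 2.5, 4.5, 3.0, 1.5],
    "B": [3.0, 2.0, 4.0, 4.0, 3.0],
    "C": [2.0, 4.0, 3.0, 3.0, 3.0],
    "D": [4.0, 2.0, 5.0, 3.0, 4.0],
}

# Score every system with both metrics, keeping the same system order.
mae_scores = [mae(pred, actual) for pred in systems.values()]
rmse_scores = [rmse(pred, actual) for pred in systems.values()]

# A value near 1 would suggest the two metrics rank systems similarly.
r = pearson(mae_scores, rmse_scores)
print(f"Pearson correlation between MAE and RMSE: {r:.3f}")
```

With only four toy systems the coefficient carries no statistical weight; the point is the procedure. Repeating it for every pair of metrics over many systems would give a correlation matrix from which groups of mutually correlated metrics could be read off.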