ABSTRACT
Collaborative filtering (CF) is valuable in e-commerce and for direct recommendations of music, movies, news, etc. But today's systems have several disadvantages, including privacy risks. As we move toward ubiquitous computing, there is great potential for individuals to share all kinds of information about places and things to do, see and buy, but the privacy risks are severe. In this paper we describe a new method for collaborative filtering which protects the privacy of individual data. The method is based on a probabilistic factor analysis model. Privacy protection is provided by a peer-to-peer protocol which is described elsewhere, but outlined in this paper. The factor analysis approach handles missing data without requiring default values for them. We report several experiments which suggest that this is the most accurate method for CF to date. The new algorithm also has advantages in speed and storage over previous algorithms. Finally, we suggest applications of the approach to other kinds of statistical analyses of survey or questionnaire data.
INDEX TERMS
Primary Classification:
H. Information Systems
H.3 INFORMATION STORAGE AND RETRIEVAL
H.3.3 Information Search and Retrieval
Subjects: Information filtering
Additional Classification:
D. Software
D.2 SOFTWARE ENGINEERING
D.2.8 Metrics
G. Mathematics of Computing
G.3 PROBABILITY AND STATISTICS
General Terms: Algorithms, Experimentation, Human Factors, Security
Keywords: CSCW, collaborative filtering, missing data, personalization, privacy, recommender systems, sparse, surveys