Efficient computation of personal aggregate queries on blogs
Source
International Conference on Knowledge Discovery and Data Mining
Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Las Vegas, Nevada, USA
SESSION: Research papers
Pages 632-640
Year of Publication: 2008
ISBN: 978-1-60558-193-4
Authors
Ka Cheung Sia
University of California, Los Angeles, Los Angeles, CA, USA

Junghoo Cho
University of California, Los Angeles, Los Angeles, CA, USA

Yun Chi
NEC Labs America, Cupertino, CA, USA

Belle L. Tseng
Yahoo! Inc., Sunnyvale, CA, USA
Bibliometrics
Downloads (6 Weeks): 40, Downloads (12 Months): 286, Citation Count: 0
ABSTRACT
The amount of user-generated content on the Web is exploding due to the emergence of "Web 2.0" services such as Blogger, MySpace, Flickr, and del.icio.us. The participation of a large number of users in sharing their opinions on the Web has inspired researchers to build an effective "information filter" by aggregating these independent opinions. However, given the diverse groups of users on the Web nowadays, a global aggregation of the information may not be of much interest to different groups of users. In this paper, we explore the possibility of computing personalized aggregation over the opinions expressed on the Web, based on a user's indication of trust in the information sources. The hope is that by employing such "personalized" aggregation, we can make the recommendations more likely to be interesting to the users. We address the challenging scalability issues by proposing an efficient method that uses two core techniques, Non-Negative Matrix Factorization and the Threshold Algorithm, to compute personalized aggregations when there are potentially millions of users and millions of sources within a system. Through experiments on a real-life dataset, we show that our personalized aggregation approach indeed makes a significant difference in the items that are recommended, and that it reduces the query computational cost significantly, often by more than 75%, while keeping the result of personalized aggregation accurate enough.
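To make the idea concrete, the following is a minimal sketch of how a top-n personalized aggregate could be answered with a Threshold Algorithm once a low-rank factorization is available. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how the two techniques are combined, so the sketch assumes that a non-negative factorization has already reduced each user's trust over sources to a few non-negative weights over latent groups, and that each item has a precomputed score per latent group. The names `group_scores`, `weights`, and all function names are hypothetical.

```python
import heapq


def score_item(weights, group_scores, item):
    """Personalized score: weighted sum of the item's per-group scores."""
    return sum(w * group_scores[k][item] for k, w in enumerate(weights))


def build_sorted_lists(group_scores):
    """Per latent group, (score, item) pairs in descending score order."""
    return [sorted(((s, i) for i, s in enumerate(row)), reverse=True)
            for row in group_scores]


def threshold_top_n(weights, group_scores, sorted_lists, n):
    """Threshold Algorithm for a weighted sum with non-negative weights.

    Returns the top-n (score, item) pairs and the number of items scored.
    """
    num_items = len(group_scores[0])
    seen = set()
    top = []  # min-heap holding the best n (score, item) pairs so far
    for depth in range(num_items):
        # Sorted access: read the next entry of every per-group list.
        for lst in sorted_lists:
            item = lst[depth][1]
            if item in seen:
                continue
            seen.add(item)
            # Random access: compute the item's full personalized score.
            score = score_item(weights, group_scores, item)
            if len(top) < n:
                heapq.heappush(top, (score, item))
            elif score > top[0][0]:
                heapq.heapreplace(top, (score, item))
        # No unseen item can score above this bound, because every
        # unseen item sits below the current depth in every list.
        threshold = sum(w * lst[depth][0]
                        for w, lst in zip(weights, sorted_lists))
        if len(top) == n and top[0][0] >= threshold:
            break
    return sorted(top, reverse=True), len(seen)


def brute_force_top_n(weights, group_scores, n):
    """Reference answer: score every item, then sort."""
    num_items = len(group_scores[0])
    scored = [(score_item(weights, group_scores, i), i)
              for i in range(num_items)]
    return sorted(scored, reverse=True)[:n]


# Hypothetical toy data: 2 latent groups, 5 items.
group_scores = [
    [9.0, 1.0, 5.0, 0.5, 3.0],
    [0.0, 8.0, 4.0, 7.0, 1.0],
]
sorted_lists = build_sorted_lists(group_scores)
user_weights = [0.9, 0.1]  # this user mostly trusts the first group
top, scored = threshold_top_n(user_weights, group_scores, sorted_lists, 2)
print([item for _, item in top], "items scored:", scored)
```

The sorted lists depend only on the item scores, so they can be built once and shared by all users; only the weight vector changes per query. The early stop is valid because the weights are non-negative, which makes the weighted sum monotone in each per-group score.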