Efficient identification of starters and followers in social media
Source: Extending Database Technology; Vol. 360
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Saint Petersburg, Russia
Session: Research sessions: Workflow techniques
Pages 708-719  
Year of Publication: 2009
ISBN:978-1-60558-422-5
Authors
Michael Mathioudakis  University of Toronto
Nick Koudas  University of Toronto
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1516360.1516442

ABSTRACT

Activity and user engagement in social media such as web logs, wikis, online forums or social networks have been increasing at unprecedented rates. In line with social behavior in various human activities, user activity in social media indicates the existence of individuals who consistently drive or stimulate 'discussions' in the online world. Such individuals are considered 'starters' of online discussions, in contrast with 'followers', who primarily engage in discussions and follow them.

In this paper, we formalize the notions of 'starters' and 'followers' in social media. Motivated by the sheer volume of available information on online social behavior, we focus on developing random sampling approaches that identify starters and followers efficiently. In our experimental section we use BlogScope, our social media warehousing platform under development at the University of Toronto. We demonstrate the scalability and accuracy of our sampling approaches using real data, establishing the practical utility of our techniques in a real social media warehousing environment.

