What's in a session: tracking individual behavior on the web
Source
Conference on Hypertext and Hypermedia
Proceedings of the 20th ACM Conference on Hypertext and Hypermedia
Torino, Italy
Session: Tracking and exploiting user behavior
Pages 173-182
Year of Publication: 2009
ISBN: 978-1-60558-486-7
Authors
Mark Meiss  Indiana University, Bloomington, IN, USA
John Duncan  Indiana University, Bloomington, IN, USA
Bruno Gonçalves  Indiana University, Bloomington, IN, USA
José J. Ramasco  ISI Foundation, Torino, Italy
Filippo Menczer  Indiana University, Bloomington, IN, USA, and ISI Foundation, Torino, Italy
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557914.1557946

ABSTRACT

We examine the properties of all HTTP requests generated by a thousand undergraduates over a span of two months. Preserving user identity in the data set allows us to discover novel properties of Web traffic that directly affect models of hypertext navigation. We find that the popularity of Web sites (the number of users who contribute to their traffic) lacks any intrinsic mean and may be unbounded. Further, many aspects of the browsing behavior of individual users can be approximated by log-normal distributions even though their aggregate behavior is scale-free. Finally, we show that users' click streams cannot be cleanly segmented into sessions using timeouts, affecting any attempt to model hypertext navigation using statistics of individual sessions. We propose a strictly logical definition of sessions based on browsing activity as revealed by referrer URLs; a user may have several active sessions in their click stream at any one time. We demonstrate that applying a timeout to these logical sessions affects their statistics to a lesser extent than a purely timeout-based mechanism.
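
The contrast between referrer-based and timeout-based sessions can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not spell out; it assumes one plausible rule: a request with no referrer (or an unseen referrer) starts a new session, and any other request joins the session that most recently requested its referrer URL. The names `Request`, `logical_sessions`, and `timeout_sessions` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    timestamp: float          # seconds since an arbitrary origin
    url: str                  # requested URL
    referrer: Optional[str]   # None when the request carries no referrer


def logical_sessions(requests, timeout=None):
    """Group one user's requests into referrer-linked sessions (assumed rule).

    If `timeout` is given, a request also opens a new session when the
    session it would join has been idle for longer than `timeout` seconds.
    """
    sessions = []         # each session is a list of Request objects
    session_of_url = {}   # url -> index of the session that last requested it
    for req in sorted(requests, key=lambda r: r.timestamp):
        idx = session_of_url.get(req.referrer) if req.referrer else None
        if idx is not None and timeout is not None:
            idle = req.timestamp - sessions[idx][-1].timestamp
            if idle > timeout:
                idx = None
        if idx is None:
            sessions.append([])
            idx = len(sessions) - 1
        sessions[idx].append(req)
        session_of_url[req.url] = idx
    return sessions


def timeout_sessions(requests, timeout):
    """Baseline: cut the click stream wherever consecutive requests are
    separated by more than `timeout` seconds, ignoring referrers."""
    sessions = []
    last = None
    for req in sorted(requests, key=lambda r: r.timestamp):
        if last is None or req.timestamp - last > timeout:
            sessions.append([])
        sessions[-1].append(req)
        last = req.timestamp
    return sessions


# Two interleaved browsing threads (made-up URLs), one resumed much later.
clicks = [
    Request(0, "a.example/", None),
    Request(10, "x.example/", None),
    Request(20, "a.example/page", "a.example/"),
    Request(30, "x.example/page", "x.example/"),
    Request(5000, "a.example/next", "a.example/page"),
]
```

On this toy click stream, the referrer rule keeps the two interleaved threads apart, whereas the pure timeout merges them into one session and splits only at the long pause. The stream is invented for illustration and says nothing about the statistics reported in the paper.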


