Separating the swarm: categorization methods for user sessions on the web
Source: Conference on Human Factors in Computing Systems
Proceedings of the SIGCHI conference on Human factors in computing systems: Changing our world, changing ourselves
Minneapolis, Minnesota, USA
SESSION: Web Behavior Patterns
Pages: 243-250
Year of Publication: 2002
ISBN:1-58113-453-3
Authors
Jeffrey Heer, Xerox Palo Alto Research Center, Palo Alto CA
Ed H. Chi, Xerox Palo Alto Research Center, Palo Alto CA
Sponsor
SIGCHI: ACM Special Interest Group on Computer-Human Interaction
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 8,   Downloads (12 Months): 78,   Citation Count: 11
DOI: http://doi.acm.org/10.1145/503376.503420

ABSTRACT

Understanding user behaviors on Web sites enables site owners to make sites more usable, ultimately helping users to achieve their goals more quickly. Accordingly, researchers have devised methods for categorizing user sessions in hopes of revealing user interests. These techniques build user profiles by combining users' navigation paths with other data features, such as page viewing time, hyperlink structure, and page content. Previously, we have presented complex techniques of combining many of these data features to cluster user profiles. In this paper, we introduce a user study and a systematic evaluation of these different data features and their associated weighting schemes. We present the results of our study, including accuracy measures for a number of clustering approaches, and offer recommendations for Web analysts. While further investigation over more sites is needed to definitively settle on a robust scheme, we have characterized this analytic space.


