Golden Path Analyzer: using divide-and-conquer to cluster Web clickstreams
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Washington, D.C.
SESSION: Industrial/government track
Pages: 349 - 358  
Year of Publication: 2003
ISBN: 1-58113-737-0
Authors
Kamal Ali  San Mateo, CA
Steven P. Ketchpel  San Mateo, CA
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 66,   Citation Count: 1
DOI: http://doi.acm.org/10.1145/956750.956791

ABSTRACT

This paper describes a novel algorithm and deployed system, Golden Path Analyzer (GPA), that analyzes the clickstreams of people trying to complete the same task on a website. It finds the shortest successful paths taken by users ('golden paths') and uses these as seeds for clickstream clusters. Other users are assigned to a cluster if their clickstream is a supersequence of its golden path. The advantages of this approach are that the resulting clusters are easily comprehended, are few in number, correspond to semantically different strategies used by the users, and jointly partition all the clickstreams. GPA's key contribution over prior work on process funnels is that, by not excluding users who make diversions from the golden path, it can assign more users to fewer clusters. Another key contribution is its use of actual full clickstreams as cluster seeds, to which other users' supersequence clickstreams are added. Golden paths thus correspond to complete clickstreams based on actual user page transitions. GPA is particularly useful for site designers seeking to improve processes such as shopping, returns, and registration. Its analyses identify which web pages cause many users to deviate from a golden path, which links distract users, and the percentage of users taking each golden path. GPA has demonstrated value on more than twenty client projects in diverse industries.
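The supersequence rule in the abstract can be illustrated with a small Python sketch. This is not the paper's implementation: the greedy seed selection (repeatedly taking the shortest clickstream not yet assigned), the function names, and the sample page names are all assumptions made for illustration, and every input clickstream is assumed to have ended in task success.

```python
from typing import Dict, List, Sequence, Tuple

Path = Tuple[str, ...]


def is_supersequence(stream: Sequence[str], golden: Sequence[str]) -> bool:
    """True if the pages of `golden` occur in `stream` in order (diversions allowed)."""
    it = iter(stream)
    # `page in it` advances the iterator, so order is enforced.
    return all(page in it for page in golden)


def cluster_by_golden_paths(streams: List[Sequence[str]]) -> Dict[Path, List[Path]]:
    """Greedy sketch (an assumption, not the published algorithm):
    seed a cluster with the shortest unassigned stream, then assign every
    unassigned stream that is a supersequence of that seed."""
    remaining = sorted((tuple(s) for s in streams), key=len)  # stable sort
    clusters: Dict[Path, List[Path]] = {}
    while remaining:
        golden = remaining[0]
        clusters[golden] = [s for s in remaining if is_supersequence(s, golden)]
        remaining = [s for s in remaining if not is_supersequence(s, golden)]
    return clusters


# Hypothetical successful clickstreams for a checkout task.
streams = [
    ["home", "search", "item", "cart", "checkout"],
    ["home", "item", "cart", "checkout"],
    ["home", "search", "help", "item", "cart", "checkout"],
    ["home", "deals", "cart", "checkout"],
    ["home", "deals", "faq", "deals", "cart", "checkout"],
]

clusters = cluster_by_golden_paths(streams)
for golden, members in clusters.items():
    share = 100 * len(members) / len(streams)
    print(" > ".join(golden), f"({len(members)} users, {share:.0f}%)")
```

Because each seed is a supersequence of itself, every clickstream lands in exactly one cluster, which mirrors the abstract's claim that the clusters jointly partition the clickstreams. Users who detour through pages such as search or help still fall into the cluster of the shorter path they extend.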



