ABSTRACT
Clickstream data collected at any web site (site-centric data) is inherently incomplete, since it does not capture users' browsing behavior across sites (user-centric data). Hence, models learned from such data may be subject to limitations, the nature of which has not been well studied. Understanding these limitations is particularly important since most current personalization techniques are based on site-centric data only. In this paper, we empirically examine the implications of learning from incomplete data in the context of two specific problems: (a) predicting whether the remainder of any given session will result in a purchase and (b) predicting whether a given user will make a purchase at any future session. For each of these problems we present new algorithms for fast and accurate preprocessing of clickstream data. Based on a comprehensive experiment on user-level clickstream data gathered from the browsing behavior of 20,000 users, we demonstrate that models built on user-centric data outperform models built on site-centric data for both prediction tasks.