ABSTRACT
Electronic commerce is revolutionizing the way we think about data modeling by making it possible to integrate the processes of (costly) data acquisition and model induction. The opportunity for improving modeling through costly data acquisition presents itself for a diverse set of electronic commerce modeling tasks, from personalization to customer lifetime value modeling; we illustrate with the running example of choosing offers to display to web-site visitors, which captures important aspects in a familiar setting. Considering data acquisition costs explicitly can allow the building of predictive models at significantly lower cost, and a modeler may be able to improve performance via new sources of information that previously were too expensive to consider. However, existing techniques for integrating modeling and data acquisition cannot deal with the rich environment that electronic commerce presents. We discuss several possible data acquisition settings, the challenges involved in integrating them with modeling, and various research areas that may supply parts of an ultimate solution. We also present and briefly demonstrate a unified framework within which one can integrate acquisitions of different types, with any cost structure and any predictive modeling objective.