ACM Home Page
Please provide us with feedback. Feedback
Large-scale behavioral targeting
Full text MovMov (24:03),  PdfPdf (867 KB)
Source
International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining table of contents
Paris, France
SESSION: Research track papers table of contents
Pages 209-218  
Year of Publication: 2009
ISBN:978-1-60558-495-9
Authors
Ye Chen  eBay Inc., San Jose, CA, USA
Dmitry Pavlov  Yandex Labs, Burlingame, CA, USA
John F. Canny  University of California, Berkeley, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 58,   Downloads (12 Months): 235,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1557019.1557048
What is a DOI?

ABSTRACT

Behavioral targeting (BT) leverages historical user behavior to select the ads most relevant to users to display. The state-of-the-art of BT derives a linear Poisson regression model from fine-grained user behavioral data and predicts click-through rate (CTR) from user history. We designed and implemented a highly scalable and efficient solution to BT using Hadoop MapReduce framework. With our parallel algorithm and the resulting system, we can build above 450 BT-category models from the entire Yahoo's user base within one day, the scale that one can not even imagine with prior systems. Moreover, our approach has yielded 20% CTR lift over the existing production system by leveraging the well-grounded probabilistic model fitted from a much larger training dataset.

Specifically, our major contributions include: (1) A MapReduce statistical learning algorithm and implementation that achieve optimal data parallelism, task parallelism, and load balance in spite of the typically skewed distribution of domain data. (2) An in-place feature vector generation algorithm with linear time complexity O(n) regardless of the granularity of sliding target window. (3) An in-memory caching scheme that significantly reduces the number of disk IOs to make large-scale learning practical. (4) Highly efficient data structures and sparse representations of models and data to enable fast model updates. We believe that our work makes significant contributions to solving large-scale machine learning problems of industrial relevance in general. Finally, we report comprehensive experimental results, using industrial proprietary codebase and datasets.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
S. Agarwal, P. Renaker, and A. Smith. Determining ad targeting information and/or ad creative information using past search queries. U.S. Patent 10/813,925, filed: Mar 31, 2004.
 
3
A. C. Cameron and P. K. Trivedi. Regression Analysis of Count Data. Cambridge University Press, 1998.
4
 
5
J. Canny, S. Zhong, S. Gaffney, C. Brower, P. Berkhin, and G. H. John. Granular data for behavioral targeting. U.S. Patent Application 20090006363.
 
6
E. Chang. Scalable collaborative filtering algorithms for mining social networks. In The NIPS 2008 Workshop on "Beyond Search: Computational Intelligence for the Web, 2008.
 
7
Y. Chen, D. Pavlov, P. Berkhin, and J. Canny. Large-scale behavioral targeting for advertising over a network. U.S. Patent Application 12/351,749, filed: Jan 09, 2009.
 
8
C. Y. Chung, J. M. Koran, L.-J. Lin, and H. Yin. Model for generating user profiles in a behavioral targeting system. U.S. Patent 11/394,374, filed: Mar 29, 2006.
9
10
 
11
D. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. Advances in Neural Information Processing Systems (NIPS), 13:556--562, 2000.
12

Collaborative Colleagues:
Ye Chen: colleagues
Dmitry Pavlov: colleagues
John F. Canny: colleagues