Seven pitfalls to avoid when running controlled experiments on the web
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Paris, France
SESSION: Industrial track papers
Pages 1105-1114
Year of Publication: 2009
ISBN: 978-1-60558-495-9
Authors
Thomas Crook  Microsoft, Redmond, WA, USA
Brian Frasca  Microsoft, Redmond, WA, USA
Ron Kohavi  Microsoft, Redmond, WA, USA
Roger Longbotham  Microsoft, Redmond, WA, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557019.1557139

ABSTRACT

Controlled experiments, also called randomized experiments and A/B tests, have had a profound influence on multiple fields, including medicine, agriculture, manufacturing, and advertising. While the theoretical aspects of offline controlled experiments have been well studied and documented, the practical aspects of running them in online settings, such as web sites and services, are still being developed. As the usage of controlled experiments grows in these online settings, it is becoming more important to understand the opportunities and pitfalls one might face when using them in practice. A survey of online controlled experiments and lessons learned were previously documented in Controlled Experiments on the Web: Survey and Practical Guide (Kohavi et al., 2009). In this follow-on paper, we focus on pitfalls we have seen after running numerous experiments at Microsoft. The pitfalls span a wide range of topics, such as assuming that the common statistical formulas used to calculate standard deviation and statistical power can be applied, and ignoring robots in analysis (a problem unique to online settings). Online experiments allow for techniques like gradual ramp-up of treatments to avoid the possibility of exposing many customers to a bad (e.g., buggy) Treatment. With that ability, we discovered that it's easy to incorrectly identify the winning Treatment because of Simpson's paradox.
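
The interaction between ramp-up and Simpson's paradox mentioned above can be illustrated with a minimal sketch. All counts below are hypothetical values chosen for illustration, and the two-day schedule (1% of users in Treatment on the first day, 50% on the second) is an assumed ramp-up plan rather than one described in this abstract. The sketch compares per-day conversion rates with the rates obtained by naively pooling both days.

```python
# Hypothetical (users, conversions) per day and variant; not data from the paper.
days = {
    "Friday (1% in Treatment)": {
        "control": (990_000, 20_000),
        "treatment": (10_000, 230),
    },
    "Saturday (50% in Treatment)": {
        "control": (500_000, 5_000),
        "treatment": (500_000, 6_000),
    },
}


def rate(users, conversions):
    """Conversion rate as a fraction of users."""
    return conversions / users


def pooled_rate(variant):
    """Conversion rate after summing users and conversions over all days."""
    users = sum(day[variant][0] for day in days.values())
    conversions = sum(day[variant][1] for day in days.values())
    return rate(users, conversions)


# Treatment has the higher rate on each individual day...
treatment_wins_each_day = all(
    rate(*day["treatment"]) > rate(*day["control"]) for day in days.values()
)
# ...but the lower rate once the days are pooled, because most Treatment
# traffic comes from the day on which conversion was low for everyone.
treatment_wins_pooled = pooled_rate("treatment") > pooled_rate("control")

for name, day in days.items():
    print(f"{name}: control={rate(*day['control']):.4f} "
          f"treatment={rate(*day['treatment']):.4f}")
print(f"Pooled: control={pooled_rate('control'):.4f} "
      f"treatment={pooled_rate('treatment'):.4f}")
print("Treatment wins each day:", treatment_wins_each_day)
print("Treatment wins pooled:", treatment_wins_pooled)
```

In this sketch the Treatment percentage changes between days while the baseline conversion rate also changes, so the pooled comparison weights the two variants differently across days. Comparing variants within periods that share the same split is one way to avoid the reversal.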


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1. Bacher, Paul, et al. 2005. Know your Enemy: Tracking Botnets. The Honeynet Project. [Online] March 13, 2005. http://www.honeynet.org/papers/bots/.
2. Bomhardt, Christian, Gaul, Wolfgang and Schmidt-Thieme, Lars. 2005. Web Robot Detection - Preprocessing Web Logfiles for Robot Detection. [book auth.] Maurizio Vichi, et al. New Developments in Classification and Data Analysis. s.l. : Springer, 2005.
3. Box, George E.P., Hunter, J Stuart and Hunter, William G. 2005. Statistics for Experimenters: Design, Innovation, and Discovery. 2nd. s.l. : John Wiley & Sons, Inc, 2005. 0471718130.
4.
5. Efron, Bradley and Robert J. Tibshirani. 1993. An Introduction to the Bootstrap. New York : Chapman & Hall, 1993. 0-412-04231-2.
6. Fieller, E C. 1940. The Biological Standardization of Insulin. Supplement to the Journal of the Royal Statistical Society. 1940, Vol. 7, 1, pp. 1-64.
7.
8. Hill, Nigel, Roche, Greg and Allen, Rachel. 2007. Customer Satisfaction: The Customer Experience Through the Customer's Eyes. s.l. : Cogent Publishing, 2007.
9. Hopkins, Claude. 1923. Scientific Advertising. New York City : Crown Publishers Inc., 1923.
10. Keppel, Geoffrey, Saufley, William H and Tokunaga, Howard. 1992. Introduction to Design and Analysis. 2nd. s.l. : W.H. Freeman and Company, 1992.
11.
12.
13.
14. Koselka, Rita. 1996. The New Mantra: MVT. Forbes. March 11, 1996, pp. 114-118.
15. Malinas, Gary and Bigelow, John. 2004. Simpson's Paradox. Stanford Encyclopedia of Philosophy. [Online] 2004. [Cited: February 28, 2008.] http://plato.stanford.edu/entries/paradox-simpson/.
16. Mason, Robert L, Gunst, Richard F and Hess, James L. 1989. Statistical Design and Analysis of Experiments With Applications to Engineering and Science. s.l. : John Wiley & Sons, 1989. 047185364X.
17.
18. Rao, C. Radhakrishna. 1973. Linear Statistical Inference and Its Applications. 2nd. s.l. : John Wiley & Sons, Inc., 1973.
19. Roy, Ranjit K. 2001. Design of Experiments using the Taguchi Approach : 16 Steps to Product and Process Improvement. s.l. : John Wiley & Sons, Inc, 2001. 0-471-36101-1.
20. Simpson, Edward H. 1951. The Interpretation of Interaction in Contingency Tables. Journal of the Royal Statistical Society, Ser. B. 1951, Vol. 13, pp. 238-241.
21. Spears, Steven J. 2004. Learning to Lead at Toyota. Harvard Business Review. May 2004, pp. 78-86.
22.
23. Wikipedia: Botnet. 2008. Botnet. Wikipedia. [Online] 2008. [Cited: February 28, 2008.] http://en.wikipedia.org/wiki/Botnet.
24. Wikipedia: Internet bot. 2008. Internet Bot. Wikipedia. [Online] 2008. [Cited: February 28, 2008.] http://en.wikipedia.org/wiki/Internet_bot.
25. Wikipedia: Simpson's Paradox. 2008. Simpson's paradox. Wikipedia. [Online] 2008. [Cited: February 28, 2008.] http://en.wikipedia.org/wiki/Simpson%27s_paradox.
