Extreme data mining
Source
International Conference on Management of Data
Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data
Vancouver, Canada
Pages 1-2  
Year of Publication: 2008
ISBN: 978-1-60558-102-6
Author
Sridhar Ramaswamy, Google Inc., Mountain View, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1376616.1376617

ABSTRACT

At Google, the quality and speed of statistical data mining algorithms directly affect the usefulness of our search results and the relevance of our targeted advertising. One of the things that makes planet-wide, high-throughput, 24/7 data mining so interesting is that all parts of the software stack are involved. This talk will walk up the stack, from the physical machines in warehouse-sized data centers, through networking and secondary storage abstractions, to the distributed numerical methods and high-throughput training and serving algorithms needed to support online log processing and machine learning. We will also discuss the significant infrastructure and algorithmic impacts of batch versus online training: both data mining modes have essential roles at Google.