New cached-sufficient statistics algorithms for quickly answering statistical questions
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Philadelphia, PA, USA
Session: Conference invited talks
Pages: 2 - 2
Year of Publication: 2006
ISBN: 1-59593-339-5
Author
Andrew Moore, Google Pittsburgh / Carnegie Mellon University
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1150402.1150405

ABSTRACT

This talk is about recent work on new ways to exploit preprocessed views of data tables for tractably solving big statistical queries. We'll describe deployments of these new algorithms in the realms of detecting killer asteroids and unnatural disease outbreaks.

In recent years, several groups have looked at methods for pre-storing general sufficient statistics of the data in spatial data structures such as kd-trees and ball-trees so that both frequentist and Bayesian statistical operations become fast for large datasets. In this talk we will look at two other classes of optimization required in important statistical queries.

The first involves iterating over all spatial regions (big and small). The second involves detection of tracks from noisy intermittent observations separated far apart in time and space. We will also discuss the implications that have arisen from making these operations tractable. We will focus particularly on:

  • Detecting all asteroids in the solar system larger than Pittsburgh's Cathedral of Learning (data to be collected over 2006-2010).
  • Early detection of emerging diseases based on national monitoring of health-related transactions.
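As background for the abstract's mention of pre-storing sufficient statistics in kd-trees, the following is a minimal illustrative sketch, not the algorithms presented in the talk. It assumes the cached statistics are simply a point count and a coordinate-wise sum per node. The names `Node`, `build`, `range_stats`, and the `leaf_size` parameter are hypothetical. A query over an axis-aligned box reuses a node's cached values whenever the node's bounding box lies entirely inside the query, so whole subtrees are answered without visiting their points.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    lo: Point                              # lower corner of the node's bounding box
    hi: Point                              # upper corner of the node's bounding box
    count: int                             # cached statistic: number of points below this node
    total: Tuple[float, ...]               # cached statistic: coordinate-wise sum of those points
    points: Optional[List[Point]] = None   # raw points, stored at leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: List[Point], depth: int = 0, leaf_size: int = 4) -> Node:
    """Build a kd-tree over a non-empty point list, caching count and sum per node."""
    dim = len(points[0])
    lo = tuple(min(p[d] for p in points) for d in range(dim))
    hi = tuple(max(p[d] for p in points) for d in range(dim))
    total = tuple(sum(p[d] for p in points) for d in range(dim))
    node = Node(lo, hi, len(points), total)
    if len(points) <= leaf_size:
        node.points = list(points)
        return node
    axis = depth % dim                     # cycle through the axes (an assumed split rule)
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    node.left = build(ordered[:mid], depth + 1, leaf_size)
    node.right = build(ordered[mid:], depth + 1, leaf_size)
    return node


def range_stats(node: Node, qlo: Point, qhi: Point):
    """Return (count, coordinate-wise sum) of the points inside the box [qlo, qhi]."""
    dim = len(qlo)
    # Node box is disjoint from the query: contributes nothing.
    if any(node.hi[d] < qlo[d] or node.lo[d] > qhi[d] for d in range(dim)):
        return 0, (0.0,) * dim
    # Node box is fully inside the query: use the cached statistics directly.
    if all(qlo[d] <= node.lo[d] and node.hi[d] <= qhi[d] for d in range(dim)):
        return node.count, node.total
    # Partially overlapping leaf: fall back to checking its few points.
    if node.points is not None:
        inside = [p for p in node.points
                  if all(qlo[d] <= p[d] <= qhi[d] for d in range(dim))]
        return len(inside), tuple(sum(p[d] for p in inside) for d in range(dim))
    # Partially overlapping internal node: combine the children's answers.
    lc, ls = range_stats(node.left, qlo, qhi)
    rc, rs = range_stats(node.right, qlo, qhi)
    return lc + rc, tuple(a + b for a, b in zip(ls, rs))
```

For example, after `tree = build(pts)`, the call `range_stats(tree, (2.0, 3.0), (5.0, 7.0))` returns the count and coordinate sums for that box, from which a mean follows by division. Richer statistics (such as sums of squares) could be cached in the same way under the same assumptions.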