Querying continuous functions in a database system
Source
International Conference on Management of Data
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Vancouver, Canada
SESSION: Research Session 17: Probabilistic II
Pages 791-804
Year of Publication: 2008
ISBN: 978-1-60558-102-6
Authors
Arvind Thiagarajan, MIT CSAIL, Cambridge, MA, USA
Samuel Madden, MIT CSAIL, Cambridge, MA, USA
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1376616.1376696

ABSTRACT

Many scientific, financial, data mining, and sensor network applications need to work with continuous rather than discrete data, e.g., temperature as a function of location, or stock prices and vehicle trajectories as functions of time. Querying raw or discrete data is unsatisfactory for these applications: in a sensor network, for example, it is necessary to interpolate sensor readings to predict values at locations where no sensors are deployed. In other situations, raw data can be inaccurate owing to measurement errors, and it is useful to fit continuous functions to the raw data and query the functions rather than the raw data itself, e.g., fitting a smooth curve to noisy sensor readings, or a smooth trajectory to GPS data containing gaps or outliers. Existing databases do not support storing or querying continuous functions, short of brute-force discretization of functions into a collection of tuples. We present FunctionDB, a novel database system that treats mathematical functions as first-class citizens that can be queried like traditional relations. The key contribution of FunctionDB is an efficient and accurate algebraic query processor: for the broad class of multi-variable polynomial functions, FunctionDB executes queries directly on the algebraic representation of functions, without materializing them into discrete points, using symbolic operations (zero finding, variable substitution, and integration). Even when closed-form solutions are intractable, FunctionDB leverages symbolic approximation operations to improve performance. We evaluate FunctionDB on real data sets from a temperature sensor network and on traffic traces from Boston roads. We show that operating in the functional domain has substantial advantages over existing approaches that represent models as discrete collections of points: accuracy improves by 15-30%, and performance improves by up to two orders of magnitude (10x-100x).
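
To make the three symbolic operations named in the abstract concrete, here is a minimal Python sketch. It is not FunctionDB's API or implementation: the piecewise-linear model, the `PIECES` data, and the helper names `evaluate`, `select_greater`, and `average` are all assumptions made for illustration. It answers a point lookup by variable substitution, a selection predicate by zero finding, and an average by integration, without ever sampling the function into discrete points.

```python
# Hypothetical model (not taken from the paper): temperature as a function of
# time, stored as pieces (lo, hi, (c0, c1)) meaning temp(t) = c0 + c1 * t
# for lo <= t <= hi.
PIECES = [
    (0.0, 10.0, (20.0, 0.5)),
    (10.0, 20.0, (30.0, -0.5)),
]


def evaluate(pieces, t):
    """Variable substitution: plug t into the piece whose domain contains it."""
    for lo, hi, (c0, c1) in pieces:
        if lo <= t <= hi:
            return c0 + c1 * t
    raise ValueError("t lies outside the domain of the function")


def select_greater(pieces, threshold):
    """Zero finding: return the intervals on which the function exceeds threshold."""
    result = []
    for lo, hi, (c0, c1) in pieces:
        if c1 == 0:
            # Constant piece: it qualifies entirely or not at all.
            if c0 > threshold:
                result.append((lo, hi))
            continue
        root = (threshold - c0) / c1  # solves c0 + c1 * t = threshold
        if c1 > 0:
            start, end = max(lo, root), hi  # increasing: above threshold after root
        else:
            start, end = lo, min(hi, root)  # decreasing: above threshold before root
        if start < end:
            result.append((start, end))
    return result


def average(pieces):
    """Integration: mean value = integral over the domain / length of the domain."""
    total = sum(
        c0 * (hi - lo) + c1 * (hi ** 2 - lo ** 2) / 2
        for lo, hi, (c0, c1) in pieces
    )
    length = sum(hi - lo for lo, hi, _ in pieces)
    return total / length


if __name__ == "__main__":
    print(evaluate(PIECES, 4.0))
    print(select_greater(PIECES, 23.0))
    print(average(PIECES))
```

A selection here returns continuous intervals rather than a set of sampled tuples, so the answer does not depend on a discretization step size. Higher-degree or multi-variable polynomials would need a general root finder and antiderivative in place of the closed-form linear case used in this sketch.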


