Time and space optimization for processing groups of multi-dimensional scientific queries
Source
International Conference on Supercomputing archive
Proceedings of the 18th annual international conference on Supercomputing table of contents
Saint-Malo, France
SESSION: Distributed computing table of contents
Pages: 95 - 105  
Year of Publication: 2004
ISBN:1-58113-839-3
Authors
Suresh Aryangat  University of Maryland, College Park, MD
Henrique Andrade  University of Maryland, College Park, MD
Alan Sussman  University of Maryland, College Park, MD
Sponsors
SIGARCH: ACM Special Interest Group on Computer Architecture
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 31,   Citation Count: 3
DOI: http://doi.acm.org/10.1145/1006209.1006224

ABSTRACT

Data analysis applications in areas as diverse as remote sensing and telepathology require operating on and processing very large datasets. For such applications to execute efficiently, careful attention must be paid to the storage, retrieval, and manipulation of the datasets. This paper addresses the optimizations performed by a high performance database system that processes groups of data analysis requests for these applications, which we call queries. The system performs end-to-end processing of the requests, formulated as PostgreSQL declarative queries. The queries are converted into imperative descriptions, multiple imperative descriptions are merged into a single execution plan, the plan is optimized to decrease execution time via common compiler optimization techniques, and, finally, the plan is optimized to decrease memory consumption. The last two steps are experimentally shown to effectively reduce the amount of time required while conserving memory space as a group of queries is processed by the database.
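
The merging and optimization steps described in the abstract can be illustrated with a small sketch. The abstract does not name the specific compiler techniques used, so the example below assumes two common ones: loop fusion (one pass over the data for the whole group of queries) and common subexpression elimination (a per-element value shared by several queries is computed once). The Query class, the TRANSFORMS table, and both functions are hypothetical names introduced for illustration only; they are not the paper's system or API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Query:
    """Hypothetical imperative description: sum transform(data[i]) for lo <= i < hi."""
    name: str
    lo: int          # inclusive start index
    hi: int          # exclusive end index
    transform: str   # key into TRANSFORMS


# Hypothetical per-element operations that queries may share.
TRANSFORMS: Dict[str, Callable[[int], int]] = {
    "square": lambda x: x * x,
    "double": lambda x: 2 * x,
}


def run_separately(data: List[int], queries: List[Query]) -> Tuple[Dict[str, int], int]:
    """Baseline: one loop per query; returns results and the number of transform evaluations."""
    results: Dict[str, int] = {}
    evals = 0
    for q in queries:
        f = TRANSFORMS[q.transform]
        total = 0
        for i in range(q.lo, q.hi):
            total += f(data[i])
            evals += 1
        results[q.name] = total
    return results, evals


def run_merged(data: List[int], queries: List[Query]) -> Tuple[Dict[str, int], int]:
    """Merged plan: a single fused loop, with shared per-element values computed once."""
    results = {q.name: 0 for q in queries}
    evals = 0
    if not queries:
        return results, evals
    lo = min(q.lo for q in queries)
    hi = max(q.hi for q in queries)
    for i in range(lo, hi):
        # Per-element cache; it is discarded after each iteration, so the
        # extra memory is bounded by the number of distinct transforms.
        cache: Dict[str, int] = {}
        for q in queries:
            if q.lo <= i < q.hi:
                if q.transform not in cache:
                    cache[q.transform] = TRANSFORMS[q.transform](data[i])
                    evals += 1
                results[q.name] += cache[q.transform]
    return results, evals


if __name__ == "__main__":
    data = list(range(10))
    batch = [
        Query("a", 0, 6, "square"),
        Query("b", 3, 10, "square"),
        Query("c", 2, 5, "double"),
    ]
    print(run_separately(data, batch))
    print(run_merged(data, batch))
```

In this toy batch, queries "a" and "b" overlap on indices 3 to 5 and apply the same transform, so the merged plan evaluates the transform fewer times than the per-query baseline while producing the same results.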


REFERENCES

[1] H. Andrade, S. Aryangat, T. Kurc, J. Saltz, and A. Sussman. Efficient execution of multi-query data analysis batches using compiler optimization strategies. In Proceedings of the 16th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2003), College Station, TX, October 2003.
[4] S. Aryangat. Optimizing the execution of data analysis queries. Master's thesis, Department of Computer Science, University of Maryland, December 2003.
[15] High Performance Fortran Forum. High Performance Fortran -- language specification -- version 2.0. Technical report, Rice University, January 1997. Available at http://www.netlib.org/hpf.
[16] S. Kalluri, Z. Zhang, J. JáJá, D. Bader, N. E. Saleous, E. Vermote, and J. R. G. Townshend. A hierarchical data archiving and processing system to generate custom tailored products from AVHRR data. In 1999 IEEE International Geoscience and Remote Sensing Symposium, pages 2374--2376, 1999.
[21] National Oceanic and Atmospheric Administration. NOAA Polar Orbiter User's Guide -- November 1998 Revision. Compiled and edited by Katherine B. Kidwell. Available at http://www2.ncdc.noaa.gov/docs/podug/cover.htm.
[22] PostgreSQL 7.3.2 Developer's Guide. http://www.postgresql.org.
[25] M. Stonebraker. The SEQUOIA 2000 project. Data Engineering, 16(1):24--28, 1993.

