ACM Home Page
Please provide us with feedback. Feedback
Lower bounds for processing data with few random accesses to external memory
Full text PdfPdf (567 KB)
Source
Journal of the ACM (JACM) archive
Volume 56 ,  Issue 3  (May 2009) table of contents
Article No. 12  
Year of Publication: 2009
ISSN:0004-5411
Authors
Martin Grohe  Humboldt-Universität, Berlin, Germany
André Hernich  Goethe-Universität, Frankfurt am Main, Germany
Nicole Schweikardt  Goethe-Universität, Frankfurt am Main, Germany
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 22,   Downloads (12 Months): 211,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1516512.1516514
What is a DOI?

ABSTRACT

We consider a scenario where we want to query a large dataset that is stored in external memory and does not fit into main memory. The most constrained resources in such a situation are the size of the main memory and the number of random accesses to external memory. We note that sequentially streaming data from external memory through main memory is much less prohibitive.

We propose an abstract model of this scenario in which we restrict the size of the main memory and the number of random accesses to external memory, but admit arbitrary sequential access. A distinguishing feature of our model is that it allows the usage of unlimited external memory for storing intermediate results, such as several hard disks that can be accessed in parallel.

In this model, we prove lower bounds for the problem of sorting a sequence of strings (or numbers), the problem of deciding whether two given sets of strings are equal, and two closely related decision problems. Intuitively, our results say that there is no algorithm for the problems that uses internal memory space bounded by N1−ϵ and at most o(log N) random accesses to external memory, but unlimited “streaming access”, both for writing to and reading from external memory. (Here, N denotes the size of the input and ϵ is an arbitrary constant greater than 0.) We even permit randomized algorithms with one-sided bounded error. We also consider the problem of evaluating database queries and prove similar lower bounds for evaluating relational algebra queries against relational databases and XQuery and XPath queries against XML-databases.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
4
5
6
 
7
8
 
9
 
10
Erdös, P., and Szekeres, G. 1935. A combinatorial problem in geometry. Compos. Math. 2, 463--470.
11
 
12
Grohe, M., Koch, C., and Schweikardt, N. 2005. The complexity of querying external memory and streaming data. In Proceedings of 15th International Symposium on Fundamentals of Computation Theory (FCT 2005). Lecture Notes in Computer Science, vol. 3623. Springer-Verlag, Berlin, Germany, 1--16.
 
13
14
 
15
 
16
 
17
 
18
 
19
Munro, J. I., and Paterson, M. S. 1980. Selection and sorting with limited storage. Theoret. Comput. Sci. 12, 315--323.
 
20
 
21
Papadimitriou, C. H. 1994. Computational Complexity. Addison-Wesley, Reading, MA.
22
23
 
24
Wagner, K., and Wechsung, G. 1986. Computational Complexity. VEB Deutscher Verlag der Wissenschaften, Berlin, Germany.
 
25
World Wide Web Consortium. 2005. XML Path Language (XPath) 2.0. W3C Candidate Recommendation 3 November 2005. http://www.w3.org/TR/xpath20/.

Collaborative Colleagues:
Martin Grohe: colleagues
André Hernich: colleagues
Nicole Schweikardt: colleagues