|
ABSTRACT
In this paper, the red-blue pebble game is proposed to model the input-output complexity of algorithms. Using the pebble game formulation, a number of lower bound results for the I/O requirement are proven. For example, it is shown that to perform the n-point FFT or the ordinary n×n matrix multiplication algorithm with O(S) memory, at least &Ohgr;(n log n/log S) or &Ohgr;(n3/@@@@S), respectively, time is needed for the I/O. Similar results are obtained for algorithms for several other problems. All of the lower bounds presented are the best possible in the sense that they are achievable by certain decomposition schemes. Results of this paper may provide insight into the difficult task of balancing I/O and computation in special-purpose system designs. For example, for the n-point FFT, the lower bound on I/O time implies that an S-point device achieving a speed-up ratio of order log S over the conventional O(n log n) time implementation is all one can hope for.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Cook, S.A. and Sethi, R. Storage Requirements for Deterministic Polynomial Time Recognizable Languages. J. Comp. and Sys. Sci. 13:25-37, 1976.
|
| |
2
|
Floyd, R.W. Permuting Information in Idealized Two-Level Storage. In Miller, R.E. and Thatcher, J.W. (editor), Complexity of Computer Computations, pages 105-109. Plenum Press, New York, 1972.
|
| |
3
|
Harary, F. Graph Theory. Addison-Wesley, Reading, Massachusetts, 1969.
|
 |
4
|
|
| |
5
|
Kernighan, B.W. and Lin, S. An Effective Heuristic Procedure for Partitioning Graphs. Bell Systems Technical Journal 49:291-308, February, 1970.
|
| |
6
|
|
| |
7
|
Kung, H.T. Special-Purpose Devices for Signal and Image Processing: An Opportunity in VLSI. Technical Report, Carnegie-Mellon University, Department of Computer Science, July, 1980. Presented at SPIE's 24th Annual Technical Symposium, San Diego, California, July 1980. The final version of the paper is to be published in the Symposium Proceedings.
|
| |
8
|
Pippenger, N. Pebbling. In Proc. the Fifth IBM Symposium on Mathematical Foundations of Computer Science. Academic & Scientific Programs, IBM Japan, May, 1980.
|
| |
9
|
Rosenberg, A. L. Private communication.
|
| |
10
|
Song, S.W. I/O Complexity, and Design of Special-Purpose Hardware for Sorting. VLSI Document V075, Carnegie-Mellon University, Department of Computer Science, February, 1981.
|
CITED BY 57
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A. Aggarwal , B. Alpern , A. Chandra , M. Snir, A model for hierarchical memory, Proceedings of the nineteenth annual ACM conference on Theory of computing, p.305-314, January 1987, New York, New York, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
T. Lam , P. Tiwari , M. Tompa, Tradeoffs between communication and space, Proceedings of the twenty-first annual ACM symposium on Theory of computing, p.217-226, May 14-17, 1989, Seattle, Washington, United States
|
|
|
Peter A. Buhr , Anil K. Goel , Naomi Nishimura , Prabhakar Ragde, &mgr;Database: parallelism in a memory-mapped environment (research summary), Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures, p.196-199, June 24-26, 1996, Padua, Italy
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Kayvon Fatahalian , Daniel Reiter Horn , Timothy J. Knight , Larkhoon Leem , Mike Houston , Ji Young Park , Mattan Erez , Manman Ren , Alex Aiken , William J. Dally , Pat Hanrahan, Memory---Sequoia: programming the memory hierarchy, Proceedings of the 2006 ACM/IEEE conference on Supercomputing, November 11-17, 2006, Tampa, Florida
|
|
|
|
|
|
|
|
|
Michael A. Bender , Gerth Stølting Brodal , Rolf Fagerberg , Riko Jacob , Elias Vicari, Optimal sparse matrix dense vector multiplication in the I/O-model, Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, June 09-11, 2007, San Diego, California, USA
|
|
|
Kamen Yotov , Tom Roeder , Keshav Pingali , John Gunnels , Fred Gustavson, An experimental comparison of cache-oblivious and cache-conscious programs, Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, June 09-11, 2007, San Diego, California, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jack Dongarra , Jean-François Pineau , Yves Robert , Frédéric Vivien, Matrix product on heterogeneous master-worker platforms, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, February 20-23, 2008, Salt Lake City, UT, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Grey Ballard , James Demmel , Olga Holtz , Oded Schwartz, Communication-optimal parallel and sequential Cholesky decomposition: extended abstract, Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures, August 11-13, 2009, Calgary, AB, Canada
|
|