|
ABSTRACT
Data sets in large applications are often too massive to fit completely inside the computers internal memory. The resulting input/output communication (or I/O) between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. In this article we survey the state of the art in the design and analysis of external memory (or EM) algorithms and data structures, where the goal is to exploit locality in order to reduce the I/O costs. We consider a variety of EM paradigms for solving batched and online problems efficiently in external memory. For the batched problem of sorting and related problems such as permuting and fast Fourier transform, the key paradigms include distribution and merging. The paradigm of disk striping offers an elegant way to use multiple disks in parallel. For sorting, however, disk striping can be nonoptimal with respect to I/O, so to gain further improvements we discuss distribution and merging techniques for using the disks independently. We also consider useful techniques for batched EM problems involving matrices (such as matrix multiplication and transposition), geometric data (such as finding intersections and constructing convex hulls), and graphs (such as list ranking, connected components, topological sorting, and shortest paths). In the online domain, canonical EM applications include dictionary lookup and range searching. The two important classes of indexed data structures are based upon extendible hashing and B-trees. The paradigms of filtering and bootstrapping provide a convenient means in online data structures to make effective use of the data accessed from disk. We also reexamine some of the above EM problems in slightly different settings, such as when the data items are moving, when the data items are variable-length (e.g., text strings), or when the allocated amount of internal memory can change dynamically. Programming tools and environments are available for simplifying the EM programming task. During the course of the survey, we report on some experiments in the domain of spatial databases using the TPIE system (transparent parallel I/O programming environment). The newly developed EM algorithms and data structures that incorporate the paradigms we discuss are significantly faster than methods currently used in practice.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
ABEL, D. J. 1984. A B+-tree structure for large quadtrees. Computer Vision, Graphics, and Image Processing 27, 1 (July), 19-31.
|
| |
2
|
|
 |
3
|
|
| |
4
|
|
| |
5
|
AGARWAL,P.K.AND ERICKSON, J. 1999. Geometric range searching and its relatives. In B. Chazelle, J. E. Goodman, and R. Pollack, Eds, Advances in Discrete and Computational Geometry, Vol. 23 of Contemporary Mathematics, American Mathematical Society, Providence, RI, 1-56.
|
| |
6
|
|
 |
7
|
Pankaj K. Agarwal , Lars Arge , Jeff Erickson, Indexing moving points (extended abstract), Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.175-186, May 15-18, 2000, Dallas, Texas, United States
[doi> 10.1145/335168.335220]
|
| |
8
|
Pankaj K. Agarwal , Lars Arge , Gerth Stølting Brodal , Jeffrey S. Vitter, I/O-efficient dynamic point location in monotone planar subdivisions, Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, p.11-20, January 17-19, 1999, Baltimore, Maryland, United States
|
 |
9
|
Pankaj K. Agarwal , Lars Arge , Jeff Erickson , Paolo G. Franciosa , Jeffry Scott Vitter, Efficient searching with linear constraints, Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.169-178, June 01-04, 1998, Seattle, Washington, United States
[doi> 10.1145/275487.275506]
|
| |
10
|
Pankaj K. Agarwal , Lars Arge , T. M. Murali , Kasturi R. Varadarajan , Jeffrey Scott Vitter, I/O-efficient algorithms for contour-line extraction and planar graph blocking, Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms, p.117-126, January 25-27, 1998, San Francisco, California, United States
|
| |
11
|
Pankaj K. Agarwal , Lars Arge , Octavian Procopiuc , Jeffrey Scott Vitter, A Framework for Index Bulk Loading and Dynamization, Proceedings of the 28th International Colloquium on Automata, Languages and Programming,, p.115-127, July 08-12, 2001
|
 |
12
|
Pankaj K. Agarwal , Mark de Berg , Joachim Gudmundsson , Mikael Hammar , Herman J. Haverkort, Box-trees and R-trees with near-optimal query time, Proceedings of the seventeenth annual symposium on Computational geometry, p.124-133, June 2001, Medford, Massachusetts, United States
[doi> 10.1145/378583.378645]
|
| |
13
|
|
 |
14
|
|
 |
15
|
A. Aggarwal , B. Alpern , A. Chandra , M. Snir, A model for hierarchical memory, Proceedings of the nineteenth annual ACM conference on Theory of computing, p.305-314, January 1987, New York, New York, United States
[doi> 10.1145/28395.28428]
|
| |
16
|
AGGARWAL, A., CHANDRA, A., AND SNIR, M. 1987b. Hierarchical memory with block transfer. In Proceedings of the IEEE Symposium on Foundations of Computer Science, Vol. 28 (Los Angeles), 204- 216.
|
| |
17
|
|
| |
18
|
ALPERN, B., CARTER, L., FEIG, E., AND SELKER, T. 1994. The uniform memory hierarchy model of computation. Algorithmica 12, 2-3, 72-109.
|
| |
19
|
|
| |
20
|
|
| |
21
|
|
| |
22
|
|
 |
23
|
|
| |
24
|
|
| |
25
|
|
 |
26
|
Lars Arge , Paolo Ferragina , Roberto Grossi , Jeffrey Scott Vitter, On sorting strings in external memory (extended abstract), Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, p.540-548, May 04-06, 1997, El Paso, Texas, United States
[doi> 10.1145/258533.258647]
|
| |
27
|
|
| |
28
|
|
| |
29
|
Lars Arge , Octavian Procopiuc , Sridhar Ramaswamy , Torsten Suel , Jan Vahrenhold , Jeffrey Scott Vitter, A Unified Approach for Indexed and Non-Indexed Spatial Joins, Proceedings of the 7th International Conference on Extending Database Technology: Advances in Database Technology, p.413-429, March 27-31, 2000
|
| |
30
|
|
| |
31
|
Lars Arge , Octavian Procopiuc , Sridhar Ramaswamy , Torsten Suel , Jeffrey Scott Vitter, Theory and practice of I/O-efficient algorithms for multidimensional batched searching problems, Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms, p.685-694, January 25-27, 1998, San Francisco, California, United States
|
 |
32
|
Lars Arge , Vasilis Samoladas , Jeffrey Scott Vitter, On two-dimensional indexability and optimal range search indexing, Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.346-357, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
[doi> 10.1145/303976.304010]
|
| |
33
|
|
 |
34
|
|
| |
35
|
|
| |
36
|
|
| |
37
|
|
| |
38
|
|
| |
39
|
|
 |
40
|
|
| |
41
|
|
| |
42
|
|
| |
43
|
|
 |
44
|
Rakesh Barve , Elizabeth Shriver , Phillip B. Gibbons , Bruce K. Hillyer , Yossi Matias , Jeffrey Scott Vitter, Modeling and optimizing I/O throughput of multiple disks on a bus, Proceedings of the 1999 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, p.83-92, May 01-04, 1999, Atlanta, Georgia, United States
|
| |
45
|
|
| |
46
|
BAYER,R.AND MCCREIGHT, E. 1972. Organization of large ordered indexes. Acta Informatica 1, 173- 189.
|
 |
47
|
|
| |
48
|
|
 |
49
|
Norbert Beckmann , Hans-Peter Kriegel , Ralf Schneider , Bernhard Seeger, The R*-tree: an efficient and robust access method for points and rectangles, Proceedings of the 1990 ACM SIGMOD international conference on Management of data, p.322-331, May 23-26, 1990, Atlantic City, New Jersey, United States
|
| |
50
|
|
 |
51
|
|
| |
52
|
BENTLEY,J.L.AND SAXE, J. B. 1980. Decomposable searching problems I: Static-to-dynamic transformations. Journal of Algorithms 1, 4 (Dec.), 301-358.
|
| |
53
|
|
| |
54
|
BLUM,N.AND MEHLHORN, K. 1980. On the average number of rebalancing operations in weightbalanced trees. Theoretical Computer Science 11, 3 (July), 303-320.
|
 |
55
|
|
| |
56
|
|
| |
57
|
|
| |
58
|
Adam L. Buchsbaum , Michael Goldwasser , Suresh Venkatasubramanian , Jeffery R. Westbrook, On external memory graph traversal, Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, p.859-860, January 09-11, 2000, San Francisco, California, United States
|
| |
59
|
|
| |
60
|
|
| |
61
|
|
 |
62
|
|
| |
63
|
CHAZELLE,B.AND EDELSBRUNNER, H. 1987. Linear space data structures for two types of range search. Discrete and Computational Geometry 2, 113-126.
|
 |
64
|
Peter M. Chen , Edward K. Lee , Garth A. Gibson , Randy H. Katz , David A. Patterson, RAID: high-performance, reliable secondary storage, ACM Computing Surveys (CSUR), v.26 n.2, p.145-185, June 1994
[doi> 10.1145/176979.176981]
|
| |
65
|
|
| |
66
|
Yi-Jen Chiang , Michael T. Goodrich , Edward F. Grove , Roberto Tamassia , Darren Erik Vengroff , Jeffrey Scott Vitter, External-memory graph algorithms, Proceedings of the sixth annual ACM-SIAM symposium on Discrete algorithms, p.139-149, January 22-24, 1995, San Francisco, California, United States
|
| |
67
|
|
| |
68
|
|
| |
69
|
|
| |
70
|
|
 |
71
|
|
| |
72
|
CORBETT, P., FEITELSON, D., FINEBERG, S., HSU,Y., NITZBERG, B., PROST, J.-P., SNIR, M., TRAVERSAT, B., AND WONG, P. 1996. Overview of the MPI- IO parallel I/O interface. In R., Jain, J., Werth, and J. C. Browne, Eds., Input/Output in Parallel and Distributed Computer Systems, Vol. 362 of The Kluwer International Series in Engineering and Computer Science, Kluwer Academic, Chapter 5, 127-146.
|
 |
73
|
|
| |
74
|
|
| |
75
|
|
| |
76
|
|
| |
77
|
|
| |
78
|
|
 |
79
|
A. Crauser , P. Ferragina , K. Mehlhorn , U. Meyer , E. Ramos, Randomized external-memory algorithms for some geometric problems, Proceedings of the fourteenth annual symposium on Computational geometry, p.259-268, June 07-10, 1998, Minneapolis, Minnesota, United States
[doi> 10.1145/276884.276914]
|
| |
80
|
A. Crauser , P. Ferragina , K. Mehlhorn , U. Meyer , E. A. Ramos, I/O-optimal computation of segment intersections, External memory algorithms, American Mathematical Society, Boston, MA, 1999
|
| |
81
|
|
| |
82
|
|
| |
83
|
|
 |
84
|
Frank Dehne , Wolfgang Dittrich , David Hutchinson, Efficient external memory algorithms by simulating coarse-grained parallel algorithms, Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, p.106-115, June 23-25, 1997, Newport, Rhode Island, United States
[doi> 10.1145/258492.258503]
|
| |
85
|
|
| |
86
|
DEMUTH, H. B. 1956. Electronic data sorting. Ph.D., Stanford University. A shortened version appears in IEEE Transactions on Computing C-34, 4, (April), 296-310, 1985, Special Issue on Sorting, E. E. Lindstrom, C. K. Wong, and J. S. Vitter, Eds.
|
| |
87
|
DENNING, P. J. 1980. Working sets past and present. IEEE Transactions on Software Engineering SE-6, 64-84.
|
| |
88
|
|
 |
89
|
|
| |
90
|
|
| |
91
|
|
| |
92
|
EDELSBRUNNER, H. 1983a. A new approach to rectangle intersections, part I. International Journal of Computer Mathematics 13, 209-219.
|
| |
93
|
EDELSBRUNNER, H. 1983b. A new approach to rectangle intersections, part II. International Journal of Computer Mathematics 13, 221-229.
|
 |
94
|
|
 |
95
|
|
| |
96
|
|
 |
97
|
|
| |
98
|
|
| |
99
|
|
| |
100
|
FELLER, W. 1968. An Introduction to Probability Theory and its Applications, Vol. 1, 3rd edition. John Wiley, New York.
|
| |
101
|
|
 |
102
|
|
| |
103
|
|
| |
104
|
FLAJOLET, P. 1983. On the performance evaluation of extendible hashing and trie searching. Acta Informatica 20, 4, 345-369.
|
| |
105
|
FLOYD, R. W. 1972. Permuting information in idealized two-level storage. In R. Miller and J. Thatcher, Eds., Complexity of Computer Computations. Plenum, New York, 105-109.
|
| |
106
|
|
| |
107
|
|
 |
108
|
|
 |
109
|
|
| |
110
|
GANGER, G. R. 1995. Generating representative synthetic workloads: An unsolved problem. In Proceedings of the Computer Measurement Group Conference (Dec.), Vol. 21, 1263-1269.
|
| |
111
|
GARDNER, M. 1977. Magic Show. Knopf, New York, Chapter 7.
|
 |
112
|
|
 |
113
|
|
| |
114
|
|
| |
115
|
|
| |
116
|
GOODRICH,M.T.,TSAY, J.-J., VENGROFF,D.E.,AND VITTER, J. S. 1993. External-memory computational geometry. In Proceedings of the IEEE Symposium on Foundations of Computer Science Palo Alto (Nov.), Vol. 34, 714-723.
|
| |
117
|
|
| |
118
|
|
 |
119
|
John Linwood Griffin , Steven W. Schlosser , Gregory R. Ganger , David F. Nagle, Modeling and performance of MEMS-based storage devices, Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, p.56-65, June 18-21, 2000, Santa Clara, California, United States
|
| |
120
|
|
| |
121
|
|
| |
122
|
GUPTA,S.K.S.,LI, Z., AND REIF, J. H. 1995. Generating efficient programs for two-level memories from tensor-products. In Proceedings of the IASTED/ISMM International Conference on Parallel and Distributed Computing and Systems (Washington, DC, Oct.), Vol. 7, 510- 513.
|
| |
123
|
|
 |
124
|
|
 |
125
|
Joseph M. Hellerstein , Elias Koutsoupias , Christos H. Papadimitriou, On the analysis of indexing schemes, Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.249-256, May 11-15, 1997, Tucson, Arizona, United States
[doi> 10.1145/263661.263688]
|
| |
126
|
HELLERSTEIN, L., GIBSON, G., KARP, R. M., KATZ,R.H., AND PATTERSON, D. A. 1994. Coding techniques for handling failures in large disk arrays. Algorithmica 12, 2-3, 182-208.
|
| |
127
|
|
| |
128
|
HINRICHS, K. H. 1985. The grid file system: Implementation and case studies of applications. PhD Thesis, Dept. Information Science, ETH, Zurich.
|
 |
129
|
|
| |
130
|
HUTCHINSON, D., MAHESHWARI, A., SACK, J.-R., AND VELICESCU, R. 1997. Early experiences in implementing the buffer tree. In Proceedings of the Workshop on Algorithm Engineering, Vol. 1.
|
| |
131
|
HUTCHINSON, D., MAHESHWARI, A., AND ZEH, N. 1999. An external memory data structure for shortest path queries. In Proceedings of the International Conference on Computing and Combinatorics (July), Vol. 1627 of Lecture Notes in Computer Science, Springer-Verlag, 51-60.
|
| |
132
|
|
 |
133
|
|
| |
134
|
HUTCHINSON, D. A., SANDERS,P.,AND VITTER,J.S. 2001c. Notes.
|
 |
135
|
Piotr Indyk , Rajeev Motwani , Prabhakar Raghavan , Santosh Vempala, Locality-preserving hashing in multidimensional spaces, Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, p.618-625, May 04-06, 1997, El Paso, Texas, United States
[doi> 10.1145/258533.258656]
|
| |
136
|
|
 |
137
|
|
 |
138
|
|
 |
139
|
|
| |
140
|
|
| |
141
|
KAMEL, I., KHALIL, M., AND KOURAMAJIAN, V. 1996. Bulk insertion in dynamic R-trees. In Proceedings of the International Symposium on Spatial Data Handling, Vol. 4, 3B, 31-42.
|
 |
142
|
Paris C. Kanellakis , Gabriel M. Kuper , Peter Z. Revesz, Constraint query languages (preliminary report), Proceedings of the ninth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.299-313, April 02-04, 1990, Nashville, Tennessee, United States
[doi> 10.1145/298514.298582]
|
| |
143
|
|
| |
144
|
|
| |
145
|
|
| |
146
|
|
| |
147
|
|
| |
148
|
KNUTH, D. E. 1999. MMIXware. Springer, Berlin.
|
| |
149
|
KNUTH, D. E., MORRIS,J.H.,AND PRATT, V. R. 1977. Fast pattern matching in strings. SIAM Journal on Computing 6, 323-350.
|
 |
150
|
George Kollios , Dimitrios Gunopulos , Vassilis J. Tsotras, On indexing mobile objects, Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.261-272, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
[doi> 10.1145/303976.304002]
|
 |
151
|
|
| |
152
|
KRISHNAMURTHY,R.AND WANG, K.-Y. 1985. Multilevel grid files. Technical Report, IBM T. J. Watson Center, Yorktown Heights, NY, November.
|
| |
153
|
|
| |
154
|
KUSPERT, K. 1983. Storage utilization in B*-trees with a generalized overflow technique. Acta Informatica, 19, 35-55.
|
 |
155
|
|
| |
156
|
LAURINI,R.AND THOMPSON, D. 1992. Fundamentals of Spatial Information Systems. Academic Press.
|
 |
157
|
|
| |
158
|
|
| |
159
|
LEISERSON, C. E., RAO,S.,AND TOLEDO, S. 1993. Efficient out-of-core algorithms for linear relaxation using blocking covers. In Proceedings of the IEEE Symposium on Foundations of Computer Science, Vol. 34, 704-713.
|
| |
160
|
LI, Z., MILLS,P.H.,AND REIF, J. H. 1996. Models and resource metrics for parallel and distributed computation. Parallel Algorithms and Applications 8, 35-59.
|
| |
161
|
LITWIN, W. 1980. Linear hashing: A new tool for files and tables addressing. In Proceedings of the International Conference on Very Large Databases (Montreal, Oct.), Vol. 6, 212-223.
|
| |
162
|
LITWIN,W.AND LOMET, D. 1987. A new method for fast data searches with keys. IEEE Software 4, 2 (March), 16-24.
|
 |
163
|
|
 |
164
|
|
| |
165
|
|
| |
166
|
MAHESHWARI,A.AND ZEH, N. External memory algorithms for outerplanar graphs. In Proceedings of the International Conference on Computing and Combinatorics (July), Vol. 1627 of Lecture Notes in Computer Science, Springer-Verlag, 51-60.
|
| |
167
|
|
| |
168
|
|
| |
169
|
MANBER,U.AND WU, S. 1994. GLIMPSE: A tool to search through entire file systems. In USENIX Association, Ed., Proceedings of the Winter USENIX Conference (San Francisco, Jan.), 23- 32.
|
| |
170
|
|
| |
171
|
Yossi Matias , Eran Segal , Jeffrey Scott Vitter, Efficient bundle sorting, Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, p.839-848, January 09-11, 2000, San Francisco, California, United States
|
 |
172
|
|
| |
173
|
MCCREIGHT, E. M. 1985. Priority search trees. SIAM Journal on Computing 14, 2 (May), 257- 276.
|
| |
174
|
MENDELSON, H. 1982. Analysis of extendible hashing. IEEE Transactions on Software Engineering SE-8 (Nov.), 611-619.
|
| |
175
|
|
| |
176
|
Microsoft 1998. TerraServer online database of satellite images, available on the World-Wide Web at http://terraserver.microsoft.com/.
|
| |
177
|
|
 |
178
|
|
| |
179
|
|
| |
180
|
MULLIN, J. K. 1985. Spiral storage: Efficient dynamic hashing with constant performance. The Computer Journal 28, 3 (July), 330-334.
|
| |
181
|
|
| |
182
|
NASA 1999. Earth Observing System (EOS) web page, NASA Goddard Space Flight Center, http://eospso.gsfc.nasa.gov/.
|
| |
183
|
NIEVERGELT,J.AND REINGOLD, E. M. 1973. Binary search tree of bounded balance. SIAM Journal on Computing 2,1.
|
| |
184
|
|
 |
185
|
|
 |
186
|
|
 |
187
|
|
| |
188
|
NODINE, M. H., GOODRICH,M.T.,AND VITTER,J.S. 1996. Blocking for external graph searching. Algorithmica 16, 2 (August), 181-214.
|
| |
189
|
|
| |
190
|
|
 |
191
|
|
 |
192
|
|
| |
193
|
|
| |
194
|
|
 |
195
|
Hwee Hwa Pang , Michael J. Carey , Miron Livny, Partially preemptible hash joins, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.59-68, May 25-28, 1993, Washington, D.C., United States
|
| |
196
|
|
 |
197
|
|
| |
198
|
|
| |
199
|
RAHMAN,N.,AND RAMAN, R. 2000. Adapting radix sort to the memory hierarchy. In Workshop on Algorithm Engineering and Experimentation (Jan.), Vol. 1982 of Lecture Notes in Computer Science. Springer-Verlag.
|
 |
200
|
|
| |
201
|
|
 |
202
|
|
| |
203
|
|
 |
204
|
|
 |
205
|
|
| |
206
|
|
| |
207
|
|
 |
208
|
Simonas Šaltenis , Christian S. Jensen , Scott T. Leutenegger , Mario A. Lopez, Indexing the positions of continuously moving objects, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.331-342, May 15-18, 2000, Dallas, Texas, United States
|
 |
209
|
|
| |
210
|
|
| |
211
|
|
 |
212
|
|
| |
213
|
|
| |
214
|
SANDERS, P. 2000. Personal communication.
|
| |
215
|
|
| |
216
|
Peter Sanders , Sebastian Egner , Jan Korst, Fast concurrent access to parallel disks, Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, p.849-858, January 09-11, 2000, San Francisco, California, United States
|
| |
217
|
|
| |
218
|
SAVAGE,J.E.AND Vitter, J. S. 1987. Parallelism in space-time tradeoffs. In F. P. Preparata, Ed., Advances in Computing Research, Vol. 4, JAI Press, Grreenwich, CT, 117-146.
|
 |
219
|
Steven W. Schlosser , John Linwood Griffin , David F. Nagle , Gregory R. Ganger, Designing computer systems with MEMS-based storage, Proceedings of the ninth international conference on Architectural support for programming languages and operating systems, p.1-12, November 2000, Cambridge, Massachusetts, United States
|
| |
220
|
|
| |
221
|
|
| |
222
|
SELTZER, M., SMITH, K. A., BALAKRISHNAN, H., CHANG, J., MCMAINS,S.,AND PADMANABHAN, V. 1995. File system logging versus clustering: A performance comparison. In Proceedings of the Annual USENIX Technical Conference (New Orleans), 249-264.
|
| |
223
|
|
 |
224
|
Elizabeth Shriver , Arif Merchant , John Wilkes, An analytic behavior model for disk drives with readahead caches and request reordering, Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems, p.182-191, June 22-26, 1998, Madison, Wisconsin, United States
|
| |
225
|
SHRIVER,E.A.M.AND NODINE, M. H. 1996. An introduction to parallel I/O models and algorithms. In R. Jain, J. Werth, and J. C. Browne, Eds. Input/Output in Parallel and Distributed Computer Systems, Kluwer Academic, Chapter 2, 31-68.
|
| |
226
|
|
| |
227
|
SIBEYN, J. F. 1997. From parallel to external list ranking. Technical Report MPI-I-97-1-021, Max-Planck-Institut, September.
|
| |
228
|
SIBEYN, J. F. 1999. External selection. In Proceedings of the Symposium on Theoretical Aspects of Computer Science, Vol. 1563 of Lecture Notes in Computer Science, Springer-Verlag, 291-301.
|
| |
229
|
|
| |
230
|
|
 |
231
|
|
| |
232
|
|
| |
233
|
TAMASSIA,R.AND VITTER, J. S. 1996. Optimal cooperative search in fractional cascaded data structures. Algorithmica 15, 2 (Feb.), 154-171.
|
| |
234
|
|
| |
235
|
TIGER/Line (tm). 1992. Technical documentation. Technical Report, USBureau of the Census.
|
| |
236
|
|
| |
237
|
TPIE. 1999. User manual and reference, The manual and software distribution are available on the web at http://www.cs.duke.edu/TPIE/.
|
| |
238
|
ULLMAN,J.D.AND YANNAKAKIS, M. 1991. The input/output complexity of transitive closure. Annals of Mathematics and Artificial Intelligence 3, 331-360.
|
| |
239
|
|
| |
240
|
VAN KREVELD, M., NIEVERGELT, J., ROOS, T., AND WINDMAYER, P. Eds. 1997. Algorithmic Foundations of GIS, Vol. 1340 of Lecture Notes in Computer Science, Springer-Verlag.
|
| |
241
|
|
 |
242
|
|
| |
243
|
VENGROFF,D.E.AND VITTER, J. S. 1996b. I/O- efficient scientific computation using TPIE. In Proceedings of NASA Goddard Conference on Mass Storage Systems (Sept.), Vol. 5, II, 553- 570.
|
| |
244
|
VETTIGER, P., DESPONT, M., DRECHSLER, U., DURIG,U., HABERLE, W., LUTWYCHE, M. I., ROTHUIZEN, E., STUTZ, R., WIDMER, R., AND BINNIG, G. K. 2000. The "Millipede"-More than one thousand tips for future AFM data storage. IBM Journal of Research and Development 44, 3, 323-340.
|
| |
245
|
|
| |
246
|
|
| |
247
|
|
| |
248
|
|
| |
249
|
|
| |
250
|
|
| |
251
|
|
| |
252
|
|
| |
253
|
VITTER,J.S.AND SHRIVER, E. A. M. 1994a. Algorithms for parallel memory I: Two-level memories. Algorithmica 12, 2-3, 110-147.
|
| |
254
|
VITTER,J.S.AND SHRIVER, E. A. M. 1994b. Algorithms for parallel memory II: Hierarchical multilevel memories. Algorithmica 12, 2-3, 148- 169.
|
| |
255
|
VITTER,J.S.AND VENGROFF, D. E. 1999. Notes.
|
 |
256
|
|
 |
257
|
Jeffrey Scott Vitter , Min Wang , Bala Iyer, Data cube approximation and histograms via wavelets, Proceedings of the seventh international conference on Information and knowledge management, p.96-104, November 02-07, 1998, Bethesda, Maryland, United States
[doi> 10.1145/288627.288645]
|
| |
258
|
|
| |
259
|
WANG, M., VITTER,J.S.,LIM, L., AND PADMANABHAN,S. 2001. Wavelet-based cost estimation for spatial queries, July.
|
| |
260
|
|
| |
261
|
WEINER, P. 1973. Linear pattern matching algorithm. In Proceedings of the IEEE Symposium on Switching and Automata Theory (Washington, DC), Vol. 14, 1-11.
|
 |
262
|
|
 |
263
|
Ouri Wolfson , Prasad Sistla , Bo Xu , Jutai Zhou , Sam Chamberlain, DOMINO: databases fOr MovINg Objects tracking, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.547-549, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
| |
264
|
WOMBLE, D., GREENBERG, D., WHEAT,S.,AND RIESEN,R. 1993. Beyond core: Making parallel computer I/O practical. In Proceedings of the DAGS Symposium on Parallel Computation (Hanover, NH, June), 56-63. Vol. 2, Dartmouth Institute for Advanced Graduate Studies.
|
| |
265
|
WU,C.,AND FENG, T. 1981. The universality of the shuffle-exchange network. IEEE Transactions on Computers C-30 (May), 324-332.
|
| |
266
|
YAO, A. C. 1978. On random 2-3 trees. Acta Informatica 9, 159-170.
|
| |
267
|
|
| |
268
|
|
| |
269
|
|
CITED BY 101
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Pankaj K. Agarwal , Lars Arge , Andrew Danner , Bryan Holland-Minkley, Cache-oblivious data structures for orthogonal range searching, Proceedings of the nineteenth annual symposium on Computational geometry, June 08-10, 2003, San Diego, California, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Martin Isenburg , Yuanxin (Leo) Liu , Jonathan Shewchuk , Jack Snoeyink, Illustrating the streaming construction of 2D delaunay triangulations, Proceedings of the twenty-second annual symposium on Computational geometry, June 05-07, 2006, Sedona, Arizona, USA
|
|
|
Lars Arge , Michael A. Bender , Erik D. Demaine , Bryan Holland-Minkley , J. Ian Munro, Cache-oblivious priority queue and graph algorithm applications, Proceedings of the thiry-fourth annual ACM symposium on Theory of computing, May 19-21, 2002, Montreal, Quebec, Canada
|
|
|
|
|
|
Lars Arge , Jeffrey S. Chase , Patrick Halpin , Laura Toma , Jeffrey S. Vitter , Dean Urban , Rajiv Wickremesinghe, Efficient Flow Computation on Massive Grid Terrain Datasets, Geoinformatica, v.7 n.4, p.283-313, December 2003
|
|
|
Paolo Ferragina , Nick Koudas , Divesh Srivastava , S. Muthukrishnan, Two-dimensional substring indexing, Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.282-288, May 2001, Santa Barbara, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Naga Govindaraju , Jim Gray , Ritesh Kumar , Dinesh Manocha, GPUTeraSort: high performance graphics co-processor sorting for large database management, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
Tamás Sarlós , Adrás A. Benczúr , Károly Csalogány , Dániel Fogaras , Balázs Rácz, To randomize or not to randomize: space optimal summaries for hyperlink analysis, Proceedings of the 15th international conference on World Wide Web, May 23-26, 2006, Edinburgh, Scotland
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Mohamed Y. Eltabakh , Wing-Kai Hon , Rahul Shah , Walid G. Aref , Jeffrey S. Vitter, The SBC-tree: an index for run-length compressed sequences, Proceedings of the 11th international conference on Extending database technology: Advances in database technology, March 25-29, 2008, Nantes, France
|
|
|
|
|
|
Pankaj K. Agarwal , Lars Arge , Thomas Mølhave , Bardia Sadri, I/o-efficient efficient algorithms for computing contours on a terrain, Proceedings of the twenty-fourth annual symposium on Computational geometry, June 09-11, 2008, College Park, MD, USA
|
|
|
David Kasik , Andreas Dietrich , Enrico Gobbetti , Fabio Marton , Dinesh Manocha , Philipp Slusallek , Abe Stephens , Sung-Eui Yoon, Massive model visualization techniques: course notes, ACM SIGGRAPH 2008 classes, August 11-15, 2008, Los Angeles, California
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Andrew Danner , Thomas Mølhave , Ke Yi , Pankaj K. Agarwal , Lars Arge , Helena Mitasova, TerraStream: from elevation data to watershed hierarchies, Proceedings of the 15th annual ACM international symposium on Advances in geographic information systems, November 07-09, 2007, Seattle, Washington
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Luca Becchetti , Paolo Boldi , Carlos Castillo , Aristides Gionis, Efficient semi-streaming algorithms for local triangle counting in massive graphs, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
|
|
|
Tiankai Tu , Charles A. Rendleman , David W. Borhani , Ron O. Dror , Justin Gullingsrud , Morten Ø. Jensen , John L. Klepeis , Paul Maragakis , Patrick Miller , Kate A. Stafford , David E. Shaw, A scalable parallel framework for analyzing terascale molecular dynamics simulation trajectories, Proceedings of the 2008 ACM/IEEE conference on Supercomputing, November 15-21, 2008, Austin, Texas
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
INDEX TERMS
Primary Classification:
F.
Theory of Computation
F.2
ANALYSIS OF ALGORITHMS AND PROBLEM COMPLEXITY
F.2.2
Nonnumerical Algorithms and Problems
Subjects:
Computations on discrete structures
Additional Classification:
B.
Hardware
B.4
INPUT/OUTPUT AND DATA COMMUNICATIONS
B.4.3
Interconnections (subsystems)
Subjects:
Parallel I/O
E.
Data
E.1
DATA STRUCTURES
Subjects:
Graphs and networks;
Trees
E.5
FILES
Subjects:
Sorting/searching
F.
Theory of Computation
F.1
COMPUTATION BY ABSTRACT DEVICES
F.1.1
Models of Computation
Subjects:
Bounded-action devices (e.g., Turing machines, random access machines);
Relations between models
F.2
ANALYSIS OF ALGORITHMS AND PROBLEM COMPLEXITY
F.2.2
Nonnumerical Algorithms and Problems
Subjects:
Geometrical problems and computations;
Sorting and searching
H.
Information Systems
H.2
DATABASE MANAGEMENT
H.2.2
Physical Design
Subjects:
Access methods
H.2.8
Database applications
Subjects:
Spatial databases and GIS
H.3
INFORMATION STORAGE AND RETRIEVAL
H.3.3
Information Search and Retrieval
Subjects:
Search process;
Information filtering
General Terms:
Algorithms,
Design,
Experimentation,
Performance,
Theory
Keywords:
B-tree,
I/O,
batched,
block,
disk,
dynamic,
extendible hashing,
external memory,
hierarchical memory,
multidimensional access methods,
multilevel memory,
online,
out-of-core,
secondary storage,
sorting
|