ACM Home Page
Please provide us with feedback. Feedback
Query evaluation techniques for large databases
Full text PdfPdf (9.37 MB)
Source ACM Computing Surveys (CSUR) archive
Volume 25 ,  Issue 2  (June 1993) table of contents
Pages: 73 - 169  
Year of Publication: 1993
ISSN:0360-0300
Author
Goetz Graefe  Portland State Univ., Portland, OR
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 167,   Downloads (12 Months): 1031,   Citation Count: 240
Additional Information:

abstract   references   cited by   index terms   review   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/152610.152611
What is a DOI?

ABSTRACT

Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate the problem: In order to manipulate large sets of complex objects as efficiently as today's database systems manipulate simple records, query-processing algorithms and software will become more complex, and a solid understanding of algorithm and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
4
 
5
 
6
7
 
8
9
10
11
 
12
13
14
 
15
16
 
17
18
 
19
 
20
BAYER, R., AND MCCREIGHTON, E. 1972. Organisation and maintenance of large ordered indices. Acta Informatica 1, 3, 173.
 
21
22
23
24
25
26
27
 
28
 
29
 
30
 
31
 
32
 
33
 
34
35
 
36
BITTON-FRIEDLAND, D. 1982. Design, analysis, and implementation of parallel external sorting algorithms PhD. Thesis, Univ. of Wisconsin-Madison.
 
37
38
 
39
 
40
41
 
42
BLASGEN, M., AND ESWARAN, K. 1977. Storage and access in relational databases. IBM Syst. J. 1G, 4, 363.
 
43
BLASGEN, M., AND ESWARAN, K. 1976. On the evaluation of queries in a relational database system IBM Res. Rep RJ 1745, IBM, San Jose, Calif.
44
 
45
 
46
 
47
 
48
 
49
BROWN, K. P., CAREY, M. J., DEWITT, D. J., MEHTA, M., AND NAUGHTON, J. F. 1992. Scheduling issues for complex database workloads. Computer Science Tech. Rep. 1095, Univ. of Wisconsin--Madison.
 
50
51
52
 
53
 
54
 
55
 
56
CARTER, J. L., AND WEGMAN, M.N. 1979. Universal classes of hash functions. J. Comput. Syst. Scz. 18, 2, 143.
57
58
59
60
 
61
 
62
 
63
64
 
65
 
66
67
 
68
 
69
CLUET, S., DELOBEL, C., LECLUSE, C., ANn RICHARD, P. 1989. Reloops, an algebra based query language for an object-oriented database system. In Proceedings of the 1st International Conference on Deductive and Object-Ortented Databases (Kyoto, Japan, Dec. 4-6).
70
71
72
 
73
DANIELS, D., AND NG, P. 1982. Distributed query compilation and processing in R*. IEEE Database Eng. 5, 3 (Sept.).
 
74
75
 
76
DAVIS, D. D. 1992. Oracle's parallel punch for OLTP. Datamation (Aug. 1), 67.
77
 
78
 
79
 
80
DESHPANDE, V., AND LARSON, P.A. 1991. An algebra for nested relations with support for nulls and aggregates. Computer Science Dept., Univ. of Waterloo, Waterloo, Ontario, Canada.
 
81
 
82
DEWI~?, D.J. 1991. The Wisconsin benchmark: Past, present, and future. In Database and Transaction Processing System Performance Handbook. Morgan-Kaufman, San Mateo, Calif.
 
83
DEWITT, D. J., m'~D G~RBER, R.H. 1985. Multiprocessor hash-based join algorithms. In Proceedings of the International Conference on Very Large Dato Bases (Stockholm, Sweden, Aug.). VLDB Endowment, 151.
84
 
85
DEWITT, D. J., AND HAWTHORN, P. B. 1981. A performance evaluation of database machine architectures. In Proceedings of the International Conference on Very Large Data Bases (Cannes, France, Sept.). VLDB Endowment, 199.
 
86
87
 
88
89
 
90
 
91
 
92
93
94
95
 
96
ENGLERT, S., GRAY, J., KOCHER, R., AND SHAH, P. 1989. A benchmark of nonstop SQL release 2 demonstrating near-linear speedup and scaleup on large databases. Tandem Computers Tech. Rep. 89.4, Tandem Corp., Cupertino, Calif.
 
97
EPSTEIN, R. 1979. Techniques for processing of aggregates in relational database systems UCB/ERL Memo. M79/8, Univ. of California, Berkeley, Calif.
 
98
EPSTEIN, R., AND STONEBRAKER, M. 1980. Analysis of distributed data base processing strategies. In Proceedings o/the International Conference on Very Large Data Bases iMontreal, Canada, Oct.). VLDB Endowment, 92.
99
100
101
 
102
 
103
 
104
FINKEL, R. A., AND BENTLEY, J. L. 1974. Quad trees: A data structure for retrieval on composite keys. Acta Informatzca 4, 1, 1.
105
 
106
107
 
108
 
109
 
110
GOODMAN, J. R., AND WORST, P. J 1988. The Wisconsin Multicube: A new large-scale cachecoherent multiprocessor. Computer Science Tech Rep. 766, Umv. of Wisconsin Madison
111
 
112
GRAEFE, G. 1993a. Volcano, An extensible and parallel datafiow query processing system. IEEE Trans. Knowledge Data Eng. To be published.
 
113
GRAEFE, G. 1993b. Performance enhancements for hybrid hash join Available as Computer Science Tech. Rep. 606, Univ. of Colorado, Boulder.
 
114
GRAEFE, G. 1993c Sort-merge-join: An idea whose time has passed? Revised in Portland State Univ. Computer Science Tech. Rep. 93-4.
 
115
 
116
GRAEFE, G. 1990a. Parallel external sorting in Volcano. Computer Science Tech. Rep. 459, Umv. of Colorado, Boulder.
117
 
118
 
119
GRAEFE, G., AND COLE, R. L. 1993. Fast algorithms for universal quantification in large databases. Portland State Univ. and Univ. of Colorado at Boulder.
 
120
121
 
122
 
123
 
124
GRAEFE, G., AND SHAPIRO, L.D. 1991. Data compression and database performance. In Proceedings of the ACM/IEEE-Computer Science Symposium on Applied Computing. ACM/IEEE, New York.
125
 
126
GRAErE, G., AND WOLNmWICZ, R.H. 1992. Algebraic optimization and parallel execution of computations over scientific databases. In Proceedings of the Workshop on Metadata Management in Sc~ent~ftc Databases (Salt Lake City, Utah, Nov. 3-5).
 
127
GRAEFE, G., COLE, R. L., DAVISON, D. L., McKENNA, W. J., AND WOLNmW~CZ, R.H. 1992. Extensible query optimization and parallel execution in Volcano. In Query Processing for Advanced Database Applications. Morgan-Kaufman, San Mateo, Calif.
 
128
GRASFE, G., LINWLLE, A., AND SHAPIRO, L.D. 1993. Sort versus hash revisited. IEEE Trans. Knowledge Data Eng. To be published.
 
129
GRAY, J. 1990. A census of Tandem system availability between 1985 and 1990. Tandem Computers Tech. Rep. 90.1, Tandem Corp., Cupertino, Calif.
130
 
131
132
133
 
134
 
135
GUIBAS, L., AND SEDGEWICK, R. 1978. A dichromatic framework for balanced trees. In Proceedings of the 19th Symposium on the Foundations of Computer Science.
 
136
 
137
 
138
 
139
 
140
 
141
 
142
143
 
144
HAAs, L. M., SELINGER, P. G., BERTINO, E., DANIELS, D., LINDSAY, B., LEHMAN, G., MASUNAGA, Y., MOHAN, C., NG, P., WmMs, P., AND YOST, R. 1982. R*: A research project on distributed relational database management. IBM Res. Division, San Jose, Calif.
145
 
146
 
147
 
148
HAMMING, R. W. 1977. Digital Filters. Prentice- Hall, Englewood Cliffs, N.J.
149
 
150
151
 
152
 
153
154
155
156
 
157
 
158
 
159
160
161
 
162
163
 
164
165
 
166
167
168
169
170
171
172
173
 
174
175
176
 
177
178
179
 
180
 
181
 
182
KiTSUREGAWA, M., TANAKA, H., AND MOTOOKA, T. 1983. Application of hash to data base machine and its architecture. New Gener. Camput. 1, 1, 63.
 
183
184
185
 
186
187
 
188
 
189
KooI, R. P., AND FRANI~'ORTH, D. 1982. Query optimization in Ingres. IEEE Database Eng. 5, 3 (Sept.), 2.
 
190
 
191
 
192
 
193
 
194
 
195
 
196
197
198
 
199
LARSON, P., AND YANG, H. 1985. Computing queries from derived relations. In Proceedings of the International Conference on Very Large Data Bases (Stockholm, Sweden, Aug.). VLDB Endowment, 259.
200
201
 
202
 
203
 
204
 
205
206
 
207
LOHMAN, G., MOHAN, C., HAAS, L., DANIELS, D., LINDSAY, B., SELINGER, P., AND WILMS, P. 1985. Query processing in R~. In Query Processing m Database Systems. Springer, Berlin, 31.
208
209
210
 
211
Lorenz, R. A., AND NmSSON, J.F. 1979. An access specification language for a relational database management system IBM J. Res. Devel. 23, 3 (May), 286
 
212
 
213
LYNCH, C A., AND BROWNRIGG, E.B. 1981. Application of data compression to a large bibliographic data base In Proceedings of the Internatmnal Conference on Very Large Data Bases (Cannes, France, Sept.). VLDB Endowment, 435
214
215
 
216
 
217
 
218
MAIER, D., GRAEFE, G., SHAPIRO, L., DANIELS, S., KELLER, T., AND VANCE~ B. 1992 Issues in distributed complex object assembly In Proceedings of the Workshop on Distributed Object Management (Edmonton, BC, Canada, Aug.).
219
220
 
221
MEDEIROS, C., AND TOMPA, F. 1985. Understandmg the implications of view update pohcms. In Proceedings of the International Conference on Very Large Data Bases (Stockholm, Sweden, Aug.). VLDB Endowment, 316.
 
222
223
 
224
 
225
 
226
 
227
 
228
 
229
NECHES, P M. 1988. The Ynet: An interconnect structure for a highly concurrent data base computer system. In Proceedings o/ the 2rid Symposium on the Frontiers of Massively Parallel Computatmn (Fairfax, Virginia, Oct.).
 
230
231
232
233
 
234
NYBERG, C., BERCLAY, T., CVETANOVIC, Z., GRAY. J., AND LOMET, D. 1993. AlphaSort: A RISC machine sort. Teeh. Rep. 93.2. DEC San Francisco Systems Center. Digital Equipment Corp., San Francisco.
 
235
 
236
OMIECINSKI, E. 1985. Incremental file reorganization schemes. In Proceedings of the International Conference on Very Large Data Bases (Stockholm, Sweden, Aug.). VLDB Endowment, 346.
 
237
 
238
 
239
OUSTERHOUT, J. 1990. Why aren't operating systems getting faster as fast as hardware. In USENIX Summer Conference (Anaheim, Calif., June). USENIX.
 
240
241
 
242
 
243
 
244
245
246
 
247
 
248
REW, R. K., AND DAWS, G.P. 1990. The Unidata NetCDF: Software for scientific data access. In the 6th Internahonal Con/~rence on Interactive Information and Processing Systems for Meterology, Oceanography, and Hydrology (Anaheim, Calif.).
249
250
251
 
252
ROSENTHAL, A., AND REINER, D.S. 1985. Querying relational views of networks. In Query Processing in Database Systems. Springer, Berlin, 109.
 
253
ROSENTHAL, h., RICH, C., AND SCHOLL, M. 1991. Reducing duplicate work in relational join(s): A modular approach using nested relations. ETH Tech. Rep., Zurich, Switzerland.
 
254
255
256
257
 
258
 
259
RUTH, S. S , AND KEUTZER, P J 1972. Data compression for business files. Datamatlon 18 (Sept.), 62.
 
260
 
261
262
 
263
 
264
SACKS-DAVIS, R., AND RAMAIVIOHANARAO, K. 1983. two-level superimposed coding scheme for partial match retrieval. In{. Syst. 8, 4, 273.
265
 
266
 
267
268
269
 
270
 
271
SCHNEIDER, D.A. 1991. Bit filtering and multiway join query processing. Hewlett-Packard Labs, Pale Alto, Calif. Unpublished Ms
 
272
 
273
274
 
275
SCHOLL, M.H. 1988. The nested relational model Efficient support for a relational database interface. Ph.D. thesis, Technical Univ. Darmstadt. In German.
 
276
277
 
278
279
280
 
281
SEPPI, K., BARNES, J., AND MORRIS, C. 1989. A Bayesian approach to query optimization m large scale data bases The Univ. of Texas at Austin ORP 89-19, Austin.
 
282
SERLIN, O. 1991. The TPC benchmarks. In Database and Transactmn Processing System Performance Handbook. Morgan-Kaufman, San Mateo, Callf
 
283
 
284
SEVEPaNCF~, D.G. 1983. A practitioner's guide to data base compression. Inf. Syst. 8, 1, 51.
285
 
286
287
 
288
 
289
 
290
291
292
293
294
 
295
296
 
297
298
299
300
 
301
302
 
303
STAMOS, J. W., AND YOUNG, H. C. 1989. A symmetric fragment and replicate algorithm for distributed joins. Tech. Rep. RJ7188, IBM Research Labs, San Jose, Calif.
304
 
305
 
306
STONEBRAKER, M. 1986a. The case for sharednothing. IEEE Database Eng. 9, 1 (Mar.).
 
307
308
309
 
310
STONEBRAKER, M., AOKI, P., AND SELTZER, M. 1988a. Parallelism in XPRS. UCB/ERL Memorandum M89/16, Univ. of California, Berkeley.
311
 
312
 
313
 
314
STRAUBE, D. D., AND OZSU, M. T. 1989. Query transformation rules for an object algebra. ept. of Computing Sciences Tech. Rep. 89-23, Univ. of Alberta, Albertm Canada.
 
315
316
317
 
318
TERADATA. 1983. DBC/1012 Data Base Computer, Concepts and Facilities. Teradata Corporation, Los Angeles.
319
 
320
321
322
323
 
324
TUKEY, J. W. 1977. Exploratory Data Analyas. Addison-Wesley, Reading, Mass.
 
325
UNIDATA 1991. NetCDF User's Guide, An Interface fbr Data Access, Versmn I.H. NCAR Tech Note TS-334 + 1A, Boulder, Colo
326
327
 
328
 
329
330
 
331
WHANG, K. Y., WIEDERHOLD G., AND SAGALOWICZ, D. 1985 The property of separahlity and its application to physical database demgn. In Query Processmg m Database Systems. Springer, Berhn, 297.
 
332
WHANG, K. Y., WIEDERHOLD, G., AND SAGLOWICZ, D. 1984. Separability An approach to physical database design. IEEE Trans. Comput. 33, 3 (Mar.), 209.
 
333
 
334
WILSCHUT, A. N. 1993. Parallel query executmn in a mare memory database system. Ph.D. the sis, Univ. of Tweuk, The Netherlands.
 
335
336
 
337
 
338
339
340
 
341
 
342
YOUSSEFL K, ANn WONG, E 1979. Query processing in a relational database management system. In Proceedmg,s of the Internatmnal Conference on Very Large Data Bases (Rio de Janeiro, Oct ). VLDB Endowment, 409.
343
 
344
 
345
346
 
347
ZELLER, H. 1990. Parallel query execution in NonStop SQL. In Dzgest of Papers, 35th Comp- Con Conference. San Francisco.
 
348

CITED BY  240


REVIEW

"Margaret H. Dunham : Reviewer"

Graefe has written the first comprehensive study of database query processing techniques. It is an extremely well written overview of various query processing problems and solutions. As stated in the introduction, its target is to survey algor  more...