ACM Home Page
Please provide us with feedback. Feedback
Parallelism in relational data base systems: architectural issues and design approaches
Full text PdfPdf (2.50 MB)
Source International Symposium on Databases for Parallel and Distributed Systems archive
Proceedings of the second international symposium on Databases in parallel and distributed systems table of contents
Dublin, Ireland
Pages: 4 - 29  
Year of Publication: 1990
ISBN:0-8186-2052-8
Authors
Hamid Pirahesh  Data Base Technology Institute, IBM Almaden Research Center, San Jose, CA
C. Mohan  Data Base Technology Institute, IBM Almaden Research Center, San Jose, CA
Josephine Cheng  Data Base Technology Institute, IBM Santa Teresa Laboratory, San Jose, CA
T. S. Liu  Data Base Technology Institute, IBM Santa Teresa Laboratory, San Jose, CA
Pat Selinger  Data Base Technology Institute, IBM Almaden Research Center, San Jose, CA
Sponsors
IEEE-CS\TCDE : TC on Data Engineering
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 58,   Citation Count: 30
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/319057.319060
What is a DOI?

ABSTRACT

With current systems, some important complex queries may take days to complete because of: (1) the volume of data to be processed, (2) limited aggregate resources. Introducing parallelism addresses the first problem. Cheaper, but powerful computing resources solve the second problem. According to a survey by Brodie,1 only 10% of computerized data is in data bases. This is an argument for both more variety and volume of data to be moved into data base systems. We conjecture that the primary reasons for this low percentage are that data base management systems (DBMSs) still need to provide far greater functionality and improved performance compared to a combination of application programs and file systems. This paper addresses the issues and solutions relating to intraquery parallelism in a relational DBMS supporting SQL. Instead of focussing only on a few algorithms for a subset of the problems, we provide a broad framework for the study of the numerous issues that need to be addressed in supporting parallelism efficiently and flexibly. We also discuss the impact that parallelization of complex queries has on short transactions which have stringent response time constraints. The pros and cons of the shared nothing, shared disks and shared everything architectures for parallelism are enumerated. The impact of parallelism on a number of components of an industrial-strength DBMS are pointed out. The different stages of query processing during which parallelism may be gainfully employed are identified. The interactions between parallelism and the traditional systems' pipelining technique are analyzed. Finally, the performance implications of parallelizing a specific complex query are studied. This gives us a range of sample points for different parameters of a parallel system architecture, namely, I/O and communication bandwidth as a function of aggregate MIPS.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
ABCGK79
AgSe89
AlCo88
 
AnCK89
Andrade, J., Carges, M., Kovach, K. Building a Transaction Processing System on UN/X Systems, Proc. Unix Transaction Processing Workshop, Pittsburgh, May 1989, USENIX Association.
 
AnCo88
Anderson, M., Cole, R. An Integrated Data Base, In IBM Application System/400 Technology, Document Number SA21-9540, IBM, June 1988.
 
Anon85
Anon, et al. A Measure of Transaction Processing Power, Datamation, April 1, 1985.
BeCh81
 
Bhid88
BoPu88
 
Bora88a
Boral, H. Parallelism and Data Management, Technical Report ff umber ACA-ST-156-88. MCC, May 1988.
 
Bora88b
 
Borr81
Borr, A. Transaction Monitoring in Encompass: Reliable Distributed Transaction Processing, Proc. 7th International Conference on Very Large Data Bases, Cannes, September 1981.
 
Borr84
CABK88
 
CaHS85
Carr, C., Huddleston, R.L., Strickland, J. Method and Means for the Retention of Locks Across System, Subsystem, and Communication Failures in a Multiprocessing, Multiprogramming, Shared Data Environment, U.S. Patent 4,480,304, IBM, 1985.
 
ChGY81
Chamberiin, D., Gilbert, A., Yost, R. A History of System R and SQL/Data System, Proc. 7th International Conference on Very Large Data Bases, Cannes, September 1981.
 
ChGr85
Chan, A., Gray, R. Implementing Distributed Read-Only Transactions, IEEE Transactions on Software Engineering, Vol. SE-I 1, No. 2. 1985.
 
CHHIM90
Cheng, J., Haderle, D., Hedges, R., lyer, B., Messinger, T., Mohan, C., Wang, Y. An Efficient Hybrid Join Algorithm: Design, Prototype, Modelling and Measurement, fBM Research Report, IBM Almaden Research Center, March 1990.
 
ChMy88
 
CLSW84
Cheng, J., Loosely, C., Shibamiya, A., Worthington, P. IBM Database 2 Performance: Design, Implementation, and Tuning, IBM Systems Journal, Vol. 23, No. 2, 1984.
 
Corn88
Cornelis, R. Site Autonomy in a Distributed Database Environment, Proc. IEEE Compcon Spring '88, San Francisco, March 1988.
DeGS88
 
DeSB87
 
DGRS89
Duppel, N., Gugel, D., Reuter, A., Schiele, G. Progress Report #S of PROSPECT, University of Stuttgart, November 1989.
 
DGRSZ88
Duppel, N., Gugel. D.. Reuter, A., Schiele, G., Zeller, H. Progress Report #4 of PROSPECT, University of Stuttgart, 1988.
 
DIRY89
 
Duqu87
 
EGKS89
Englert. S.. Gray, J., Kocher, T., Shah, P. A Benchmark of Nonstop SQL Release 2 Demonstrating Near-Linear Speedup and Scaleup on Large Databases, Technical Report 89.4, Tandem Computers, May 1989.
 
Epst88
Epstein, R. RDBMS Architecture Boosts Performance, Reliability, Mini- Micro Systems, August 1988.
 
FTSN89
Serlin O., editor DEC 9000 to Challenge IBM 3090, FT Systems News Letter, Issue No. 86, Oct. 24 1989.
 
GaKi85
Gawlick, D., Kinkade, D. Varieties of Concurrency Control in IMSfVS Fast Path, IEEE Database Engineering, Vol. 8, No. 2, June 1985.
 
GeDe87
Gerber, R., Dewitt, D. The impact of Hardware and Software Alternatives on the Performance of the Gamma Database Machine, Computer Sciences Department Technical Report #708, University of Wisconsin at Madison, July 1987.
Grae90
 
Gray78
 
Haer88
 
HaJa84
Haderle, D., Jackson, R. IBM Database 2 Overview, IBM Systems Journal, Vol. 23, No. 2, 1984.
 
HaSS88
 
HaSS89
Haerder, T., Schoning, H., Sikeler, A. Evaluation of Hardware Architectures for Parallel Execution of Complex Database Operations, Proc. 3rd Annual Parallel Processing Symposium, Fullerton, p584-578, 1989. Also Available as Report No. 23/89, University of Kaiserslautern, April 1989.
HFLP89
 
Hobs87
 
HsDe90
 
IyDi90
 
IyRV89
 
JoRo89
Joshi, A., Rodwell, K. A Relational Database Management System for Production Applications, Digital Technical Journal, No. 8, February 1989.
 
KITM83
Kitsuregawa, M., Tanaka, H., Moto- Oka, T. Application of Hash to Data Base Machine and Its Arachitecture, New Generation Computing, Vol. 1, pp 63-74, 1983.
KrLS86
 
LDHSY89
Lorie, R., Daudenarde, J., Hallmark, G., Stamos, J., Young, H. Adding Intra-Transaction Parallelism to an Existing DBMS: Early Experience, Data Engineering, Vol. 12, No. I, March 1989. Also Available as IBM Research Report RJ6165, ISM Almaden Research Center, 1988.
 
LMHDL85
Lohman, G., Mohan, C., Haas, L., Daniels, D., Lindsay, B., Selinger, P., Wilms, P. Query Processing in R*, In Query Processing in Database Systems, W. Kim, D. Reiner, and D. Batory (Ed-s.), Springer-Verlag, 1985. Also Available as IBM Research Report RJ4272, April 1984.
 
Loos86
Loosley, C. Measuring ISM Database 2 Release 2 - The Benchmark Game, InfoDB, Vol. I, No. 2, 1986.
 
LoYo89
 
McGe77
McGee, W.C. The information Management System IMSIVS - Part IV: Data Communication Facilities, IBM Systems Journal, Vol. 16, No. 2, 1977.
MHLPS89
 
MHWC90
 
MiSu88
 
Moha89
Mohan, C. ARIESIKVL: A Key-Value Locking Method for Concurrency Control of Multiaction Transactions Operating on B-Tree Indexes, IBM Research Report RJ7008, IBM Almaden Research Center, September 1989.
 
Moha90
Mohan, C. Commit-LSN: A Novel and Simple Method for Reducing Locking and Latching in Transaction Processing Systems, IBM Research Report RJ7344, IBM Almaden Research Center, February 1990.
 
MoLe89
Mohan, C., Levine, F. ARIESIIM: An Efficient and High Concurrency index Management Method Using Write- Ahead Logging, IBM Research Report RJ6846, IBM Almaden Research Center, August 1989.
MoLO86
 
MoNa90
Mohan, C., Narang, I. ARIES/SD: A Transaction Recovery and Concurrency Control Method for the Shared Disks Environment, IBM Research Report, IBM Almaden Research Center, Forthcoming, 1990.
 
MoNP90
Mohan. C., Narang, I., Palmer, J. A Case Study of Problems in Migrating to Distributed Computing: Page Recovery Using Multiple Logs in the Shared Disks Environment, IBM Research Report RJ7343, IBM Almaden Research Center, March 1990.
 
MoPi90
Mohan, C., Pirahesh, H. ARIES-RRH: Restricted Repeating of History in the ARIES Transaction Recovery Method, IBM Research Report RJ7342, IBM Almaden Research Center, February 1990.
 
Nech86
Neches, P. The Anatomy of a Data Base Computer - Revisited, Proc. IEEE Compcon Spring '86, San Francisco, March 1986.
 
ObSW83
Obermarck, R., Strickland, J., Watts, V. Mefhod and Means for the Sharing of Data Resources in a Multiprocessing, Multiprogramming Environment, U.S. Patent 4,399,504, IBM, August 1983.
 
OlRo89
 
Onei89
O'Neil, P. Revisiting DBMS Benchmarks, Datamation, p47, 48, 52, 54, September 15, 1989.
PaGK88
PeSt83
 
Rahm87
Rahm, E. Design of Optimistic Methods for Concurrency Control in Database Sharing Systems, Proc. 7th International Conference on Distributed Computing Systems, Berlin, September 1987.
 
Rahm88
Rahm, E. Design and Evaluation of Concurrency and Coherency Control Techniques for Database Sharing Systems, Internal Report 182188, University of Kaiserslautern, 1988.
 
Rahm89a
Rahm, E. A Framework for Workload Allocation in Distributed Transaction Systems, Technical Report, University of Kaiserslautern, September 1989.
 
Rahm89b
Rahm, E. Recovery Considerations for Data Sharing Systems, Technical Report, University of Kaiserslautern, October 1989.
 
Reed78
 
ReSW89
Rengarajan, T.K., Spiro, P., Wright, W. High Availability Mechanisms of VAX DBMS Software, Digital Technical Journal, No. 8, February 1989.
 
Ross89
Ross, F. An Overview of FDDI: The Fiber Distributed Data Interface, IEEE Transactions on Selected Areas in Communications, Vol. 7, No. 7, September 1989.
SACL79
 
SaGa86
 
ScDG89
Schneider, D.. Dewitt, D., Ghandeharizadeh, S. An Overview of the Gamma Database Machine, Proc. IEEE Compcon Spring '89, San Francisco, March 1989.
 
Sche87
 
Scru87
Scrutchin, T. TPF: Performance, Capacity, Availability, Proc. IEEE Compcon Spring '87, San Francisco, February 1987.
SeAd80
 
Serl89
Serlin, O. ClCS is Key to IBM's OLTP Success, UNIX World, November 1989.
 
Shoe86
Shoens, K. Data Sharing vs. Partitioning for Capacity and Availability, Database Engineering, Vol. 9, No. I, March 1986.
 
Siwi77
Siwiec, J.E. A High-Performance DBl DC System, IBM Systems Journal, Vol. 16, No. 2, 1977.
 
SKPO88
 
SMLC86
 
SMMTG84
Sekino, A., Moritani, K., Masai, T., Tasaki, T.. Goto, K. The DCS - A New Approach to Multisystem Data- Sharing, Proc. National Computer Conference, Las Vegas, July 1984.
 
SNOP85
Sheens, K., Narang, I., Obermarck, R., Palmer, J., Silen, S., Traiger, I., Treiber, K. Amoeba Project, Proc. IEEE Compcon Spring '85, San Francisco, February 1985.
 
SQL23
Melton, J. (Ed.) (/SO-ANSI Working Draft) Database Language SQL2 and SQL3, X3H2-90-001 and DBL SEL-3, ISOllEC JTCl ISC21NVG3 N986, December 1989.
 
SSSHD87
 
StAS89
Stonebraker, M., Aoki, P., Seltzer, M. Parallelism in XPRS, Memorandum No. UCBlERL M89/16, University of California, Berkeley, February 1989.
 
Ston86
Stonebraker, M. The Case for Shared Nothing, IEEE Database Engineering, Vol. 9, No. 1, 1986.
 
StUW82
Strickland, J., Uhrowczik, P., Watts, V. IMSIVS: An Evolving System, IBM Systems Journal, Vol. 21, No. 4, 1982.
 
Tand87
Tand88
 
TeGu84
Teng, J., Gumaer, R. Managing IBM Database 2 Buffers to Maximize Performance, IBM Systems Journal, Vol. 23, No. 2, 1984.
 
Tera88
Teradata DBCIIO12 Data Base Computer Concepts and Facilities - Release 3.1, Documber Number CO2-0001-05, Teradata Corp., May 1988.
 
TPC89
Transaction Processing Performance Council TPC Benchmark A, Draft 6-PR Proposed Standard, 1989. Available from ITOM International Co., POB 1450, Los Altos, CA 94023.
 
TuOB89
Turbyfill, C., Orji, C., Bitton, D. AS3AP- A Comparative Relational Database Benchmark, Proc. IEEE-Compcon Spring '89, San Francisco, February 1989.
 
Weih87
 
Wolf88

CITED BY  31

Collaborative Colleagues:
Hamid Pirahesh: colleagues
C. Mohan: colleagues
Josephine Cheng: colleagues
T. S. Liu: colleagues
Pat Selinger: colleagues