ACM Home Page
Please provide us with feedback. Feedback
Optimization of join operations in horizontally partitioned database systems
Full text PdfPdf (1.74 MB)
Source ACM Transactions on Database Systems (TODS) archive
Volume 11 ,  Issue 1  (March 1986) table of contents
Pages: 48 - 80  
Year of Publication: 1986
ISSN:0362-5915
Author
Arie Segev  Univ. of California
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 46,   Citation Count: 12
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/5236.5241
What is a DOI?

ABSTRACT

This paper analyzes the problem of joining two horizontally partitioned relations in a distributed database system. Two types of semijoin strategies are introduced, local and remote. Local semijoins are performed at the site of the restricted relation (or fragment), and remote semijoins can be performed at an arbitrary site. A mathematical model of a semijoin strategy for the case of remote semijoins is developed, and lower bounding and heuristic procedures are proposed. The results of computational experiments are reported. The experiments include an analysis of the heuristics' performance relative to the lower bounds, sensitivity analysis, and error analysis. These results reveal a good performance of the heuristic procedures, and demonstrate the benefit of using semijoin operations to reduce the size of fragments prior to their transmission. The algorithms for the case of remote semijoins were found to be superior to the algorithms for the case of local semijoins. In addition, we found that the estimation accuracy of the selectivity factors has a significant effect on the incurred communication cost.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
ApERS, P. M.G. Distributed query processing: Minimum response time schedules for relations. IR 53, Vrije Univ., Amsterdam, Nov. 1979.
 
2
APERS, P. M. G., HEVNER, A. R., AND YAO, S.B. Optimization algorithms for distributed queries. IEEE Trans. Softw. Eng. SE-9, 1 (Jan. 1983), 57-68.
3
4
 
5
6
 
7
CORNUEJOLS, G., F{SH~R, M. L., AND NEMHAUSER, G.L. Location of bank accounts to optimize float. Manage. Sci. 23, 8 (Apr. 1977), 789-810.
 
8
DANIELS, D. Query compilation in a distributed database system. IBM Res. Rep. RJ3423, Mar. 1982.
 
9
DANIELS, D., SELINGER, P., HAAS, L., LINDSAY, B., MOHAN, C., WALKER, A., AND WILMS, P. An introduction to distributed query compilation in R*. {BM Res. Pep. RJ3497, June 1982.
 
10
11
 
12
ERLENKOTTER, D. A dual-based procedure for uncapacitated facility location. Oper. Res. 26, 1 (1978), 992-1009.
 
13
GAVISH, B., AND SRIKANTH, S.K. Optimal solution methods for large-scale multiple traveling salesman problem. Working Paper, Graduate School of Management, Univ. of Rochester, Rochester, N.Y., 1980.
 
14
GAVlSH, B. Topological design of centralized computer networks--formulations and algorithms. Networks 12 (1982), 355-377.
 
15
GAVISH, B., AND SEGEV, A. Set query optimization in horizontally partitioned distributed database systems. Working Paper QM 8304, Graduate School of Management, Univ. of Rochester, Rochester, N.Y., 1983.
 
16
GEOFFRION, A.M. Lagrangian relaxation and its uses in integer programming. Math. Program. Stud. 2 (1974), 82-I 14.
 
17
GEOFFRION, A. M., AND GRAVES, G.W. Multicommodity distribution system design by Benders decomposition. Manage. Sci. 20 (1974}, 822-844.
 
18
GEOFFRION, A. M., AND MCBRIDGE, R. Lagrangian relaxation applied to capacitated facility location problems. AIIE Trans. 10 (1978), 40-47.
 
19
GOODMAN, N., ET AL. Query processing in SDD-1: A system for distributed databases. Computer Corporation of America, Tech. Rep. CCA-79-06, 1979.
 
20
HELD, S., AND KARP, R. M. The traveling salesman problem and minimum spanning trees. Oper. Res. 18 (I970), 1138-1162.
 
21
HELD, G. D., STONEBRAKER, M., AND WONG, E. INGRES--a relational DBMS. In Proceedings of AFIPS I975 National Computer Con{erence (Arlington, Va., 1975), 409-416.
 
22
HEVNER, A. R., AND YAO, S.B. Query processing in distributed database systems. IEEE Trans. So{tw. Eng. SE-5, 3 (May 1979), 177-187.
 
23
HOROWITZ, E., AND SAHNI, S. Fundamentals of Computer Algorithms. Computer Science Press, Rockville, Md., 1978.
 
24
MULGREW, T. Standard Oil of California. Seminar at the University of California, Berkeley, 1984.
 
25
PAIK, I.-S., AND DELO~EL, C. A strategy for optimizing the distributed query processing. In Proceedings o{ the 1st International Conference on Distributed Computing Systems (Huntsville, Ala., Oct. 1979).
 
26
PEEBLES, R. W., AND MANNINa, E.D. A computer architecture for large {distributed) databases. In Proceedings o{ the Conference on Very Large Databases (Framingham, Mass., Sept. 1975}.
 
27
PELAGATTI, G., AND SCHREIBER, F.A. A model of an access strategy in a distributed database system. In Proceedings of the }FIP-TC2, Database Architecture (Venice, 1979).
 
28
ROTHNIE, J. B., AND GOODMAN, N., ET AL. Query processing in SDD-I: a system for distributed databases. Computer Corporation of America, Tech. Rep. 79-06, Oct. 1979.
 
29
SANDI, C. Subgradient optimization. In Combinatorial Optimization, N. Christofides, A. Mingizzi, P. Toth, and C. Sandi, Eds., Wiley, New York, 1979.
 
30
SEOEV, A. Optimizing 2-way joins in fragmental database systems. Working Paper MS-2, School of Business Administration, Univ. of California, Berkeley, 1983.
 
31
SEGEV, A. Algorithms for distributed data processing. Working Paper, School of Business Administration, Univ. of California, Berkeley, 1985.
 
32
SELINGER, P. G., AND ADIBA, M.E. Access path selection in distributed database management systems. In Proceedings of the International Conference on Databases (July 1980}, 204-215.
 
33
STONEBRAKER, M., AND NEUHOLD, E. A distributed database version of INGRES. In Proceedings of the 3rd Berkeley Workshop on Distributed Data Management and Computer Networks (1977}.
 
34
WILLIAMS, R., ET AL. R*: An overview of the architecture. IBM Res. Rep. RJ3325, Dec. 1981.
35
 
36
WONO, E. Retrieving dispersed data from SDD-I: A system for distributed databases. In Proceedings of the 3rd Berkeley Workshop on Distributed Data Management and Computer Network~ (May 1977), 217-235.
37

CITED BY  12