|
ABSTRACT
This paper analyzes the problem of joining two horizontally partitioned relations in a distributed database system. Two types of semijoin strategies are introduced, local and remote. Local semijoins are performed at the site of the restricted relation (or fragment), and remote semijoins can be performed at an arbitrary site. A mathematical model of a semijoin strategy for the case of remote semijoins is developed, and lower bounding and heuristic procedures are proposed. The results of computational experiments are reported. The experiments include an analysis of the heuristics' performance relative to the lower bounds, sensitivity analysis, and error analysis. These results reveal a good performance of the heuristic procedures, and demonstrate the benefit of using semijoin operations to reduce the size of fragments prior to their transmission. The algorithms for the case of remote semijoins were found to be superior to the algorithms for the case of local semijoins. In addition, we found that the estimation accuracy of the selectivity factors has a significant effect on the incurred communication cost.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
ApERS, P. M.G. Distributed query processing: Minimum response time schedules for relations. IR 53, Vrije Univ., Amsterdam, Nov. 1979.
|
| |
2
|
APERS, P. M. G., HEVNER, A. R., AND YAO, S.B. Optimization algorithms for distributed queries. IEEE Trans. Softw. Eng. SE-9, 1 (Jan. 1983), 57-68.
|
 |
3
|
Philip A. Bernstein , Nathan Goodman , Eugene Wong , Christopher L. Reeve , James B. Rothnie, Jr., Query processing in a system for distributed databases (SDD-1), ACM Transactions on Database Systems (TODS), v.6 n.4, p.602-625, Dec. 1981
[doi> 10.1145/319628.319650]
|
 |
4
|
|
| |
5
|
|
 |
6
|
|
| |
7
|
CORNUEJOLS, G., F{SH~R, M. L., AND NEMHAUSER, G.L. Location of bank accounts to optimize float. Manage. Sci. 23, 8 (Apr. 1977), 789-810.
|
| |
8
|
DANIELS, D. Query compilation in a distributed database system. IBM Res. Rep. RJ3423, Mar. 1982.
|
| |
9
|
DANIELS, D., SELINGER, P., HAAS, L., LINDSAY, B., MOHAN, C., WALKER, A., AND WILMS, P. An introduction to distributed query compilation in R*. {BM Res. Pep. RJ3497, June 1982.
|
| |
10
|
|
 |
11
|
|
| |
12
|
ERLENKOTTER, D. A dual-based procedure for uncapacitated facility location. Oper. Res. 26, 1 (1978), 992-1009.
|
| |
13
|
GAVISH, B., AND SRIKANTH, S.K. Optimal solution methods for large-scale multiple traveling salesman problem. Working Paper, Graduate School of Management, Univ. of Rochester, Rochester, N.Y., 1980.
|
| |
14
|
GAVlSH, B. Topological design of centralized computer networks--formulations and algorithms. Networks 12 (1982), 355-377.
|
| |
15
|
GAVISH, B., AND SEGEV, A. Set query optimization in horizontally partitioned distributed database systems. Working Paper QM 8304, Graduate School of Management, Univ. of Rochester, Rochester, N.Y., 1983.
|
| |
16
|
GEOFFRION, A.M. Lagrangian relaxation and its uses in integer programming. Math. Program. Stud. 2 (1974), 82-I 14.
|
| |
17
|
GEOFFRION, A. M., AND GRAVES, G.W. Multicommodity distribution system design by Benders decomposition. Manage. Sci. 20 (1974}, 822-844.
|
| |
18
|
GEOFFRION, A. M., AND MCBRIDGE, R. Lagrangian relaxation applied to capacitated facility location problems. AIIE Trans. 10 (1978), 40-47.
|
| |
19
|
GOODMAN, N., ET AL. Query processing in SDD-1: A system for distributed databases. Computer Corporation of America, Tech. Rep. CCA-79-06, 1979.
|
| |
20
|
HELD, S., AND KARP, R. M. The traveling salesman problem and minimum spanning trees. Oper. Res. 18 (I970), 1138-1162.
|
| |
21
|
HELD, G. D., STONEBRAKER, M., AND WONG, E. INGRES--a relational DBMS. In Proceedings of AFIPS I975 National Computer Con{erence (Arlington, Va., 1975), 409-416.
|
| |
22
|
HEVNER, A. R., AND YAO, S.B. Query processing in distributed database systems. IEEE Trans. So{tw. Eng. SE-5, 3 (May 1979), 177-187.
|
| |
23
|
HOROWITZ, E., AND SAHNI, S. Fundamentals of Computer Algorithms. Computer Science Press, Rockville, Md., 1978.
|
| |
24
|
MULGREW, T. Standard Oil of California. Seminar at the University of California, Berkeley, 1984.
|
| |
25
|
PAIK, I.-S., AND DELO~EL, C. A strategy for optimizing the distributed query processing. In Proceedings o{ the 1st International Conference on Distributed Computing Systems (Huntsville, Ala., Oct. 1979).
|
| |
26
|
PEEBLES, R. W., AND MANNINa, E.D. A computer architecture for large {distributed) databases. In Proceedings o{ the Conference on Very Large Databases (Framingham, Mass., Sept. 1975}.
|
| |
27
|
PELAGATTI, G., AND SCHREIBER, F.A. A model of an access strategy in a distributed database system. In Proceedings of the }FIP-TC2, Database Architecture (Venice, 1979).
|
| |
28
|
ROTHNIE, J. B., AND GOODMAN, N., ET AL. Query processing in SDD-I: a system for distributed databases. Computer Corporation of America, Tech. Rep. 79-06, Oct. 1979.
|
| |
29
|
SANDI, C. Subgradient optimization. In Combinatorial Optimization, N. Christofides, A. Mingizzi, P. Toth, and C. Sandi, Eds., Wiley, New York, 1979.
|
| |
30
|
SEOEV, A. Optimizing 2-way joins in fragmental database systems. Working Paper MS-2, School of Business Administration, Univ. of California, Berkeley, 1983.
|
| |
31
|
SEGEV, A. Algorithms for distributed data processing. Working Paper, School of Business Administration, Univ. of California, Berkeley, 1985.
|
| |
32
|
SELINGER, P. G., AND ADIBA, M.E. Access path selection in distributed database management systems. In Proceedings of the International Conference on Databases (July 1980}, 204-215.
|
| |
33
|
STONEBRAKER, M., AND NEUHOLD, E. A distributed database version of INGRES. In Proceedings of the 3rd Berkeley Workshop on Distributed Data Management and Computer Networks (1977}.
|
| |
34
|
WILLIAMS, R., ET AL. R*: An overview of the architecture. IBM Res. Rep. RJ3325, Dec. 1981.
|
 |
35
|
|
| |
36
|
WONO, E. Retrieving dispersed data from SDD-I: A system for distributed databases. In Proceedings of the 3rd Berkeley Workshop on Distributed Data Management and Computer Network~ (May 1977), 217-235.
|
 |
37
|
|
|