|
ABSTRACT
The queue-read, queue-write (QRQW) PRAM model [GMR94] permits concurrent reading and writing, but at a cost proportional to the number of readers/writers to a memory location in a given step. The QRQW model reflects the contention properties of most parallel machines more accurately than either the well-studied CRCW or EREW models: the CRCW model does not adequately penalize algorithms with high contention to shared memory locations, while the EREW model is too strict in its insistence on zero contention at each step. Of primary practical and theoretical interest, then, is the design of fast and efficient QRQW algorithms for problems for which all previous algorithms either suffer from high contention, fail to be fast, or fail to be work-optimal.
This paper describes low-contention, fast, work-optimal QRQW PRAM algorithms for the fundamental problems of finding a random permutation, parallel hashing, load balancing, and sorting. There is no known fast, work-optimal EREW algorithm known for finding a random permutation or for parallel hashing. For load balancing, we improve upon the EREW result whenever the ratio of the maximum to the average load is not too large. We show that the logarithmic dependence of the QRQW running time on this ratio is inherent by providing a matching lower bound.
We demonstrate the performance advantage of a QRQW random permutation algorithm, compared with the popular EREW algorithm, by implementing and running both algorithms on the MasPar MP-1.
Finally, we extend the work-time framework for the design of parallel algorithms to account for contention, and relate it to the QRQW PRAM model. We use our QRQW load balancing algorithm, as well as the QRQW linear compaction algorithm in [GMR94], to provide automatic tools for processor allocation—an issue that needs to be handled when translating an algorithm from its work-time presentation into the explicit PRAM description.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
AH92
|
|
| |
AKS83
|
|
 |
BCH+93
|
Guy E. Blelloch , Jonathan C. Hardwick , Siddhartha Chatterjee , Jay Sipelstein , Marco Zagha, Implementation of a portable nested data-parallel language, Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming, p.102-111, May 19-22, 1993, San Diego, California, United States
|
| |
BKK93
|
P. Beame, M. Kik, and M. Kutylowski. Information broadcasting by exclusive-write PRAMs. Parallel Processing Letters, 1993. To appear.
|
| |
Ble89
|
|
| |
BM76
|
|
| |
BMV92
|
O. Berkman, Y. Matias, and U. Vishkin. Randomized range-maxima in nearly-constant parallel time. computational complexity, 2:350-373, 1992.
|
 |
Bre74
|
|
| |
Chl89
|
|
| |
Col88
|
|
| |
CV89
|
|
| |
DM90
|
|
 |
FKS84
|
|
| |
Gil91
|
J. Gil. Fast load balancing on a PRAM. In Proc. of the 3rd 1EEE Symposium on Parallel and Distributed Computing, pages 10-17, December 1991.
|
| |
GM91
|
|
| |
GM94a
|
|
| |
GM94b
|
J. Gil a#ad Y. Matias. Simple fast parallel hashing by oblivious execution. Technical report, AT&T Bell Laboratories, Murray Hill, N J, 1994. Submitted for publication.
|
| |
GMR94
|
Phillip B. Gibbons , Yossi Matias , Vijaya Ramachandran, The QRQW PRAM: accounting for contention in parallel algorithms, Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms, p.638-648, January 23-25, 1994, Arlington, Virginia, United States
|
| |
GMV91
|
|
| |
Hag89
|
|
| |
Hag91
|
|
| |
JáJ92
|
|
| |
KR90
|
|
| |
KRS90
|
|
| |
MA80
|
H. Meijer and S.G. Akl. The design and analysis of a new hybrid sorting algorithm. Information Processing Letters, 10:213-218, 1980.
|
| |
Mas91
|
MasPar Computer Corporation, 749 North Mary Avenue, Sunnyvale, CA 94086. MasPar System Overview, document 9300.0100, revision A 3, March 1991.
|
| |
Mat92
|
Y. Matias. Highly Parallel Randomized Aigorith. mica. PhD thesis, Tel Aviv University, israel, December 1992.
|
| |
MR89
|
G.L. Miller and J. H. ReiL Parallel tree contraction part 1: Fundamentals. In S. Mica}i, editor, Randomness and Computation, volume 5, pages 47-72. JAI Press, 1989.
|
| |
MS91
|
|
 |
MV91
|
|
| |
MV92
|
Y. Matias and U. Vishkin. A note on simulations and integer sorting. Manuscript, 1992.
|
| |
Rei85
|
R. Reischuk. Probabilistic parallel algorithms for sorting and selection. SIAM Journal on Computing, 14(2):396-409, May 1985.
|
| |
Rei93
|
|
| |
RR89
|
|
| |
Sie89
|
A. Siegel. On universal classes of fast high performance hash functions, their time-space tradeoff, and their applications. In Proc. o/the 30th IEEE Syrup. on Foundations of Computer Science, pages 20-25, 1989.
|
 |
Val90
|
|
| |
Yao77
|
A.C.C. Yao. Probabilistic computations: Towards a unified measure of complexity. In Proc. of the 18th IEEE Syrup. on Foundations of Computer Science, pages 222-227, 1977.
|
CITED BY 5
|
|
Phillip B. Gibbons , Yossi Matias , Vijaya Ramachandran, Asynchrony versus bulk-synchrony in QRQW PRAM models, Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing, p.176, May 23-26, 1996, Philadelphia, Pennsylvania, United States
|
|
|
Guy E. Blelloch , Phillip B. Gibbons , Yossi Matias , Marco Zagha, Accounting for memory bank contention and delay in high-bandwidth multiprocessors, Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures, p.84-94, June 24-26, 1995, Santa Barbara, California, United States
|
|
|
|
|
|
Leslie Ann Goldberg , Yossi Matias , Satish Rao, An optical simulation of shared memory, Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures, p.257-267, June 27-29, 1994, Cape May, New Jersey, United States
|
|
|
|
|