|
ABSTRACT
The message passing programs are executed with the Parallel Virtual Machine (PVM) library and the shared memory programs are executed using TreadMarks. The programs are Water and Barnes-Hut from the SPLASH benchmark suite; 3-D FFT, Integer Sort (IS) and Embarrassingly Parallel (EP) from the NAS benchmarks; ILINK, a widely used genetic linkage analysis program; and Successive Over-Relaxation (SOR), Traveling Salesman (TSP), and Quicksort (QSORT). Two different input data sets were used for Water (Water-288 and Water-1728), IS (IS-Small and IS-Large), and SOR (SOR-Zero and SOR-NonZero). Our execution environment is a set of eight HP735 workstations connected by a 100Mbits per second FDDI network. For Water-1728, EP, ILINK, SOR-Zero, and SOR-NonZero, the performance of TreadMarks is within 10%of PVM. For IS-Small, Water-288, Barnes-Hut, 3-D FFT, TSP, and QSORT, differences are on the order of 10%to 30%. Finally, for IS-Large, PVM performs two times better than TreadMarks. More messages and more data are sent in TreadMarks, explaining the performance differences. This extra communication is caused by 1) the separation of synchronization and data transfer, 2) extra messages to request updates for data by the invalidate protocol used in TreadMarks, 3) false sharing, and 4) diff accumulation for migratory data in TreadMarks.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
D. Bailey, J. Barton, T. Lasinski, and H. Simon. The NAS parallel benchmarks. Technical Report 103863, NASA, July 1993.
|
| |
3
|
B.N. Bershad and M.J. Zekauskas. Midway: Shared memory parallel programming with entry consistency for distributed memory multiprocessors. Technical Report CMU-CS-91-170, Carnegie-Mellon University, September 1991.
|
 |
4
|
|
 |
5
|
John B. Carter , John K. Bennett , Willy Zwaenepoel, Implementation and performance of Munin, Proceedings of the thirteenth ACM symposium on Operating systems principles, p.152-164, October 13-16, 1991, Pacific Grove, California, United States
|
 |
6
|
|
| |
7
|
R. W. Cottingham Jr., R. M. Idury, and A. A. Sch~ffer. Faster sequential genetic linkage computations. American Journal of Human Genetics, 53:252-263, 1993.
|
| |
8
|
S. Dwarkadas, A.A. Sch~ffer, R.W. Cottingham Jr., A.L. Cox, P. Keleher, and W. Zwaenepoel. Parallelization of general linkage analysis problems. Human Heredity, 44:127-141, 1994.
|
| |
9
|
|
 |
10
|
Kourosh Gharachorloo , Daniel Lenoski , James Laudon , Phillip Gibbons , Anoop Gupta , John Hennessy, Memory consistency and event ordering in scalable shared-memory multiprocessors, Proceedings of the 17th annual international symposium on Computer Architecture, p.15-26, May 28-31, 1990, Seattle, Washington, United States
|
| |
11
|
R.J. Harrison. Portable tools and applications for parallel computers. In International Journal of Quantum Chemistry, volume 40, pages 847-863, February 1990.
|
| |
12
|
J. T. Hecht, Y. Wang, B. Connor, S. H. Blanton, and S. P. Daiger. Non-syndromic cleft lip and palate: No evidence of linkage to hla or factor 13a. American Journal of Human Genetics, 52:1230-1233, 1993.
|
 |
13
|
|
| |
14
|
P. Keleher, S. Dwarkadas, A. Cox, and W. Zwaenepoel. Treadmarks: Distributed shared memory on standard workstations and operating systems. In Proceedings of the 1994 Winter Usenix Conference, pages 115-131, January 1994.
|
| |
15
|
G. M. Lathrop, J. M. Lalouel, C. Julier, and J. Ott. Strategies for multilocus linkage analysis in humans. Proceedings of National Academy of Science, USA, 81:3443-3446, June 1984.
|
 |
16
|
|
| |
17
|
Message Passing Interface Forum. MPI: A message-passing interface standard, version 1.0, May 1994.
|
| |
18
|
Parasoft Corporation, Pasadena, CA. Express user's guide, version 3.2.5, 1992.
|
 |
19
|
|
 |
20
|
|
CITED BY 16
|
|
|
|
|
Mark W. MacBeth , Keith A. McGuigan , Philip J. Hatcher, Executing Java threads in parallel in a distributed-memory environment, Proceedings of the 1998 conference of the Centre for Advanced Studies on Collaborative research, p.16, November 30-December 03, 1998, Toronto, Ontario, Canada
|
|
|
|
|
|
|
|
|
Henri E. Bal , Raoul Bhoedjang , Rutger Hofman , Ceriel Jacobs , Koen Langendoen , Tim Rühl , M. Frans Kaashoek, Performance evaluation of the Orca shared-object system, ACM Transactions on Computer Systems (TOCS), v.16 n.1, p.1-40, Feb. 1998
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|