| Experiences with parallel N-body simulation |
| Full text |
Pdf
(1.02 MB)
|
| Source
|
ACM Symposium on Parallel Algorithms and Architectures
archive
Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
table of contents
Cape May, New Jersey, United States
Pages: 122 - 131
Year of Publication: 1994
ISBN:0-89791-671-9
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 6, Downloads (12 Months): 38, Citation Count: 10
|
|
|
ABSTRACT
This paper describes our experiences developing high-performance code for astrophysical N-body simulations. Recent N-body methods are based on an adaptive tree structure. The tree must be built and maintained across physically distributed memory; moreover, the communication requirements are irregular and adaptive. Together with the need to balance the computational work-load among processors, these issues pose interesting challenges and tradeoffs for high-performance implementation.
Our implementation was guided by the need to keep solutions simple and general. We use a technique for implicitly representing a dynamic global tree across multiple processors which substantially reduces the programming complexity as well as the performance overheads of distributed memory architectures. The contributions include methods to vectorize the computation and minimize communication time which are theoretically and experimentally justified.
The code has been tested by varying the number and distribution of bodies on different configurations of the Connection Machine CM-5. The overall performance on instances with 10 million bodies is typically over 30% of the peak machine rate. Preliminary timings compare favorably with other approaches.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Cmmd reference manual. Technical report, Thinking Machine Corporation, 1993.
|
| |
2
|
S.J. Aarseth, M. Henon, and R. Wielen. Astronomy and Astrophysics, 37, 1974.
|
| |
3
|
R. Anderson. personal communication. 1993.
|
| |
4
|
A.W. Appel. An efficient program for many-body simulation. SIAM J. Sci. Star. Comput., 6, 1985.
|
| |
5
|
|
| |
6
|
J. Barnes and P. Hut. A hierarchical O(Nlog N) force-calculation algorithm. Nature, 324:446-449, 1986.
|
| |
7
|
S. Bhatt, M. Chen, C-Y Lin, and P. Liu. Abstractions for parallel N-body simulations. In Proceedings o# SHPCC 9P.
|
| |
8
|
|
| |
9
|
L. Johnsson and Y. Hu. personal communication. 1993.
|
| |
10
|
J. F. Leathrum Jr. and J. Board Jr. The parallel fast multipole algorithm in three dimensions.
|
 |
11
|
|
| |
12
|
L. Nyland, J. Prins, and J. Reif. A data-parallel implementation of the adaptive fast multipole algorithm. In DAGS/PC Symposium, 1993.
|
 |
13
|
|
| |
14
|
|
| |
15
|
J. Singh, C. Holt, T. Totsuka, A. Gupta, and J. Hennessy. Load balancing and data locality in hierarchical n-body methods. Technical Report CSL- TR-92-505, Stanford University, 1992.
|
| |
16
|
|
| |
17
|
|
| |
18
|
M. Warren and J. Salmon. A parallel hashed octtree n-body algorithm, 1993.
|
| |
19
|
F. Zhao and S.L. Johnsson. The parallel multipole method on the connection machine. Technical Report DCS/TR-749, Yale University, 1989.
|
CITED BY 10
|
|
Hongzhang Shan , Jaswinder P. Singh , Leonid Oliker , Rupak Biswas, A comparison of three programming models for adaptive applications on the Origin2000, Proceedings of the 2000 ACM/IEEE conference on Supercomputing (CDROM), p.11-es, November 04-10, 2000, Dallas, Texas, United States
|
|
|
|
|
|
Mark Goudreau , Kevin Lang , Satish Rao , Torsten Suel , Thanasis Tsantilas, Towards efficiency and portability: programming with the BSP model, Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures, p.1-12, June 24-26, 1996, Padua, Italy
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|