Analyzing memory access intensity in parallel programs on multicore
Source
International Conference on Supercomputing
Proceedings of the 22nd annual international conference on Supercomputing
Island of Kos, Greece
SESSION: Performance evaluation 2
Pages 359-367
Year of Publication: 2008
ISBN: 978-1-60558-158-3
ABSTRACT
As the shared memory bus becomes a major performance bottleneck for many numerical applications on multicore chips, understanding how the increased on-chip parallelism strains memory bandwidth, and hence affects the efficiency of parallel codes, becomes a critical issue. This paper introduces the notion of memory access intensity to facilitate quantitative analysis of a program's memory behavior on multicores that employ state-of-the-art prefetching hardware. Three numerical solvers for large-scale sparse linear systems are used to demonstrate the estimation of memory access intensity and its effect on program performance.