ABSTRACT
With the advent of multicore and many-core architectures, we face a problem that is new to parallel computing: the management of hierarchical parallel caches. A major limitation of all earlier models is their inability to model multicore processors in which caches are shared to varying degrees at different levels. We propose a unified memory hierarchy model that addresses this limitation and extends the MHG model developed for a single processor with multi-memory hierarchy. We demonstrate that our unified framework can be applied to a number of multicore architectures for a variety of applications. In particular, we derive lower bounds on memory traffic between different levels in the hierarchy for financial and scientific computations. We also give a multicore algorithm for a financial application whose memory traffic between different cache levels is optimal to within a constant factor. We implemented the algorithm on a multicore system with two Quad-Core Intel Xeon 5310 1.6GHz processors having a total of 8 cores. The algorithm outperforms compiler-optimized and auto-parallelized code by a factor of up to 7.3.
CITED BY
Thang N. Bui , ThanhVu Nguyen , Joseph R. Rizzo, Jr., Parallel shared memory strategies for ant-based optimization algorithms, Proceedings of the 11th Annual conference on Genetic and evolutionary computation, July 08-12, 2009, Montreal, Québec, Canada