Scalable and reliable communication for hardware transactional memory
Source: PACT, Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Toronto, Ontario, Canada
SESSION: Multicore memory hierarchy design (part 1)
Pages 144-154
Year of Publication: 2008
ISBN: 978-1-60558-282-5
Authors
Seth H. Pugsley, University of Utah, Salt Lake City, UT, USA
Manu Awasthi, University of Utah, Salt Lake City, UT, USA
Niti Madan, University of Utah, Salt Lake City, UT, USA
Naveen Muralimanohar, University of Utah, Salt Lake City, UT, USA
Rajeev Balasubramonian, University of Utah, Salt Lake City, UT, USA
ABSTRACT
In a hardware transactional memory system with lazy versioning and lazy conflict detection, the process of transaction commit can emerge as a bottleneck. This is especially true for a large-scale distributed memory system where multiple transactions may attempt to commit simultaneously and coordination is required before allowing commits to proceed in parallel. In this paper, we propose novel algorithms to implement commit that are more scalable in terms of delay and are free of deadlocks/livelocks. We show that these algorithms have similarities with the token cache coherence concept and leverage these similarities to extend the algorithms to handle message loss and starvation scenarios. The proposed algorithms improve upon the state-of-the-art by yielding up to a 7X reduction in commit delay and up to a 48X reduction in network messages for commit. These translate into overall performance improvements of up to 66% (for synthetic workloads with average transaction length of 200 cycles), 35% (for average transaction length of 1000 cycles), and 8% (for average transaction length of 4000 cycles). For a small group of multi-threaded programs with frequent transaction commits, improvements of up to 8% were observed for a 32-node simulation.
INDEX TERMS
Primary Classification: C. Computer Systems Organization; C.1 PROCESSOR ARCHITECTURES; C.1.4 Parallel Architectures
Subjects: Distributed architectures
Additional Classification: B. Hardware; B.8 Performance and Reliability; B.8.0 General
General Terms: Algorithms, Design, Experimentation, Measurement, Performance, Reliability
Keywords: algorithms for transaction commit, handling message loss, hardware transactional memory, on-chip network messages, reliability, token coherence