Automatic fence insertion for shared memory multiprocessing
Source: International Conference on Supercomputing
Proceedings of the 17th Annual International Conference on Supercomputing
San Francisco, CA, USA
Session: Parallel architectures
Pages: 285-294
Year of Publication: 2003
ISBN: 1-58113-733-8
Authors
Xing Fang, Purdue University, West Lafayette, IN
Jaejin Lee, Seoul National University, Seoul, Korea
Samuel P. Midkiff, Purdue University, West Lafayette, IN
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/782814.782854
ABSTRACT

In general, the hardware memory consistency model of a multiprocessor system is not identical to the memory model at the programming language level. Consequently, the programming language memory model must be mapped onto the hardware memory model. The compiler can accomplish this mapping by inserting memory fence instructions where needed. We have developed and implemented several fence insertion and optimization algorithms in our Pensieve compiler project. We present the fence insertion optimization techniques used in this system to guarantee sequential consistency at the language level, and compare them using performance data. Our techniques target the two relaxed hardware memory consistency models provided by SMPs based on the IBM Power 3 and the Intel Pentium 4. Our fence insertion optimization shows up to 17.2% and 32.7% performance improvement on average on the IBM PowerPC and Intel Pentium 4 (Xeon) multiprocessors, respectively.
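
To make concrete what a fence buys at the language level, here is a minimal sketch of the classic store-buffering litmus test in C++. It is not taken from the paper or from the Pensieve compiler: the helper name `count_sc_violations` is hypothetical, and C++ atomics stand in for the compiler-inserted hardware fences the abstract describes. Each thread stores to one shared variable and then loads the other; on a relaxed memory model both loads could return 0 unless a full fence separates the store from the load.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical litmus-test helper (an illustration, not the paper's code).
// Runs the store-buffering pattern `iterations` times and returns how many
// runs ended with both threads reading 0, an outcome that sequential
// consistency forbids.
int count_sc_violations(int iterations) {
    int violations = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_relaxed);
            // Full fence between the store and the following load.
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_relaxed);
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_relaxed);
            // Matching fence in the other thread.
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_relaxed);
        });

        // join() synchronizes with thread completion, so r1 and r2 are
        // safe to read here.
        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) ++violations;
    }
    return violations;
}
```

With both fences in place, the C++ memory model guarantees that at least one thread observes the other's store, so `count_sc_violations(n)` should return 0 for any `n`. Removing either fence makes the forbidden outcome legal, which is the kind of reordering a fence insertion pass has to rule out, and the cost of each fence is why an optimization pass that removes unnecessary ones matters.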

