| On the combination of hardware and software concurrency extraction methods |
| Source
|
International Symposium on Microarchitecture
Proceedings of the 20th annual workshop on Microprogramming
Colorado Springs, Colorado, United States
Pages: 133 - 141
Year of Publication: 1987
ISBN:0-89791-250-0
|
|
Authors
|
|
Augustus K. Uht
|
University of California, San Diego, Dept. of Computer Science and Engineering, C-014, La Jolla, California, and the Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign
|
|
Constantine D. Polychronopoulos
|
University of Illinois at Urbana-Champaign, Center for Supercomputing Research and Development, Urbana, Illinois
|
|
John F. Kolen
|
University of California, San Diego, Dept. of Computer Science and Engineering, C-014, La Jolla, California and Department of Computer and Information Science, Ohio State University, Columbus, Ohio
|
|
ABSTRACT
It has been shown that parallelism is a very promising alternative for enhancing computer performance. Parallelism, however, introduces much complexity to the programming effort. This has led to the development of automatic concurrency extraction techniques. Prior work has demonstrated that static program restructuring via compiler-based techniques provides a large degree of parallelism to the target machine. Purely hardware-based extraction techniques (without software preprocessing) have also demonstrated significant (but lesser) degrees of parallelism. This paper considers the performance effects of combining hardware and software techniques. The concurrency extracted from a given set of benchmarks by each technique separately, and by both together, is determined via simulations and/or analysis. The “common parallelism” extracted by the two methods is thus also considered, using new metrics. The analytic techniques for predicting the performance of specific programs are also described.
CITED BY 10
|
|
Pohua P. Chang, William Y. Chen, Scott A. Mahlke, Wen-mei W. Hwu, Comparing static and dynamic code scheduling for multiple-instruction-issue processors, Proceedings of the 24th annual international symposium on Microarchitecture, p.25-33, September 1991, Albuquerque, New Mexico, United States
J. H. Jacobs, A. K. Uht, R. C. Ord, Modeling the effects of instruction queue loading on a static instruction stream micro-architecture, Proceedings of the 21st annual workshop on Microprogramming and microarchitecture, p.11-20, November 28-December 02, 1988, San Diego, California, United States