ACM Home Page
Please provide us with feedback. Feedback
Requirements for optimal execution of oops with tests
Full text PdfPdf (782 KB)
Source International Conference on Supercomputing archive
Proceedings of the 2nd international conference on Supercomputing table of contents
St. Malo, France
Pages: 230 - 237  
Year of Publication: 1988
ISBN:0-89791-272-1
Author
A. Uht  Univ. of California, La Jolla, CA
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 13,   Citation Count: 3
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/55364.55387
What is a DOI?

ABSTRACT

Both the efficient execution of branch intensive code and knowing the bounds on same are important issues in computing in general and supercomputing in particular. In prior work, it has been suggested, implied, or left as a possible maximum, that the hardware needed to execute code with branches optimally, i.e., oracular performance, is exponentially dependent on the total number of dynamic branches to be executed, this number of branches being proportional at least to the number of iterations of the loop. For classes of code taking at least one cycle per iteration to execute, this is not the case. For loops containing one test (normally in the form of a Boolean recurrence of order 1), it is shown that the hardware necessary varies from exponential to polynomial in the length of the dependency cycle L, while execution time varies from one time cycle per iteration to less than L time cycles per iteration; the variation depends on specific code dependencies.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
Baneajee, U. and Gajski, D. Fast Execution of Loops With IF Statements. IEEE Transactions on Computers C-33(11):1030-1033, November, 1984.
 
3
Cytron, R. G. Doacross: Beyond Vectorization for Multiproc~ssors (Extended Abstract). In Proceedings of the 1986 International Conference on Parallel Processing, pages 836-844. Pennsylvania State University and the IEEE Computer Society, August, 1986.
4
5
 
6
 
7
Riseman, E. M. and Foster, C. C. The Inhibition of Potential Parallelism by Conditional Jumps. IEEE Transactions on Computers :1405-1411, December, 1972.
8
 
9
Tomasulo, R. M. An Efficient Algorithm for Expoiting Multiple Arithmetic Units. IBM Journal :25-33, january, 1967.
 
10
 
11
Uht, A. K. and Wedig, R. G. Hardware Extraction of Low-level Concurrency from Serial Instruction Streams. In Proceedings of the International Conference on Parallel Processing, pages 729-736. IEEE Computer Society and the Association for Computing Machinery, August, 1986.
 
12
13