Impact of self-scheduling order on performance on multiprocessor systems
Source: International Conference on Supercomputing
Proceedings of the 2nd International Conference on Supercomputing
St. Malo, France
Pages: 593-603
Year of Publication: 1988
ISBN: 0-89791-272-1
Authors
P. Tang, Univ. of Illinois, Urbana, IL
P.-C. Yew, Univ. of Illinois, Urbana, IL
C.-Q. Zhu, Univ. of Illinois, Urbana, IL
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/55364.55422

ABSTRACT

Processor self-scheduling is an efficient dynamic scheduling scheme for multiprocessors. This paper discusses the impact of the self-scheduling order on the performance of multiply-nested parallel loops. It is shown that, because of the data synchronization required by cross-iteration data dependencies, the completion time of a multiply-nested loop is reduced when the nested parallel loops with smaller delays are moved to the inside. The best performance is achieved when a shortest-delay scheduling order is used. The performance of shortest-delay self-scheduling is compared with that of other self-scheduling orders and of a compile-time static scheduling order proposed elsewhere. The program transformation needed to implement shortest-delay self-scheduling is also described.
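
The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's model or algorithm; it is a toy model built on stated assumptions: every iteration body takes the same time, each loop k has a fixed delay that must elapse between the start of an iteration and the start of its successor along loop k, and a free processor always fetches the next iteration in the chosen nesting order and waits until its dependencies are satisfied. The names completion_time and shortest_delay_order are hypothetical.

```python
import heapq
from itertools import product


def completion_time(sizes, delays, order, body_time, num_procs):
    """Simulate self-scheduling of a nested loop under a toy delay model.

    sizes[k]  : iteration count of loop k
    delays[k] : assumed minimum gap between the start of an iteration and
                the start of its successor along loop k
    order     : loop indices listed from outermost to innermost
    """
    start = {}                      # start time of each iteration
    free_at = [0] * num_procs       # times at which processors become free
    heapq.heapify(free_at)
    finish = 0
    # Iterations are dispatched in lexicographic order of the chosen nesting.
    for point in product(*(range(sizes[k]) for k in order)):
        idx = [0] * len(sizes)
        for k, v in zip(order, point):
            idx[k] = v
        idx = tuple(idx)
        # The processor that becomes free first fetches this iteration.
        t = heapq.heappop(free_at)
        # It then waits for the cross-iteration dependency along each loop.
        for k, d in enumerate(delays):
            if idx[k] > 0:
                pred = idx[:k] + (idx[k] - 1,) + idx[k + 1:]
                t = max(t, start[pred] + d)
        start[idx] = t
        heapq.heappush(free_at, t + body_time)
        finish = max(finish, t + body_time)
    return finish


def shortest_delay_order(delays):
    """Place loops with larger delays outside and smaller delays inside."""
    return tuple(sorted(range(len(delays)), key=lambda k: -delays[k]))


if __name__ == "__main__":
    sizes, delays = (3, 3), (1, 3)
    best = shortest_delay_order(delays)          # (1, 0): delay-1 loop inside
    print(completion_time(sizes, delays, best, body_time=4, num_procs=3))
    print(completion_time(sizes, delays, (0, 1), body_time=4, num_procs=3))
```

In this toy configuration, placing the delay-1 loop innermost finishes at time 14, while placing the delay-3 loop innermost finishes at time 20. The gap arises because consecutive dispatched iterations are separated by the inner loop's delay, so a large inner delay leaves processors idle.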


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
E.L. Lusk and R. A. Overbeek. "Implementation of Monitors with Macros: A Programming Aid for the HEP and other Parallel Processors", Argonne National Laboratory, ANL-83-97, Argonne, Illinois, December 1983.
 
2
 
3
P. Tang and P. Yew. "Processor Self-Scheduling for Multiple-Nested Parallel Loops," Proceedings of the 1986 International Conference on Parallel Processing (August 19-22, 1986), pp. 528-535.
 
4
Z. Fang, P. Yew, P. Tang and C. Zhu. "Dynamic Processor Self-Scheduling for General Parallel Nested Loops," Proceedings of the 1987 International Conference on Parallel Processing (August, 1987), pp. 1-10.
 
5
6
 
7
M.D. Guzzi. "Multitasking Runtime Systems for the Cedar Multiprocessor", Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, Rpt. No. 604, Urbana, Illinois, July, 1986.
 
8
Alliant Computer Systems Corp. FX/Series Architecture Manual, 1986.
 
9
F. Darema-Rogers, D. A. George, A. Norton and G. F. Pfister. "VM/EPEX - A VM/SP Based Environment for Parallel Execution", IBM Research RC11381, Yorktown Heights, NY, 1985.
 
10
R.L. Graham. "Bounds of Certain Multiprocessing Timing Anomalies," SIAM Journal on Applied Mathematics (1969), Vol. 17, No. 2, pp. 416-429.
 
11
 
12
 
13
P. Tang, P. Yew and C. Zhu. "Algorithms for Generating Data-Level Synchronization Instructions", Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, Rpt. No. 733, Urbana, January, 1988.
 
14
P. Tang, P. Yew, Z. Fang and C. Zhu. "Deadlock Prevention in Processor Self-Scheduling for Parallel Nested Loops," Proceedings of the 1987 International Conference on Parallel Processing (August 1987), pp. 11-18.
 
15
 
16
 
17
 
18
R. Cytron. "Doacross: Beyond Vectorization for Multiprocessors," Proceedings of the 1986 International Conference on Parallel Processing (August, 1986), pp. 836-844.
 
19
M.J. Wolfe. "Optimizing Compiler for Supercomputers", Department of Computer Science, University of Illinois at Urbana-Champaign, Report No. UIUCDCS-R-82-1105, October, 1982.
20
 
21
M. Wolfe. "Advanced Loop Interchanging," Proceedings of the 1986 International Conference on Parallel Processing (August 1986), pp. 536-543.
 
22
R. Cytron. "Limited Processor Scheduling of Doacross Loops," Proceedings of the 1987 International Conference on Parallel Processing (August, 1987), pp. 226-234.
 
23
A. Shoshani and E. G. Coffman. "Detection and Prevention of Deadlocks," Proceedings of 4th Annual Princeton Conference on Information Sciences and Systems (March 1970), pp. 355-360.
24
 
25
A. Gottlieb, R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph and M. Snir. "The NYU Ultracomputer - Designing a MIMD Shared Memory Parallel Computer," IEEE Trans. on Computers (Feb. 1983), Vol. C-32, No. 2, pp. 175-189.
 
26
D.J. Kuck et al. "Parallel Computing Today and Cedar Approach," Science (Feb. 1986), pp. 967-974.
 
27
M. Wolfe. "Loop Skewing: the Wavefront Method Revisited", Kuck and Associates, Inc., Savoy, Illinois, 1987.