Optimizing scientific application loops on stream processors
Source
Language, Compiler and Tool Support for Embedded Systems
Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
Tucson, AZ, USA
SESSION: Register allocation
Pages 161-170
Year of Publication: 2008
ISBN: 978-1-60558-104-0
Authors
Li Wang, NUDT, ChangSha, China
Xuejun Yang, NUDT, ChangSha, China
Jingling Xue, UNSW, Sydney, Australia
Yu Deng, NUDT, ChangSha, China
Xiaobo Yan, NUDT, ChangSha, China
Tao Tang, NUDT, ChangSha, China
Quan Hoang Nguyen, UNSW, Sydney, Australia
Bibliometrics
Downloads (6 Weeks): 6, Downloads (12 Months): 98, Citation Count: 1
ABSTRACT
This paper describes a graph coloring compiler framework that allocates on-chip SRF (Stream Register File) storage to optimize scientific applications on stream processors. The framework first applies enabling optimizations such as loop unrolling to expose stream reuse and opportunities for maximizing parallelism, i.e., overlapping kernel execution and memory transfers. The three SRF management tasks are then solved in a unified manner via graph coloring: (1) placing streams in the SRF, (2) exploiting stream reuse, and (3) maximizing parallelism. We evaluate the performance of our compiler framework by running nine representative scientific computing kernels on our FT64 stream processor. Our preliminary results show that compiler management achieves an average speedup of 2.3x compared to First-Fit allocation. Compared with the results obtained from running these benchmarks on Itanium 2, an average speedup of 2.1x is observed.
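As a rough illustration of the graph coloring idea mentioned in the abstract, the sketch below assigns streams to SRF buffers so that streams live at the same time never share a buffer. It is not the paper's algorithm: the equal-sized buffers, the live-range representation, the greedy degree-ordered heuristic, and the names `build_interference` and `color_streams` are all assumptions made for this example.

```python
from collections import defaultdict


def build_interference(live_ranges):
    """Build an interference graph from stream live ranges.

    live_ranges maps a stream name to an inclusive (start, end) pair of
    program steps. Two streams interfere when their ranges overlap.
    (Hypothetical representation, chosen only for this sketch.)
    """
    graph = defaultdict(set)
    names = list(live_ranges)
    for i, a in enumerate(names):
        graph[a]  # make sure isolated streams still appear as nodes
        for b in names[i + 1:]:
            (s1, e1), (s2, e2) = live_ranges[a], live_ranges[b]
            if s1 <= e2 and s2 <= e1:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def color_streams(graph, num_buffers):
    """Greedy coloring: each color stands for one equal-sized SRF buffer.

    Streams are visited in order of decreasing degree. A stream that
    finds no free buffer is recorded as spilled (left in memory).
    """
    order = sorted(graph, key=lambda n: (-len(graph[n]), n))
    assignment, spilled = {}, []
    for node in order:
        used = {assignment[n] for n in graph[node] if n in assignment}
        free = [b for b in range(num_buffers) if b not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)
    return assignment, spilled


if __name__ == "__main__":
    ranges = {"a": (0, 2), "b": (1, 3), "c": (3, 5), "d": (4, 6)}
    graph = build_interference(ranges)
    print(color_streams(graph, num_buffers=2))
```

In this toy input, four streams fit into two buffers because non-overlapping streams can share a buffer, which is the kind of reuse a coloring-based allocator is meant to find.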
REFERENCES
[1] Sitij Agrawal, William Thies, Saman Amarasinghe, Optimizing stream programs using linear state space analysis, Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems, September 24-27, 2005, San Francisco, California, USA [doi> 10.1145/1086297.1086315]
[6] William J. Dally, Francois Labonte, Abhishek Das, Patrick Hanrahan, Jung-Ho Ahn, Jayanth Gummaraju, Mattan Erez, Nuwan Jayasena, Ian Buck, Timothy J. Knight, Ujval J. Kapasi, Merrimac: Supercomputing with Streams, Proceedings of the 2003 ACM/IEEE conference on Supercomputing, p.35, November 15-21, 2003
[11] Michael I. Gordon, William Thies, Saman Amarasinghe, Exploiting coarse-grained task, data, and pipeline parallelism in stream programs, Proceedings of the 12th international conference on Architectural support for programming languages and operating systems, October 21-25, 2006, San Jose, California, USA
[14] Timothy J. Knight, Ji Young Park, Manman Ren, Mike Houston, Mattan Erez, Kayvon Fatahalian, Alex Aiken, William J. Dally, Pat Hanrahan, Compilation for explicitly managed memory hierarchies, Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming, March 14-17, 2007, San Jose, California, USA [doi> 10.1145/1229428.1229477]
[15] Francois Labonte, Peter Mattson, William Thies, Ian Buck, Christos Kozyrakis, Mark Horowitz, The Stream Virtual Machine, Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques, p.267-277, September 29-October 03, 2004 [doi> 10.1109/PACT.2004.29]
[16] V. Lefebvre and P. Feautrier. Storage management in parallel programs. Technical report, Laboratory PRiSM, University of Versailles, France, 1996.
[20] Peter Raymond Mattson. A programming system for the imagine media processor. PhD thesis, Stanford University, Stanford, CA, USA, 2002. Adviser: William J. Dally.
[21] John D. Owens. Computer Graphics on a Stream Architecture. PhD thesis, Stanford University, November 2002.
[25] Michael Bedford Taylor, Jason Kim, Jason Miller, David Wentzlaff, Fae Ghodrat, Ben Greenwald, Henry Hoffman, Paul Johnson, Jae-Wook Lee, Walter Lee, Albert Ma, Arvind Saraf, Mark Seneski, Nathan Shnidman, Volker Strumpen, Matt Frank, Saman Amarasinghe, Anant Agarwal, The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs, IEEE Micro, v.22 n.2, p.25-35, March 2002 [doi> 10.1109/MM.2002.997877]
[26] W. Thies, M. Karczmarek, M. Gordon, D. Maze, J. Wong, H. Ho, M. Brown, and S. Amarasinghe. StreamIt: A compiler for streaming applications, December 2001. MIT-LCS Technical Memo TM-622, Cambridge, MA.
[28] Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry Husbands, Katherine Yelick, The potential of the cell processor for scientific computing, Proceedings of the 3rd conference on Computing frontiers, May 03-05, 2006, Ischia, Italy [doi> 10.1145/1128022.1128027]
[29] Nan Wu, Mei Wen, Ju Ren, Yi He, and Chunyuan Zhang. Register allocation on stream processor with local register file. In ACSAC '06: Proceedings of the 11th Asia-Pacific Computer Systems Architecture Conference, pages 545-551, 2006.
[31] Xuejun Yang, Xiaobo Yan, Zuocheng Xing, Yu Deng, Jiang Jiang, Ying Zhang, A 64-bit stream processor architecture for scientific applications, Proceedings of the 34th annual international symposium on Computer architecture, June 09-13, 2007, San Diego, California, USA
CITED BY
Xuejun Yang, Li Wang, Jingling Xue, Yu Deng, Ying Zhang, Comparability graph coloring for optimizing utilization of stream register files in stream processors, Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, February 14-18, 2009, Raleigh, NC, USA