| Compiling for vector-thread architectures |
Source
|
Code Generation and Optimization
Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization
Boston, MA, USA
SESSION: Compiling for multicore and multithreading
Pages 205-215
Year of Publication: 2008
ISBN:978-1-59593-978-4
|
|
Authors
|
|
Mark Hampton
|
MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
|
|
Krste Asanovic
|
University of California at Berkeley, Berkeley, CA, USA
|
|
ABSTRACT
Vector-thread (VT) architectures exploit multiple forms of parallelism simultaneously. This paper describes a compiler for the Scale VT architecture, which takes advantage of the VT features. We focus on compiling loops, and show how the compiler can transform code that poses difficulties for traditional vector or VLIW processors, such as loops with internal control flow or cross-iteration dependences, while still taking advantage of features not supported by multithreaded designs, such as vector memory instructions. We evaluate the compiler using several embedded benchmarks and show that we can obtain substantial speedups over a single-issue, in-order scalar machine.
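As a rough illustration of the two loop shapes the abstract names, the sketch below shows a loop with internal control flow and a loop with a cross-iteration dependence. It is not taken from the paper or from the Scale compiler; the function names `saturate` and `running_filter` and their inputs are hypothetical, chosen only to make the patterns concrete.

```python
def saturate(values, limit):
    """Loop with internal control flow: the branch taken depends on each element."""
    out = []
    for v in values:
        if v > limit:  # data-dependent branch inside the loop body
            out.append(limit)
        else:
            out.append(v)
    return out


def running_filter(values, alpha):
    """Loop with a cross-iteration dependence: each step needs the previous result."""
    out = []
    prev = 0
    for v in values:
        prev = alpha * prev + v  # iteration i reads the value produced by iteration i-1
        out.append(prev)
    return out


print(saturate([3, 9, 5, 12], 8))    # [3, 8, 5, 8]
print(running_filter([1, 2, 3], 2))  # [1, 4, 11]
```

In the first loop the iterations are independent, but each one may follow a different path through the body. In the second, the iterations cannot simply run side by side, because each one consumes a value computed by its predecessor.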