A SIMD optimization framework for retargetable compilers
Source
ACM Transactions on Architecture and Code Optimization (TACO)
Volume 6, Issue 1 (March 2009)
Article No. 2
Year of Publication: 2009
ISSN: 1544-3566
Authors
Manuel Hohenauer, RWTH Aachen University, Germany
Felix Engel, RWTH Aachen University, Germany
Rainer Leupers, RWTH Aachen University, Germany
Gerd Ascheid, RWTH Aachen University, Germany
Heinrich Meyr, RWTH Aachen University, Germany
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1509864.1509866

ABSTRACT

Retargetable C compilers are widely used to obtain compiler support for new embedded processors quickly and to perform early processor architecture exploration. A partially inherent problem of the retargetable compilation approach, though, is limited code quality compared to hand-written compilers or assembly code, caused by the lack of dedicated optimization techniques. This problem can be circumvented by designing flexible, retargetable code optimization techniques that apply to a certain range of target architectures. This article focuses on target machines with SIMD instruction support, a common feature in embedded processors for multimedia applications. SIMD optimization is known to be difficult, however, because SIMD architectures are largely nonuniform, support only a limited set of data types, and impose several memory alignment constraints. Additionally, such techniques require complicated loop transformations, tailored to the SIMD architecture, to expose the necessary amount of parallelism in the code. Integrating the SIMD optimization and the required loop transformations in a single retargeting formalism is therefore an ambitious challenge. In this article, we present an efficient and quickly retargetable SIMD code optimization framework that is integrated into an industrial retargetable C compiler. Experimental results for different processors demonstrate that the proposed technique applies to real-life target machines and that it produces code quality improvements close to the theoretical limit.
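
To make the alignment constraints and loop transformations mentioned in the abstract concrete, the following is a minimal sketch in C of what a SIMD optimizer might do to a simple byte-wise addition loop. It is a generic illustration, not the framework described in the article. The function names (add_scalar, add4x8, add_packed) are hypothetical, and the "SIMD instruction" is emulated in portable C by packing four 8-bit lanes into one 32-bit word; a real target would use a dedicated sub-word add instruction instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference loop: one 8-bit addition per iteration (wraps modulo 256). */
void add_scalar(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(a[i] + b[i]);
}

/* Hypothetical stand-in for a 4x8-bit SIMD add: adds four byte lanes packed
 * in a 32-bit word without letting carries cross lane boundaries. */
static uint32_t add4x8(uint32_t x, uint32_t y)
{
    /* Add the low 7 bits of every lane; the sum cannot overflow a lane. */
    uint32_t low = (x & 0x7F7F7F7Fu) + (y & 0x7F7F7F7Fu);
    /* Fold in the top bit of every lane with XOR, which discards the carry. */
    return low ^ ((x ^ y) & 0x80808080u);
}

/* The same computation after loop peeling and strip mining by a factor of 4. */
void add_packed(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    size_t i = 0;

    /* Peeled prologue: scalar iterations until dst + i is 4-byte aligned. */
    while (i < n && ((uintptr_t)(dst + i) % 4u) != 0) {
        dst[i] = (uint8_t)(a[i] + b[i]);
        i++;
    }

    /* Strip-mined main loop: four elements per iteration. memcpy keeps the
     * word loads and stores well defined even if a or b is misaligned. */
    for (; i + 4 <= n; i += 4) {
        uint32_t x, y, r;
        memcpy(&x, a + i, sizeof x);
        memcpy(&y, b + i, sizeof y);
        r = add4x8(x, y);
        memcpy(dst + i, &r, sizeof r);
    }

    /* Epilogue: remaining elements that do not fill a whole word. */
    for (; i < n; i++)
        dst[i] = (uint8_t)(a[i] + b[i]);
}
```

Calling add_packed(dst, a, b, n) should produce the same bytes as add_scalar(dst, a, b, n) for any n. The sketch peels only on the destination alignment; a target that also requires aligned loads would need additional handling for the two source arrays, which is part of what makes retargetable SIMD optimization difficult.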


