Automating custom-precision function evaluation for embedded processors
Source: International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Proceedings of the 2005 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
San Francisco, California, USA
SESSION: Hardware specialization
Pages: 22-31
Year of Publication: 2005
ISBN: 1-59593-149-X
Authors
Ray C. C. Cheung  Imperial College London, London, United Kingdom
Dong-U Lee  University of California, Los Angeles
Oskar Mencer  Imperial College London, London, United Kingdom
Wayne Luk  Imperial College, London, UK
Peter Y. K. Cheung  Imperial College, London, UK
Sponsors
ACM: Association for Computing Machinery
SIGBED: ACM Special Interest Group on Embedded Systems
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1086297.1086302

ABSTRACT

Due to resource and power constraints, embedded processors often cannot afford dedicated floating-point units. For instance, the IBM PowerPC processor embedded in Xilinx Virtex-II Pro FPGAs supports only emulated floating-point arithmetic, which is slow when floating-point operations are required. This paper presents a customizable mathematical library that uses fixed-point arithmetic for elementary function evaluation. We approximate functions with polynomial or rational approximations, depending on the user-defined accuracy requirements. The data representation of the inputs and outputs is compatible with the IEEE single-precision and double-precision floating-point formats. Results show that our 32-bit polynomial method achieves a speedup of more than 80 times over the single-precision mathematical library from Xilinx, while our 64-bit polynomial method achieves a speedup of more than 30 times.
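
To make the idea of fixed-point polynomial evaluation concrete, the following is a minimal sketch in C and not the library described in the paper. Everything in it is an assumption made for illustration: the Q2.30 format, the helper names (`q30_t`, `q30_mul`, `exp_poly_q30`), the choice of exp(x) on the interval [0, 0.5], and the use of degree-5 Taylor coefficients. The abstract does not state which formats, coefficients, or range-reduction steps the authors use, and a production library would more likely use minimax coefficients.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical Q2.30 format: 32-bit signed, 30 fractional bits, range [-2, 2). */
typedef int32_t q30_t;
#define Q30_FRAC_BITS 30
#define Q30_ONE ((q30_t)1 << Q30_FRAC_BITS)

/* Convert a double to Q2.30 with round-to-nearest (for setup and testing only). */
static q30_t q30_from_double(double v)
{
    return (q30_t)(v * (double)Q30_ONE + (v >= 0.0 ? 0.5 : -0.5));
}

/* Convert Q2.30 back to double (for inspection only). */
static double q30_to_double(q30_t v)
{
    return (double)v / (double)Q30_ONE;
}

/* Fixed-point multiply: 64-bit product, add half an LSB, shift back.
 * Assumes an arithmetic right shift for negative products; the inputs
 * used by exp_poly_q30 below are all non-negative. */
static q30_t q30_mul(q30_t a, q30_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;
    return (q30_t)((p + ((int64_t)1 << (Q30_FRAC_BITS - 1))) >> Q30_FRAC_BITS);
}

/* Taylor coefficients of exp(x): 1, 1, 1/2, 1/6, 1/24, 1/120, pre-scaled by 2^30.
 * Illustrative choice only; not taken from the paper. */
static const q30_t EXP_COEFF[6] = {
    1073741824, /* 1      */
    1073741824, /* 1      */
    536870912,  /* 1/2    */
    178956971,  /* 1/6    */
    44739243,   /* 1/24   */
    8947849     /* 1/120  */
};

/* Approximate exp(x) for x in [0, 0.5] using Horner's rule in integer arithmetic.
 * The result stays below 2.0, so it fits in Q2.30 without overflow. */
static q30_t exp_poly_q30(q30_t x)
{
    assert(x >= 0 && x <= Q30_ONE / 2);
    q30_t acc = EXP_COEFF[5];
    for (int i = 4; i >= 0; --i) {
        acc = q30_mul(acc, x) + EXP_COEFF[i];
    }
    return acc;
}
```

Under these assumptions, `q30_to_double(exp_poly_q30(q30_from_double(0.5)))` agrees with exp(0.5) to within about 1e-4; the remaining error comes mainly from truncating the Taylor series rather than from the fixed-point rounding. The evaluation uses only integer multiplies, adds, and shifts, which is the property that makes this style of computation attractive on a processor without a floating-point unit.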

