Traversal caches: a first step towards FPGA acceleration of pointer-based data structures
Source
International Conference on Hardware Software Codesign
Proceedings of the 6th IEEE/ACM/IFIP international conference on Hardware/Software codesign and system synthesis
Atlanta, GA, USA
SESSION: Performance enhancement-new techniques for FPGAs and partitioning
Pages 61-66
Year of Publication: 2008
ISBN: 978-1-60558-470-6
ABSTRACT
Field-programmable gate arrays (FPGAs) often achieve order-of-magnitude speedups over microprocessors, but have typically been unable to improve the performance of applications with irregular memory access patterns, such as traversals of pointer-based data structures. Because these data structures are so common, the applicability and widespread success of FPGAs have been limited. In this paper, we introduce the traversal cache framework, a first step towards improving the performance of FPGA applications that use pointer-based data structures. The traversal cache is a local FPGA memory that stores repeated traversals of pointer-based data structures, allowing these traversals to be streamed efficiently into the FPGA. Although the cache is generally limited to improving applications that exhibit repeated traversals, we show that many applications in fact have this characteristic. Furthermore, we show that few repetitions are needed to achieve performance improvements. We present experimental results showing that FPGA implementations using the traversal cache framework achieve speedups ranging from 7x to 29x compared to pointer-based software on a 3.2 GHz Xeon.
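
The idea of caching a repeated traversal can be illustrated in software. The following is a minimal sketch under stated assumptions, not the framework described in the paper: the names `Node`, `TraversalCache`, `build_list`, and `invalidate` are hypothetical, a Python list stands in for the local FPGA memory, and the two counters only model the difference between following pointers and reading a contiguous buffer.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Node:
    """One element of a singly linked list (the pointer-based structure)."""
    value: int
    next: Optional["Node"] = None


def build_list(values: Iterable[int]) -> Optional[Node]:
    """Build a linked list holding `values` in order and return its head."""
    head: Optional[Node] = None
    for v in reversed(list(values)):
        head = Node(v, head)
    return head


class TraversalCache:
    """Hypothetical software stand-in for a local memory of packed traversals."""

    def __init__(self) -> None:
        # Assumption: a traversal is identified by its starting node.
        self._buffers: Dict[int, List[int]] = {}
        self.pointer_chases = 0  # nodes reached by following `next` pointers
        self.streamed_reads = 0  # values served from a contiguous buffer

    def traverse(self, head: Optional[Node]) -> List[int]:
        key = id(head)
        if key not in self._buffers:
            # First traversal: follow the pointers once and pack the values
            # into a contiguous buffer.
            packed: List[int] = []
            node = head
            while node is not None:
                packed.append(node.value)
                self.pointer_chases += 1
                node = node.next
            self._buffers[key] = packed
        # Every traversal, including the first, is then read sequentially.
        buffer = self._buffers[key]
        self.streamed_reads += len(buffer)
        return buffer

    def invalidate(self, head: Optional[Node]) -> None:
        """Drop the packed copy; the caller must do this after modifying the list."""
        self._buffers.pop(id(head), None)
```

In this sketch, traversing the same list four times follows its pointers only once, and the remaining accesses are sequential reads of the packed copy. This is the access pattern the abstract describes as streamable. Keying on `id(head)` and relying on explicit invalidation are simplifications chosen for brevity.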