Way Stealing: cache-assisted automatic instruction set extensions
Source
Annual ACM IEEE Design Automation Conference
Proceedings of the 46th Annual Design Automation Conference
San Francisco, California
SESSION: High-performance platforms: advances in system-level exploration and optimization
Pages 31-36
Year of Publication: 2009
ISBN: 978-1-60558-497-3
Authors
Theo Kluter, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Philip Brisk, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Paolo Ienne, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Edoardo Charbon, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland and Delft University of Technology, Delft, The Netherlands
ABSTRACT
This paper introduces Way Stealing, a simple architectural modification to a cache-based processor that increases data bandwidth to and from application-specific Instruction Set Extensions (ISEs). Way Stealing provides more bandwidth to the ISE logic than the register file alone and does not require expensive coherence protocols, as it does not add memory elements to the processor. When enhanced with Way Stealing, ISE identification flows detect more opportunities for acceleration than prior methods; consequently, Way Stealing can accelerate applications by up to 3.7X while reducing memory sub-system energy consumption by up to 67%, despite data-cache-related restrictions.
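As a reading aid for the headline figures, the short sketch below converts the best-case numbers quoted in the abstract into remaining fractions of the baseline. It is plain arithmetic on the two stated upper bounds and does not model the architecture; the function names are hypothetical, and nothing here implies that both bounds are reached on the same application.

```python
# Illustrative arithmetic only: the inputs are the "up to" figures from the
# abstract; the function names are hypothetical helpers, not from the paper.

def relative_runtime(speedup: float) -> float:
    """Fraction of baseline execution time left after a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


def remaining_energy(reduction_percent: float) -> float:
    """Fraction of baseline energy left after a percentage reduction."""
    if not 0.0 <= reduction_percent <= 100.0:
        raise ValueError("reduction must be between 0 and 100 percent")
    return 1.0 - reduction_percent / 100.0


# Best cases reported in the abstract (upper bounds, not typical values).
best_runtime = relative_runtime(3.7)    # about 0.27 of baseline runtime
best_energy = remaining_energy(67.0)    # about 0.33 of baseline memory energy

print(f"runtime fraction at 3.7X speedup: {best_runtime:.2f}")
print(f"memory energy fraction at 67% reduction: {best_energy:.2f}")
```

In other words, the best reported case runs in roughly a quarter of the baseline time, and the best reported memory sub-system energy is roughly a third of the baseline.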