ABSTRACT
This paper explores a novel way to incorporate hardware-programmable resources into a processor microarchitecture to improve the performance of general-purpose applications. By coupling compile-time analysis routines with hardware synthesis tools, we automatically configure a given set of hardware-programmable functional units (PFUs) and thus augment the base instruction set architecture so that it better meets the instruction set needs of each application. We refer to this new class of general-purpose computers as PRogrammable Instruction Set Computers (PRISC). Although similar in concept, the PRISC approach differs from dynamically programmable microcode because in PRISC we define entirely new primitive datapath operations. In this paper, we concentrate on the microarchitectural design of the simplest form of PRISC: a RISC microprocessor with a single PFU that evaluates only combinational functions. We briefly discuss the operating system and the programming language compilation techniques that are needed to build PRISC successfully, and we present performance results from a proof-of-concept study. With the inclusion of a single 32-bit-wide PFU whose hardware cost is less than that of a 1-kilobyte SRAM, our study shows a 22% improvement in processor performance on the SPECint92 benchmarks.
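To make the idea of a single combinational PFU concrete, the following is a minimal software sketch, not the paper's implementation. It assumes a PFU can be modeled as a configurable, stateless function of two 32-bit operands that yields one 32-bit result. The names `PFU`, `configure`, `execute`, and `base_sequence`, and the particular bit-manipulation function, are hypothetical and chosen only for illustration.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit-wide operands and results


class PFU:
    """Toy model of one programmable functional unit (hypothetical).

    The unit holds no state between operations: once configured, its
    output depends only on its two inputs, i.e. it is combinational.
    """

    def __init__(self):
        self._func = None

    def configure(self, func):
        # Stands in for loading a synthesized logic configuration.
        self._func = func

    def execute(self, a, b):
        if self._func is None:
            raise RuntimeError("PFU has not been configured")
        return self._func(a & MASK32, b & MASK32) & MASK32


def base_sequence(a, b):
    """The same computation written as four base-ISA style operations."""
    t1 = (a << 3) & MASK32    # shift left
    t2 = t1 ^ b               # exclusive or
    t3 = t2 & 0x0F0F0F0F      # and with a constant mask
    t4 = t3 | (a >> 28)       # or with the top nibble of a
    return t4


# Collapse the four dependent operations into one PFU operation.
pfu = PFU()
pfu.configure(
    lambda a, b: ((((a << 3) & MASK32) ^ b) & 0x0F0F0F0F) | (a >> 28)
)

a, b = 0x12345678, 0x0F0F0F0F
assert pfu.execute(a, b) == base_sequence(a, b)
```

In this sketch the benefit would come from replacing a chain of dependent base instructions with a single operation. Whether a given function fits in a real PFU depends on the hardware and synthesis constraints, which this model does not capture.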
INDEX TERMS
Primary Classification: C. Computer Systems Organization; C.0 GENERAL
Subjects: Instruction set design (e.g., RISC, CISC, VLIW)
Additional Classification: B. Hardware; B.1 CONTROL STRUCTURES AND MICROPROGRAMMING; B.1.5 Microcode Applications; B.2 ARITHMETIC AND LOGIC STRUCTURES; B.2.0 General
General Terms: Design, Experimentation, Measurement, Performance
Keywords: automatic instruction set design, compile-time optimization, general-purpose microarchitectures, logic synthesis, programmable logic