ABSTRACT
The pipeline processor is a common paradigm for very high speed computing machinery. Pipeline processors provide high speed because their separate stages can operate concurrently, much as different people on a manufacturing assembly line work concurrently on material passing down the line. Although the concurrency of pipeline processors makes their design a demanding task, they can be found in graphics processors, in signal processing devices, in integrated circuit components for doing arithmetic, and in the instruction interpretation units and arithmetic operations of general purpose computing machinery.
Because I plan to describe a variety of pipeline processors, I will start by suggesting names for their various forms. Pipeline processors, or more simply just pipelines, operate on data as it passes along them. The latency of a pipeline is a measure of how long it takes a single data value to pass through it. The throughput rate of a pipeline is a measure of how many data values can pass through it per unit time.
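The distinction between latency and throughput can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the lecture: it assumes a clocked pipeline in which every stage takes exactly one clock period, and the function name `clocked_pipeline_metrics` and its parameters are hypothetical.

```python
from fractions import Fraction


def clocked_pipeline_metrics(stages: int, clock_period_ns: int):
    """Return (latency_ns, throughput_per_ns) for a hypothetical clocked pipeline.

    Assumption: each stage holds a value for exactly one clock period, so a
    single value needs `stages` periods to pass through (latency), while one
    new value can enter and one can leave on every period (throughput).
    """
    latency_ns = stages * clock_period_ns
    throughput_per_ns = Fraction(1, clock_period_ns)
    return latency_ns, throughput_per_ns


# Adding stages lengthens latency but leaves throughput unchanged.
short = clocked_pipeline_metrics(stages=5, clock_period_ns=2)
long = clocked_pipeline_metrics(stages=10, clock_period_ns=2)
print(short)  # (10, Fraction(1, 2))
print(long)   # (20, Fraction(1, 2))
```

Under these assumptions, doubling the number of stages doubles the latency while the throughput rate stays the same, which is why the two measures are named separately.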
Pipelines both store and process data; the storage elements and processing logic in them alternate along their length. I will describe pipelines in their complete form later, but first I will focus on their storage elements alone, stripping away all processing logic. Stripped of all processing logic, any pipeline acts like a series of storage elements through which data can pass.
Pipelines can be clocked or event-driven, depending on whether their parts act in response to some widely-distributed external clock, or act independently whenever local events permit. Some pipelines are inelastic; the amount of data in them is fixed. The input rate and the output rate of an inelastic pipeline must match exactly. Stripped of any processing logic, an inelastic pipeline acts like a shift register. Other pipelines are elastic; the amount of data in them may vary. The input rate and the output rate of an elastic pipeline may differ momentarily because of internal buffering. Stripped of all processing logic, an elastic pipeline becomes a flow-through first-in-first-out memory, or FIFO. FIFOs may be clocked or event-driven; their important property is that they are elastic.
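The difference between the two storage-only forms can be sketched in software. The following is a behavioural model only, under the assumption that a list-like queue is an adequate stand-in for hardware storage elements; the class names `ShiftRegister` and `ElasticFifo` and their methods are hypothetical and do not describe any particular circuit.

```python
from collections import deque


class ShiftRegister:
    """Inelastic: always holds exactly `length` values.

    Every input forces one output, so input and output rates match exactly.
    """

    def __init__(self, length, fill=None):
        self.cells = deque([fill] * length, maxlen=length)

    def shift(self, value):
        oldest = self.cells[0]
        self.cells.append(value)  # a full deque with maxlen drops its oldest item
        return oldest


class ElasticFifo:
    """Elastic: occupancy may vary between 0 and `capacity`.

    Inputs and outputs are independent, so their rates may differ for a while.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.cells = deque()

    def put(self, value):
        """Accept a value if there is room; report whether it was accepted."""
        if len(self.cells) >= self.capacity:
            return False
        self.cells.append(value)
        return True

    def get(self):
        """Deliver the oldest value, or None when the FIFO is empty."""
        return self.cells.popleft() if self.cells else None


sr = ShiftRegister(3)
print([sr.shift(v) for v in "abcd"])  # [None, None, None, 'a']

fifo = ElasticFifo(3)
print(fifo.put("a"), fifo.put("b"))   # True True  (two inputs, no output yet)
print(fifo.get())                     # a
print(len(fifo.cells))                # 1
```

The shift register returns a value on every call because its contents are fixed in number, whereas the FIFO absorbs two inputs before any output is taken; that temporary mismatch is the internal buffering that makes it elastic.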
I assign the name micropipeline to a particularly simple form of event-driven elastic pipeline with or without internal processing. The micro part of this name seems appropriate to me because micropipelines contain very simple circuitry, because micropipelines are useful in very short lengths, and because micropipelines are suitable for layout in microelectronic form.
I have chosen micropipelines as the subject of this lecture for three reasons. First, micropipelines are simple and easy to understand. I believe that simple ideas are best, and I find beauty in the simplicity and symmetry of micropipelines. Second, I see confusion surrounding the design of FIFOs. I offer this description of micropipelines in the hope of reducing some of that confusion.
The third reason I have chosen my subject addresses the limitations imposed on us by the clocked-logic conceptual framework now commonly used in the design of digital systems. I believe that this conceptual framework or mind set masks simple and useful structures like micropipelines from our thoughts, structures that are easy to design and apply given a different conceptual framework. Because micropipelines are event-driven, their simplicity is not available within the clocked-logic conceptual framework. I offer this description of micropipelines in the hope of focusing attention on an alternative transition-signalling conceptual framework.
We need a new conceptual framework because the complexity of VLSI technology has now reached the point where design time and design cost often exceed fabrication time and fabrication cost. Moreover, most systems designed today are monolithic and resist mid-life improvement. The transition-signalling conceptual framework offers the opportunity to build up complex systems by hierarchical composition from simpler pieces. The resulting systems are easily modified. I believe that the transition-signalling conceptual framework has much to offer in reducing the design time and cost of complex systems and increasing their useful lifetime. I offer this description of micropipelines as an example of the transition-signalling conceptual framework.
Until recently only a hardy few used the transition-signalling conceptual framework for design because it was too hard. It was nearly impossible to design the small circuits of 10 to 100 transistors that form the elemental building blocks from which complex systems are composed. Moreover, it was difficult to prove anything about the resulting compositions. In the past five years, however, much progress has been made on both fronts. Charles Molnar and his colleagues at Washington University have developed a simple way to design the small basic building blocks [9]. Martin Rem's "VLSI Club" at the Technical University of Eindhoven has been working effectively on the mathematics of event-driven systems [6, 10, 11, 19]. These emerging conceptual tools now make transition signalling a lively candidate for widespread use.
REFERENCES
[1] Chaney, T.J., and Molnar, C.E. Anomalous behavior of synchronizer and arbiter circuits. IEEE Trans. Comput. C-22, 4 (Apr. 1973), 421-422.
[2] Clark, W.A. Macromodular computer systems. In Proceedings of the Spring Joint Computer Conference, AFIPS, April 1967,
[3] Clark, W.A., and Molnar, C.E. Macromodular computer systems. Computers in Biomedical Research, Vol. 4, R. Stacy and B. Waxman, Eds., Academic Press, New York, 1974, 45-85.
[4]
[5]
[6] Ebergen, J.C. Translating programs into delay-insensitive circuits. Ph.D. dissertation, Eindhoven University of Technology, 1987.
[7] Levy, J.V. Buses, the skeleton of computer structures. In Computer Engineering, C.G. Bell, J.C. Mudge, and J.E. McNamara, Eds., Digital Press, 1978.
[8] Miller, R.E. "Sequential Circuits", Chapter 10. In Switching Theory, Vol. 2, Wiley, NY, 1965.
[9] Molnar, C.E., Fang, T.P., and Rosenberger, F.U. Synthesis of delay-insensitive modules. In Proceedings of the 1985 Chapel Hill Conference on VLSI, H. Fuchs, Ed., Computer Science Press, 1985.
[10] Rem, M., van de Snepscheut, J.L.A., and Udding, J.T. Trace theory and the definition of hierarchical components. In Proceedings of the Caltech Conference on VLSI, 1983.
[11]
[12]
[13] Seitz, C.L. System Timing. In Introduction to VLSI Systems, C.A. Mead and L.A. Conway, Eds., Addison-Wesley, 1980.
[14] Sproull, R.F., and Sutherland, I.E. A clipping divider. FJCC 1968, Thompson Books, Washington, D.C., 765.
[15]
[16] Sutherland, I.E. Asynchronous queue system. U.S. Patent 4,679,213, July 7, 1987.
[17] Sutherland, I.E. Asynchronous first-in-first-out register structure. U.S. Patent pending.
[18] Sutherland, I.E. Asynchronous pipelined data processing system. U.S. Patent pending.
[19] Udding, J.T. A formal model for defining and classifying delay-insensitive circuits and systems. J. Distrib. Comptg. 1, 1986, 197-204.
CITED BY 104
Norbert Imlig, Ryusuke Konishi, Tsunemichi Shiozawa, Kiyoshi Oguri, Kouichi Nagami, Hideyuki Ito, Minoru Inamori, Hiroshi Nakada, Communicating logic: an alternative embedded stream processing paradigm, Proceedings of the 2000 conference on Asia South Pacific design automation, p.317-322, January 2000, Yokohama, Japan
M. Shams, J. Ebergen, M. Elmasry, A comparison of CMOS implementations of an asynchronous circuits primitive: the C-element, Proceedings of the 1996 international symposium on Low power electronics and design, p.93-96, August 12-14, 1996, Monterey, California, United States
E. Allier, J. Goulier, G. Sicard, A. Dezzani, E. André, M. Renaudin, A 120nm low power asynchronous ADC, Proceedings of the 2005 international symposium on Low power electronics and design, August 08-10, 2005, San Diego, CA, USA
Abhijit Davare, Kelvin Lwin, Alex Kondratyev, Alberto Sangiovanni-Vincentelli, The best of both worlds: the efficient asynchronous implementation of synchronous specifications, Proceedings of the 41st annual conference on Design automation, June 07-11, 2004, San Diego, CA, USA
Erik Brunvand, Steven Nowick, Kenneth Yun, Practical advances in asynchronous design and in asynchronous/synchronous interfaces, Proceedings of the 36th ACM/IEEE conference on Design automation, p.104-109, June 21-25, 1999, New Orleans, Louisiana, United States
S.-Y. Tan, S. B. Furber, W.-F. Yen, The design of an asynchronous VHDL synthesizer, Proceedings of the conference on Design, automation and test in Europe, p.44-51, February 23-26, 1998, Le Palais des Congrés de Paris, France
Hiroshi Saito, Alex Kondratyev, Jordi Cortadella, Luciano Lavagno, Alexander Yakovlev, What is the cost of delay insensitivity?, Proceedings of the 1999 IEEE/ACM international conference on Computer-aided design, p.316-323, November 07-11, 1999, San Jose, California, United States
Kenneth Fazel, Lun Li, Mitch Thornton, Robert B. Reese, Cherrice Traver, Performance enhancement in phased logic circuits using automatic slack-matching buffer insertion, Proceedings of the 14th ACM Great Lakes symposium on VLSI, April 26-28, 2004, Boston, MA, USA
S. C. Smith, R. F. DeMara, J. S. Yuan, D. Ferguson, D. Lamb, Optimization of NULL convention self-timed circuits, Integration, the VLSI Journal, v.37 n.3, p.135-165, August 2004
J. Viv Woods, Steve B. Furber, Jim D. Garside, Steve Temple, Paul Day, Nigel C. Paver, AMULET1: An Asynchronous ARM Microprocessor, IEEE Transactions on Computers, v.46 n.4, p.385-398, April 1997
David M. Brooks, Pradip Bose, Stanley E. Schuster, Hans Jacobson, Prabhakar N. Kudva, Alper Buyuktosunoglu, John-David Wellman, Victor Zyuban, Manish Gupta, Peter W. Cook, Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors, IEEE Micro, v.20 n.6, p.26-44, November 2000
Peggy B. McGee, Steven M. Nowick, E. G. Coffman, Jr., Efficient performance analysis of asynchronous systems based on periodicity, Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, September 19-21, 2005, Jersey City, NJ, USA
Girish Venkataramani, Tiberiu Chelcea, Seth Copen Goldstein, Tobias Bjerregaard, SOMA: a tool for synthesizing and optimizing memory accesses in ASICs, Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, September 19-21, 2005, Jersey City, NJ, USA
Tobias Dubois, Erik Jan Marinissen, Mohamed Azimane, Paul Wielage, Erik Larsson, Clemens Wouters, Test quality analysis and improvement for an embedded asynchronous FIFO, Proceedings of the conference on Design, automation and test in Europe, April 16-20, 2007, Nice, France
Myungsu Choi, Zachary Patitz, Byoungjae Jin, Feng Tao, Nohpill Park, Minsu Choi, Designing layout-timing independent quantum-dot cellular automata (QCA) circuits by global asynchrony, Journal of Systems Architecture: the EUROMICRO Journal, v.53 n.9, p.551-567, September, 2007
Paul Wielage, Erik Jan Marinissen, Michel Altheimer, Clemens Wouters, Design and DfT of a high-speed area-efficient embedded asynchronous FIFO, Proceedings of the conference on Design, automation and test in Europe, April 16-20, 2007, Nice, France
Sylvain Guilley, Philippe Hoogvorst, Yves Mathieu, Renaud Pacalet, Jean Provost, CMOS Structures Suitable for Secured Hardware, Proceedings of the conference on Design, automation and test in Europe, p.21414, February 16-20, 2004
REVIEW
Harry Frederick Jordan, Reviewer
Ivan Sutherland's Turing Award lecture is important reading for computer designers. As used in this work, a micropipeline is a powerful combination of the concepts of pipelining, asynchronous sequential logic ...