| Allowing for ILP in an embedded Java processor |
| Source
|
International Symposium on Computer Architecture
Proceedings of the 27th annual international symposium on Computer architecture
Vancouver, British Columbia, Canada
Pages: 294 - 305
Year of Publication: 2000
ISBN:1-58113-232-8
|
|
Authors
|
|
Ramesh Radhakrishnan
|
Laboratory for Computer Architecture, Electrical and Computer Engineering Department, The University of Texas at Austin, Austin, Texas
|
|
Deependra Talla
|
Laboratory for Computer Architecture, Electrical and Computer Engineering Department, The University of Texas at Austin, Austin, Texas
|
|
Lizy Kurian John
|
Laboratory for Computer Architecture, Electrical and Computer Engineering Department, The University of Texas at Austin, Austin, Texas
|
|
ABSTRACT
Java processors are ideal for embedded and network computing applications such as Internet TVs, set-top boxes, smart phones, and other consumer electronics. In this paper, we investigate cost-effective microarchitectural techniques to exploit parallelism in Java bytecode streams. First, we propose the use of a fill unit that stores decoded bytecodes in a decoded bytecode cache. This mechanism improves the fetch and decode bandwidth of Java processors by a factor of 2 to 3. The additional hardware can also perform optimizations such as instruction folding. This is particularly significant because experiments with the Verilog model of Sun Microsystems' picoJava-II core demonstrate that instruction folding lies on the critical path. Moving the folding logic off the processor's critical path and into the fill unit improves the clock frequency by 25%. Out-of-order ILP exploitation is not investigated because of its prohibitive cost, but in-order dual issue with a 64-entry decoded bytecode cache yields a 10% to 14% improvement in execution cycles. Another contribution of the paper is a stack disambiguation technique that eliminates false dependencies between different types of stack accesses. Stack disambiguation exposes further parallelism: a dual in-order issue microengine with a 64-entry bytecode cache yields an additional 10% reduction in cycles, for an aggregate reduction of 17% to 24% in execution cycles.
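To make the idea of instruction folding concrete, the following is a minimal software sketch, not the paper's fill-unit logic or the actual picoJava-II folding rules. It assumes a single hypothetical folding pattern (two local loads, one integer ALU operation, one local store) and a hypothetical tuple encoding of bytecodes; the names `fold`, `ALU_OPS`, and the `"folded"` marker are illustrative only.

```python
from typing import List, Tuple

# Hypothetical encoding: ("iload", 1), ("iadd",), ("istore", 3), ...
Bytecode = Tuple

# Assumed set of foldable integer ALU bytecodes (illustrative, not exhaustive).
ALU_OPS = {"iadd", "isub", "imul", "iand", "ior", "ixor"}


def fold(stream: List[Bytecode]) -> List[Bytecode]:
    """Collapse 'iload a; iload b; <alu>; istore c' into one folded operation.

    The stack traffic of the four bytecodes is replaced by a single
    register-style operation ("folded", op, a, b, c), meaning
    local[c] = local[a] <op> local[b]. Everything else passes through.
    """
    out: List[Bytecode] = []
    i = 0
    while i < len(stream):
        window = stream[i:i + 4]
        if (len(window) == 4
                and window[0][0] == "iload"
                and window[1][0] == "iload"
                and window[2][0] in ALU_OPS
                and window[3][0] == "istore"):
            out.append(("folded", window[2][0],
                        window[0][1], window[1][1], window[3][1]))
            i += 4  # four bytecodes consumed as one operation
        else:
            out.append(stream[i])
            i += 1
    return out


if __name__ == "__main__":
    program = [("iload", 1), ("iload", 2), ("iadd",), ("istore", 3),
               ("iload", 3), ("ireturn",)]
    for op in fold(program):
        print(op)
```

In this sketch the pattern match runs once, when the stream is rewritten, rather than each time an instruction is issued. This loosely mirrors the motivation for moving folding into the fill unit, where the folded form can be stored in the decoded bytecode cache and reused.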