Thread fusion
Source
International Symposium on Low Power Electronics and Design
Proceeding of the 13th international symposium on Low power electronics and design
Bangalore, India
SESSION: Microarchitectural techniques
Pages 363-368
Year of Publication: 2008
ISBN: 978-1-60558-109-5
Authors
José González, UPC-Intel Lab Barcelona, Barcelona, Spain
Qiong Cai, UPC-Intel Lab Barcelona, Barcelona, Spain
Pedro Chaparro, UPC-Intel Lab Barcelona, Barcelona, Spain
Grigorios Magklis, UPC-Intel Lab Barcelona, Barcelona, Spain
Ryan Rakvic, United States Naval Academy, Annapolis, MD, USA
Antonio González, UPC-Intel Lab Barcelona, Barcelona, Spain
ABSTRACT
This work proposes Thread Fusion as an effective way of reducing power consumption when a Simultaneous Multi-Threaded (SMT) core is executing two threads from a homogeneous parallel application. Two dynamic instances of the same static instruction, one from each thread, are merged (fused) into a single instruction that consumes half of the front-end pipeline resources the two separate instances would need. When the fused instruction is executed, it is cloned and proceeds at full bandwidth. Our simulation results show an average energy reduction of 10% with less than 1% impact on performance.
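The fuse-then-clone idea in the abstract can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not the paper's mechanism: the abstract does not say how matching instructions are detected, so the lockstep comparison of program counters, the `FrontEndSlot` type, and the `fuse` and `execute` functions are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FrontEndSlot:
    """One front-end entry (hypothetical model, not from the paper)."""
    pc: int                   # address of the static instruction
    threads: Tuple[int, ...]  # (0,), (1,), or (0, 1) when fused


def fuse(stream0: List[int], stream1: List[int]) -> List[FrontEndSlot]:
    """Walk both PC streams in lockstep (an assumption) and fuse equal PCs.

    A fused entry occupies one front-end slot instead of two.
    """
    slots: List[FrontEndSlot] = []
    common = min(len(stream0), len(stream1))
    for pc0, pc1 in zip(stream0, stream1):
        if pc0 == pc1:
            slots.append(FrontEndSlot(pc0, (0, 1)))  # fused: one slot
        else:
            slots.append(FrontEndSlot(pc0, (0,)))    # diverged: two slots
            slots.append(FrontEndSlot(pc1, (1,)))
    # Whatever remains of the longer stream cannot be fused.
    slots.extend(FrontEndSlot(pc, (0,)) for pc in stream0[common:])
    slots.extend(FrontEndSlot(pc, (1,)) for pc in stream1[common:])
    return slots


def execute(slots: List[FrontEndSlot]) -> List[Tuple[int, int]]:
    """Clone each fused slot into one (thread, pc) operation per thread."""
    return [(thread, slot.pc) for slot in slots for thread in slot.threads]


# Two threads running the same code, diverging on the third instruction.
thread0 = [0x10, 0x14, 0x18, 0x1C]
thread1 = [0x10, 0x14, 0x20, 0x1C]
slots = fuse(thread0, thread1)
ops = execute(slots)
print(len(slots), "front-end slots for", len(ops), "executed operations")
```

In this toy run, three of the four instruction pairs share a program counter, so the front end handles five slots where an unfused front end would handle eight, while the execution stage still sees all eight per-thread operations.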