Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor
Source
PACT archive
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Toronto, Ontario, Canada
SESSION: CMP architecture design
Pages 52-61
Year of Publication: 2008
ISBN: 978-1-60558-282-5
Authors
Henry Wong  University of British Columbia, Vancouver, BC, Canada
Anne Bracy  Intel Corporation, Santa Clara, CA, USA
Ethan Schuchman  Intel Corporation, Santa Clara, CA, USA
Tor M. Aamodt  University of British Columbia, Vancouver, BC, Canada
Jamison D. Collins  Intel Corporation, Santa Clara, CA, USA
Perry H. Wang  Intel Corporation, Santa Clara, CA, USA
Gautham Chinya  Intel Corporation, Santa Clara, CA, USA
Ankur Khandelwal Groen  Intel Corporation, Santa Clara, CA, USA
Hong Jiang  Intel Corporation, Santa Clara, CA, USA
Hong Wang  Intel Corporation, Santa Clara, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1454115.1454125

ABSTRACT

Moore's Law and the drive towards performance efficiency have led to the on-chip integration of general-purpose cores with special-purpose accelerators. Pangaea is a heterogeneous CMP design for non-rendering workloads that integrates IA32 CPU cores with non-IA32 GPU-class multi-cores, extending the current state-of-the-art CPU-GPU integration that physically "fuses" existing CPU and GPU designs. Pangaea introduces (1) a resource repartitioning of the GPU, where the hardware budget dedicated to 3D-specific graphics processing is used to build more general-purpose GPU cores, and (2) a 3-instruction extension to the IA32 ISA that supports tighter architectural integration and fine-grain shared memory collaborative multithreading between the IA32 CPU cores and the non-IA32 GPU cores. We implement Pangaea and the current CPU-GPU designs in fully-functional synthesizable RTL based on the production quality RTL of an IA32 CPU and an Intel GMA X4500 GPU. On a 65 nm ASIC process technology, the legacy graphics-specific fixed-function hardware occupies the area of 9 GPU cores and has the total power consumption of 5 GPU cores. With the ISA extensions, the latency from the time an IA32 core spawns a GPU thread to the time the thread begins execution is reduced from thousands of cycles to fewer than 30 cycles. Pangaea is synthesized on an FPGA-based prototype and runs off-the-shelf IA32 OSes. A set of general-purpose non-graphics workloads demonstrates speedups of up to 8.8x.
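
To make the abstract's spawn-latency and area figures concrete, the following is a rough back-of-envelope sketch in Python, not anything taken from the paper. The 3,000-cycle baseline is an assumed stand-in for "thousands of cycles", the 2-core baseline is purely hypothetical, and the function names are invented for illustration. Only the 30-cycle bound and the 9-core area equivalence come from the abstract.

```python
"""Back-of-envelope model of two figures quoted in the abstract."""

# Assumed stand-in for "thousands of cycles"; the abstract gives no exact value.
ASSUMED_BASELINE_SPAWN_CYCLES = 3000
# Upper bound quoted in the abstract for the ISA-extension path.
ISA_EXTENSION_SPAWN_CYCLES = 30
# Area of the fixed-function graphics hardware, in GPU-core equivalents
# (from the abstract).
FIXED_FUNCTION_AREA_IN_CORES = 9


def spawn_overhead_fraction(work_cycles: int, spawn_cycles: int) -> float:
    """Fraction of spawn-to-finish time spent before the thread starts running."""
    return spawn_cycles / (spawn_cycles + work_cycles)


def cores_after_repartitioning(baseline_cores: int) -> int:
    """Core count if the fixed-function area were rebuilt as GPU cores.

    This considers area only; it ignores power and any interconnect cost.
    """
    return baseline_cores + FIXED_FUNCTION_AREA_IN_CORES


if __name__ == "__main__":
    for work in (100, 1_000, 10_000, 100_000):
        slow = spawn_overhead_fraction(work, ASSUMED_BASELINE_SPAWN_CYCLES)
        fast = spawn_overhead_fraction(work, ISA_EXTENSION_SPAWN_CYCLES)
        print(f"{work:>7} work cycles: overhead {slow:6.1%} vs {fast:6.1%}")
    # The baseline core count of 2 is purely hypothetical.
    print("equal-area cores:", cores_after_repartitioning(2))
```

Under these assumptions, a thread with 1,000 cycles of work spends 75% of its spawn-to-finish time waiting behind a 3,000-cycle launch, but under 3% behind a 30-cycle launch. This is one way to see why low spawn latency matters for fine-grain collaborative multithreading.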



