A cost effective architecture for vectorizable numerical and multimedia applications
Source
ACM Symposium on Parallel Algorithms and Architectures
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Crete Island, Greece
Pages: 103-112
Year of Publication: 2001
ISBN: 1-58113-409-6
Authors
Francisca Quintana, Departamento de Informatica y Sistemas, Universidad de Las Palmas de Gran Canaria, Islas Canarias, Spain
Jesus Corbal, Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain
Roger Espasa, Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain
Mateo Valero, Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain
ABSTRACT
This paper analyzes the performance of vector-dominated regions of code in numerical and multimedia applications on a superscalar+vector architecture and compares it to that of an 8-way superscalar processor. Splitting a program's execution into scalar and vector regions allows us to show that (1) as expected, the vector unit is much better than the wide-issue superscalar at executing the vector-dominated regions of the code; and (2) on the scalar regions, the 8-way superscalar, although better than a 4-way superscalar, is clearly not worth the extra complexity in terms of extra transistors and potential cycle time limitations. Overall, the vector-enhanced superscalar is 6% to 303% better than the 8-way superscalar. We also present detailed data on the performance of the memory system, which is usually the key limiting factor when running numerical and multimedia applications. Finally, we evaluate two additional cache designs that try to alleviate the problems created by non-unit stride memory references.