ABSTRACT
This paper aims to provide a quantitative understanding of the performance of image and video processing applications on general-purpose processors, without and with media ISA extensions. We use detailed simulation of 12 benchmarks to study the effectiveness of current architectural features and identify future challenges for these workloads.

Our results show that conventional techniques in current processors to enhance instruction-level parallelism (ILP) provide a factor of 2.3X to 4.2X performance improvement. The Sun VIS media ISA extensions provide an additional 1.1X to 4.2X performance improvement. The ILP features and media ISA extensions significantly reduce the CPU component of execution time, making 5 of the image processing benchmarks memory-bound.

The memory behavior of our benchmarks is characterized by large working sets and streaming data accesses. Increasing the cache size has no impact on 8 of the benchmarks. The remaining benchmarks require relatively large cache sizes (dependent on the display sizes) to exploit data reuse, but derive less than 1.2X performance benefits with the larger caches. Software prefetching provides 1.4X to 2.5X performance improvement in the image processing benchmarks where memory is a significant problem. With the addition of software prefetching, all our benchmarks revert to being compute-bound.
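As a rough illustration of what software prefetching looks like in a streaming image kernel, the sketch below adds a prefetch hint to a simple per-pixel loop. It is not code from the paper and does not use VIS: the kernel (`brighten`), the prefetch distance (`PREFETCH_AHEAD`), and the 64-byte cache-line size are assumptions chosen for illustration, and the hint relies on the GCC/Clang builtin `__builtin_prefetch`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prefetch distance in bytes. A real value would be tuned to
 * the memory latency and the amount of work done per pixel. */
#define PREFETCH_AHEAD 256

/* Assumed cache-line size in bytes; one hint per line is enough. */
#define LINE_BYTES 64

/* Saturating brightness adjustment over a streaming pixel buffer.
 * Each input byte is read once and never reused, so a larger cache does not
 * help; fetching data ahead of its use can hide part of the miss latency. */
void brighten(const uint8_t *src, uint8_t *dst, size_t n, int delta)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Once per assumed cache line, hint that data further ahead in the
         * stream will be read soon. The bounds check keeps the hinted
         * address inside the buffer. */
        if ((i % LINE_BYTES) == 0 && i + PREFETCH_AHEAD < n) {
#if defined(__GNUC__) || defined(__clang__)
            /* Arguments: address, 0 = read access, 0 = low temporal locality. */
            __builtin_prefetch(src + i + PREFETCH_AHEAD, 0, 0);
#endif
        }
        int v = src[i] + delta;
        dst[i] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
}
```

The hint changes only timing, never results, so the function produces the same output on compilers where the builtin is unavailable. Whether it yields any speedup depends on the machine; the 1.4X to 2.5X figures above come from the paper's simulated system, not from this sketch.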