ABSTRACT
Current MMX-like extensions give general-purpose processors a way to meet the growing performance demands of multimedia applications. However, the computing performance of these extensions is often limited because they operate on only a single data stream. To overcome this obstacle, this paper presents an architecture named the "multi-streaming SIMD architecture", which enables one SIMD instruction to manipulate multiple data streams simultaneously. The proposed architecture is a Processor-In-Memory-like register-file architecture that includes SIMD operating logic, allowing general-purpose processors to extend current MMX-like extensions for higher performance. To realize the proposed architecture efficiently and flexibly, an operation cell is designed by fusing logic gates and storage cells together. The operation cells are then used to compose a register file capable of performing SIMD operations, called the "Multimedia Operation Storage Unit (MOSU)". Multiple MOSUs are in turn used to compose a multi-streaming SIMD computing engine that can manipulate multiple data streams simultaneously and exploit the subword parallelism of the elements in each data stream. Three instruction modes (global, coupling, and isolated) are defined for the MMX-like extensions to adjust the number of parallel data streams and to use the computation resources efficiently. Simulation results show that, with four 4-register MOSUs, the multi-streaming SIMD architecture achieves a 3.3x to 5.5x performance improvement over Intel's MMX extensions on eleven multimedia kernels.
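The idea of one SIMD instruction acting on several data streams, with subword parallelism inside each stream, can be illustrated with a small software model. This is only a sketch under stated assumptions and not the paper's design: the 64-bit register width, the 8-bit subwords, the wraparound add, and all function names (packed_add, multi_stream_packed_add, pack, unpack) are hypothetical choices made for illustration. The three instruction modes are not modeled, because the abstract does not define their semantics.

```python
# Illustrative software model only. Widths and names are assumptions,
# not taken from the paper.
SUBWORD_BITS = 8          # assumed subword width
SUBWORDS_PER_REG = 8      # assumed 64-bit register, as in MMX
MASK = (1 << SUBWORD_BITS) - 1


def pack(subwords):
    """Pack eight 8-bit values into one 64-bit integer, lane 0 lowest."""
    value = 0
    for lane, subword in enumerate(subwords):
        value |= (subword & MASK) << (lane * SUBWORD_BITS)
    return value


def unpack(value):
    """Split a 64-bit integer back into its eight 8-bit lanes."""
    return [(value >> (lane * SUBWORD_BITS)) & MASK
            for lane in range(SUBWORDS_PER_REG)]


def packed_add(a, b):
    """Wraparound add of two registers, lane by lane (subword parallelism)."""
    result = 0
    for lane in range(SUBWORDS_PER_REG):
        shift = lane * SUBWORD_BITS
        lane_sum = ((a >> shift) & MASK) + ((b >> shift) & MASK)
        result |= (lane_sum & MASK) << shift   # discard the carry out of the lane
    return result


def multi_stream_packed_add(regs_a, regs_b):
    """Apply the same packed add to one register pair per stream.

    In the proposed hardware each stream would be handled by its own MOSU
    at the same time; this loop only models the combined result.
    """
    return [packed_add(a, b) for a, b in zip(regs_a, regs_b)]


if __name__ == "__main__":
    # Four streams, matching the four-MOSU configuration in the abstract.
    stream_a = [pack([10 * s + i for i in range(8)]) for s in range(4)]
    stream_b = [pack([1] * 8) for _ in range(4)]
    for s, reg in enumerate(multi_stream_packed_add(stream_a, stream_b)):
        print(f"stream {s}: {unpack(reg)}")
```

In this model a single call to multi_stream_packed_add stands in for one multi-streaming instruction: the same operation is applied to every stream, and each lane within a register is computed independently of its neighbors.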