ABSTRACT
Computational acceleration on graphics processing units (GPUs) can make advanced magnetic resonance imaging (MRI) reconstruction algorithms attractive in clinical settings, thereby improving the quality of MR images across a broad spectrum of applications. At present, MR imaging is often limited by high noise levels, significant imaging artifacts, and/or long data acquisition (scan) times. Advanced image reconstruction algorithms can mitigate these limitations and improve image quality by simultaneously operating on scan data acquired with arbitrary trajectories and incorporating additional information such as anatomical constraints. However, the improvements in image quality come at the expense of a considerable increase in computation. This paper describes the acceleration of an advanced reconstruction algorithm on NVIDIA's Quadro FX 5600. Optimizations such as register-allocating the voxel data, tiling the scan data, and storing the scan data in the Quadro's constant memory dramatically reduce the reconstruction's required bandwidth to off-chip memory. The Quadro's special functional units provide substantial acceleration of the trigonometric computations in the algorithm's inner loops, and experimentally tuned code transformations increase the reconstruction's performance by an additional 20%. The reconstruction of a 3D image with 128^3 voxels ultimately achieves 150 GFLOPS and requires less than two minutes on the Quadro, while reconstruction on a quad-core CPU is thirteen times slower. Furthermore, relative to the true image, the error exhibited by the advanced reconstruction is only 12%, while conventional reconstruction techniques incur an error of 42%. In short, the acceleration afforded by the GPU greatly increases the appeal of the advanced reconstruction for clinical MRI applications.
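The tiling and register-allocation ideas named in the abstract can be illustrated with a small CPU-side sketch. It is not the paper's kernel. It assumes, as an illustration only, that the inner loop accumulates each scan sample's contribution to each voxel as a complex exponential of the dot product between the sample's trajectory coordinates and the voxel's position; the abstract does not spell this out. The names `reconstruct_direct`, `reconstruct_tiled`, and `TILE_SIZE` are hypothetical. Each slice of `samples` stands in for one tile of scan data loaded into constant memory, and the scalar accumulators stand in for voxel data held in registers.

```python
import cmath
import math

# Hypothetical tile length; on a GPU it would be chosen to fit constant memory.
TILE_SIZE = 4


def reconstruct_direct(voxels, samples):
    """Reference version: each voxel sums every sample in a single pass."""
    out = []
    for (x, y, z) in voxels:
        acc = 0j
        for (kx, ky, kz, d) in samples:
            phase = 2.0 * math.pi * (kx * x + ky * y + kz * z)
            acc += d * cmath.exp(1j * phase)
        out.append(acc)
    return out


def reconstruct_tiled(voxels, samples, tile_size=TILE_SIZE):
    """Tiled version: scan data is consumed one fixed-size tile at a time."""
    out = [0j] * len(voxels)
    for start in range(0, len(samples), tile_size):
        # One tile of scan data (analogy: one load into constant memory).
        tile = samples[start:start + tile_size]
        for n, (x, y, z) in enumerate(voxels):
            # Scalar accumulators (analogy: voxel data kept in registers).
            acc_re, acc_im = 0.0, 0.0
            for (kx, ky, kz, d) in tile:
                phase = 2.0 * math.pi * (kx * x + ky * y + kz * z)
                # On the GPU these trigonometric calls are what the special
                # functional units would accelerate.
                c, s = math.cos(phase), math.sin(phase)
                acc_re += d.real * c - d.imag * s
                acc_im += d.real * s + d.imag * c
            # Written back once per tile rather than once per sample.
            out[n] += complex(acc_re, acc_im)
    return out
```

Both functions compute the same sums; the tiled form only reorders the work so that each voxel's partial result is written back once per tile, which is the property that reduces memory traffic in the GPU setting.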
CITED BY
S. S. Stone, J. P. Haldar, S. C. Tsao, W.-m. W. Hwu, B. P. Sutton, and Z.-P. Liang. Accelerating advanced MRI reconstructions on GPUs. Journal of Parallel and Distributed Computing, 68(10):1307--1318, October 2008.