ABSTRACT
We perform a detailed flop and bandwidth analysis of Jos Stam's Stable Fluids algorithm on the CPU, GPU, and Cell. In all three cases, we find that the algorithm is bandwidth bound, with the cores sitting idle up to 96% of the time. Knowing this, we propose two modifications to accelerate the algorithm. The first is a Mehrstellen discretization for the pressure solver, which reduces the running time of the solver by a third. The second is a static caching scheme that eliminates roughly 99% of the random lookups in the advection stage; we observe a 2x speedup in the advection stage using this scheme. Both modifications apply equally well to all three architectures.
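To illustrate what "bandwidth bound" means here, the following is a minimal sketch of a roofline-style estimate. It is not the analysis from the paper: the function name `idle_fraction`, the assumption that arithmetic and memory traffic overlap perfectly, and all numeric inputs are hypothetical placeholders chosen only to show the arithmetic.

```python
def idle_fraction(flops_per_cell, bytes_per_cell, peak_flops, peak_bandwidth):
    """Estimate the fraction of time cores wait on memory.

    Assumes (hypothetically) that compute and memory traffic overlap
    perfectly, so time per cell is the larger of the two costs.
    """
    compute_time = flops_per_cell / peak_flops      # seconds per cell, arithmetic only
    memory_time = bytes_per_cell / peak_bandwidth   # seconds per cell, memory traffic only
    if memory_time <= compute_time:
        return 0.0  # compute bound: cores never wait in this model
    return 1.0 - compute_time / memory_time


# Hypothetical kernel and machine numbers, not measurements from the paper.
frac = idle_fraction(
    flops_per_cell=8,        # assumed arithmetic per grid cell
    bytes_per_cell=32,       # assumed memory traffic per grid cell
    peak_flops=100e9,        # assumed peak arithmetic rate, flop/s
    peak_bandwidth=10e9,     # assumed peak memory bandwidth, byte/s
)
print(f"estimated idle fraction: {frac:.3f}")
```

Under this simple model, a kernel is bandwidth bound whenever moving its data takes longer than computing on it, which is why reducing memory traffic (rather than flops) is the lever both proposed modifications pull.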