Improving 3D geometry transformations on a simultaneous multithreaded SIMD processor
Source: International Conference on Supercomputing
Proceedings of the 15th International Conference on Supercomputing
Sorrento, Italy
Pages: 236 - 245  
Year of Publication: 2001
ISBN: 1-58113-410-X
Authors
Claude Limousin  Laboratoire de Recherche en Informatique, Université Paris-Sud, F-91405 Orsay Cedex
Julien Sebot  Laboratoire de Recherche en Informatique, Université Paris-Sud, F-91405 Orsay Cedex
Alexis Vartanian  Laboratoire de Recherche en Informatique, Université Paris-Sud, F-91405 Orsay Cedex
Nathalie Drach-Temam  Laboratoire de Recherche en Informatique, Université Paris-Sud, F-91405 Orsay Cedex
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/377792.377839

ABSTRACT

In this paper we evaluate the performance of an SMT processor used as the geometry processor for a 3D polygonal rendering engine. To evaluate this approach, we consider PMesa (a parallel version of Mesa), which parallelizes the geometry stage of the 3D pipeline. We show that SMT is suitable for 3D geometry, and we characterize the execution of the geometry stage in terms of the memory hierarchy, which is the main bottleneck. The results show that latency is not fully hidden by SMT, and that L2 data prefetching does not succeed in increasing performance. We show that this problem comes from pollution of the instruction window by threads experiencing L2 cache misses, which reduces the window available to the other threads. We therefore propose dcPRED, a hardware mechanism that predicts L2 misses and controls this pollution. Coupled with L2 data prefetching, dcPRED achieves gains of up to 21% over the baseline SMT.

