Synchronized access to streams in SIMD vector multiprocessors
Source International Conference on Supercomputing archive
Proceedings of the 8th international conference on Supercomputing table of contents
Manchester, England
Pages: 23 - 32  
Year of Publication: 1994
ISBN:0-89791-665-4
Authors
Montse Peiron  Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Campus Nord - Mòdul D6, c/ Gran Capità s/n, 08071 Barcelona, SPAIN
Mateo Valero  Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Campus Nord - Mòdul D6, c/ Gran Capità s/n, 08071 Barcelona, SPAIN
Eduard Ayguadé  Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Campus Nord - Mòdul D6, c/ Gran Capità s/n, 08071 Barcelona, SPAIN
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/181181.181204

ABSTRACT

Synchronized, simultaneous access to several vectors that form a single stream is typical in SIMD vector multiprocessors as well as in MIMD superscalar multiprocessors with decoupled access. In this paper we propose a block-interleaved storage scheme and an out-of-order access mechanism that allow conflict-free access to streams with an arbitrary initial address and a constant stride between elements. The memory system can have any degree of unmatchness, and we consider the use of either a crossbar or a multistage interconnection network. A maximal number of conflict-free families, including the most commonly used strides, can be obtained. We describe the hardware for address calculation and control and show that its additional cost is minimal compared with the cost of the hardware for in-order access. Finally, we evaluate the applicability of this technique to real loops from some programs of the Perfect Club and SPEC suites.
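
To make the notion of a memory conflict concrete, the following is a minimal sketch of generic textbook interleaving, not the storage scheme or access mechanism proposed in the paper. The function names, the module count of 8, and the block sizes are all assumptions chosen for illustration. It maps each address of a constant-stride stream to a memory module and reports whether a group of consecutive accesses lands on distinct modules.

```python
from typing import List


def module_of(address: int, num_modules: int, block_size: int) -> int:
    """Module holding `address` under block interleaving (hypothetical helper).

    With block_size == 1 this is plain low-order interleaving.
    """
    return (address // block_size) % num_modules


def modules_touched(start: int, stride: int, count: int,
                    num_modules: int, block_size: int) -> List[int]:
    """Modules referenced by the first `count` elements of a strided stream."""
    return [module_of(start + i * stride, num_modules, block_size)
            for i in range(count)]


def is_conflict_free(start: int, stride: int, count: int,
                     num_modules: int, block_size: int) -> bool:
    """True if the `count` accesses all fall on different modules."""
    mods = modules_touched(start, stride, count, num_modules, block_size)
    return len(set(mods)) == len(mods)


if __name__ == "__main__":
    M = 8  # assumed number of memory modules
    for block in (1, 8):       # assumed block sizes
        for stride in (1, 8):  # two sample strides
            mods = modules_touched(0, stride, M, M, block)
            print(f"block={block} stride={stride} modules={mods} "
                  f"conflict_free={is_conflict_free(0, stride, M, M, block)}")
```

In this toy model, changing the block size only moves the conflicts from one stride to another when elements are requested in order: with a block size of 1, stride 1 is conflict-free and stride 8 is not, and with a block size of 8 the reverse holds. The abstract's out-of-order access mechanism addresses this kind of limitation, and its details are in the full text.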


