ACM Home Page
Please provide us with feedback. Feedback
A model for dataflow based vector execution
Full text PdfPdf (1.28 MB)
Source International Conference on Supercomputing archive
Proceedings of the 8th international conference on Supercomputing table of contents
Manchester, England
Pages: 11 - 22  
Year of Publication: 1994
ISBN:0-89791-665-4
Authors
W. Marcus Miller  IBM Corporation, Networking Systems Division, Research Triangle Park, NC
Walid A. Najjar  Department of Computer Science, Colorado State University, Fort Collins, CO
A. P. Wim Böhm  Department of Computer Science, Colorado State University, Fort Collins, CO
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 2,   Downloads (12 Months): 24,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/181181.181197
What is a DOI?

ABSTRACT

Although the dataflow model has been shown to allow the exploitation of parallelism at all levels, research of the past decade has revealed several fundamental problems: Synchronization at the instruction level, token matching, coloring and re-labeling operations have a negative impact on performance by significantly increasing the number of non-compute “overhead” cycles. Recently, many novel Hybrid von-Neumann Data Driven machines have been proposed to alleviate some of these problems. The major objective has been to reduce or eliminate unnecesssary synchronization costs through simplified operand matching schemes and increased task granularity. Moreover, the results from recent studies quantifying locality suggest sufficient spatial and temporal locality is present in dataflow execution to merit its exploitation. In this paper we present a data structure for exploiting locality in a data driven environment: the Vector Cell. A Vector Cell consists of a number of fixed length chunks of data elements. Each chunk is tagged with a presence bit, providing intra-chunk strictness and inter-chunk non-strictness to data structure access. We describe the semantics of the model, processor architecture and instruction set as well as a Sisal to dataflow vectorizing compiler back-end. The model is evaluated by comparing its performance to those of both a classical fine-grain dataflow processor employing I-structures and a conventional pipelined vector processor. Results indicate the model is surprisingly resilient to long memory and communication latencies, and is able to dynamically exploit the underlying parallelism across multiple processing elements at run time.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Arvind and R. A. Iannucd. Instruction set definition for a tagged-token data-flow machine. TechnicM Report CSG Memo 212-3, Laboratory for Computer S~ence, MIT, 1983.
 
2
Arvind, R. S. Nikh~, and K. K. Pingali. ~Structures: Data Structures for ParMl~ Computing. MIT Computation Structures Group Memo 269, Laboratory for Computer ScOnce, MIT, 1987.
 
3
P. S. Barth, R. S. Nikh~, and Arvid. M-structures: Extending a paralld, non-strict, functional language with state. Technical Report Computation Structures Group Report 327, M/T, Laboratory for Computer S~ence, Cambridge, MA, March 1991.
 
4
 
5
6
 
7
D. C. Cann. The Optim~ing S~al Compiler: Version 12.0. Techn~al Report CS-92-114, Colorado State UnN versity, 1992.
 
8
9
 
10
 
11
J. B. Dennis and G. R. Gao. Maximum pipelining of array operations on static data-flow machine. In Int. Conf. on Parallel Processing, page 331pp, August 1983.
 
12
J. B. Dennis and K. S. Weng. An abstract imp~mentation for concurent computation with streams. Int. Conf. on Parallel Processin~ pages 35-45, August 1979.
13
 
14
J. J. Dongarra and A. Hinds. Comparison of the Cray X-MP-4, Fujitsu VP-200, and Hitachi S-810/20. IEEE Compute~ 16:289-323, December 1985.
 
15
G. K. Egan, J. J. Webb, and A. P. W. BShm. Some A~ chitectural Features of the CSIRA C H Data-Flow Compute~ Prent~e-Hall, 1991.
 
16
P. Evripidou and J.-L. Gaudiot. The USC Decoupled Multilevel Data-Flow Execution Model Prent~HaH, 1991.
 
17
J. T. Feo. The Livermore Loops in S~al. Techn~ cal Report UCID-21159, Computing Research Group, Lawrence Livermore National Laborator~ Livermore, CA 94550, August 1987.
 
18
D. Gajski, D. A. Padua, D. J. Kuck, and R. H. Kuhn. A second opinion on Data Flow machines and languages. IEEE Compute~ pages 58-69, 1982.
 
19
 
20
J.-L. Gaudiot and W. A. Najjar. Macro-actor execution on multflevd data-driven architectures. In Prgc. of the IFIP Working Group 10. 3 Working Conference on Parallel Processing, Pisa, ItM~ 1988.
 
21
I. Gottheb. Dynam~ Structured Dataflow:Preserving the Advantages of Sequential Proces~ng in a D~ta Driven Envkonment. In Int. Conf. on Parallel Pl~ocessin~ Chicago, 1988.
 
22
V. G. Grafe and J. E. Hoch. Implementation of the EpsHon dataflow processor. In Proc. of the Twentythird Ann. Hawaii Int. Conf. on System Sciences, pages 19-29, 1990. volume 1.
23
24
 
25
Cray Research Inc. CRAY Y-MP C90 Functional Description Manual. Technical Report HR-04028, Cray Research Inc., March 1992.
 
26
J. McGraw, S. Skedzielewski, S. AHan, R. Oldehoeff, J. Glauert, C. Kirkham, B. Noyce, and R. Thomas. SISAL: Streams and Iteration in a Single Assignment Language: reference manual version 1.2. Manual M- 146, Rev. 1, Lawrence Livermore National Laborator~ Livermore, CA, March 1985.
 
27
F. H. McMahon. Livermore FORTRAN kernds: A computer test of numerical performance range. Techng cM Report UCRL-53745, Lawrence Livermore National Laborator~ Livermore, CA, December 1986.
28
 
29
30
 
31
G. M. Papadopoulos. Implementation of a General- Purpose Dataflow M ulfiprocessor. Technical Report TR-432, MIT Laboratory for Computer Sconce, August 1988.
32
33
 
34
J. R. Rice. Prob~ms to test paral}~ and vector languages. TechnicM Report CSD-TR 516, Purdue Univer~t~ 1985.
 
35
J. R. Price. Problems to test parall~ and vector languages- 2. Technical Report CSD-TR 1016, Purdue University, 1990.
36
37
 
38
S. Sak~i, Y. Yamaguchi, K. Hiraki, Y. Kodama, and T. Yuba. Pipefine optimization of a dataflow machine. In J-L. Gaudiot and L. Bic, editors, Advanced Topics in Data-Flow Computing, pages 225-246. Prentice Hall, 1991.
 
39
S. Sake, Y. Yamaguchi, K. H~aki, and T. Yuba. Introduction of a Strongly-Connected-Arc-Mod~ in ~ Data- Driven Single Chip Processor EMC-R. In Data-flow Workshop, IECE, pages 231-238, 1987. In Japanese.
40
 
41
 
42
S. Sekiguchi, K. Hiraki, and T. Shimadn. Effident vector proces~ng on a Dataflow supercomputer SIGMA-1. In Int. Conf. on Parallel Processin~ 1988.
 
43
B. Smith. The end of architecture (keynote address). In Int. Ann. Syrup. on Computer Architecture, May 1990.
44
45

Collaborative Colleagues:
W. Marcus Miller: colleagues
Walid A. Najjar: colleagues
A. P. Wim Böhm: colleagues