Coherent network interfaces for fine-grain communication
Source International Symposium on Computer Architecture archive
Proceedings of the 23rd annual international symposium on Computer architecture table of contents
Philadelphia, Pennsylvania, United States
Pages: 247 - 258  
Year of Publication: 1996
ISBN:0-89791-786-3
Authors
Shubhendu S. Mukherjee  Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin
Babak Falsafi  Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin
Mark D. Hill  Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin
David A. Wood  Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin
Sponsors
IEEE-CS\TCCA: TC on Computer Architecture
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/232973.232999

ABSTRACT

Historically, processor accesses to memory-mapped device registers have been marked uncachable to ensure their visibility to the device. The ubiquity of snooping cache coherence, however, makes it possible for processors and devices to interact with cachable, coherent memory operations. Using coherence can improve performance by facilitating burst transfers of whole cache blocks and reducing control overheads (e.g., for polling).

This paper begins an exploration of network interfaces (NIs) that use coherence, called coherent network interfaces (CNIs), to improve communication performance. We restrict this study to NI/CNIs that reside on coherent memory or I/O buses, to NI/CNIs that are much simpler than processors, and to the performance of fine-grain messaging from user process to user process.

Our first contribution is to develop and optimize two mechanisms that CNIs use to communicate with processors. A cachable device register, derived from cachable control registers [39,40], is a coherent, cachable block of memory used to transfer status, control, or data between a device and a processor. Cachable queues generalize cachable device registers from one cachable, coherent memory block to a contiguous region of cachable, coherent blocks managed as a circular queue.

Our second contribution is a taxonomy and comparison of four CNIs with a more conventional NI. Microbenchmark results show that CNIs can improve the round-trip latency and achievable bandwidth of a small 64-byte message by 37% and 125% respectively on the memory bus and 74% and 123% respectively on a coherent I/O bus. Experiments with five macrobenchmarks show that CNIs can improve the performance by 17-53% on the memory bus and 30-88% on the I/O bus.
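
The cachable queue described in the abstract (a contiguous region of blocks managed as a circular queue) can be pictured with a small software model. This is only an illustrative sketch, not the paper's design: the class name `CachableQueueModel`, the 64-byte block size, and the explicit `count` field are assumptions made here, and the model ignores cache coherence, bus transactions, and any optimizations the paper develops.

```python
BLOCK_SIZE = 64  # assumed cache-block size in bytes (hypothetical choice)


class CachableQueueModel:
    """Software-only model of a circular queue of fixed-size blocks.

    Hypothetical illustration: a real cachable queue lives in coherent
    memory shared by a processor and a network interface. This model
    only shows the circular head/tail bookkeeping.
    """

    def __init__(self, num_blocks):
        if num_blocks < 1:
            raise ValueError("queue needs at least one block")
        # One contiguous region, carved into num_blocks blocks.
        self.region = bytearray(num_blocks * BLOCK_SIZE)
        self.num_blocks = num_blocks
        self.head = 0   # index of the next block to dequeue
        self.tail = 0   # index of the next block to fill
        self.count = 0  # number of blocks currently occupied

    def enqueue(self, message):
        """Copy one message into the tail block; return False if full."""
        if len(message) > BLOCK_SIZE:
            raise ValueError("message does not fit in one block")
        if self.count == self.num_blocks:
            return False
        start = self.tail * BLOCK_SIZE
        # Pad so that a whole block is always written at once.
        self.region[start:start + BLOCK_SIZE] = message.ljust(BLOCK_SIZE, b"\0")
        self.tail = (self.tail + 1) % self.num_blocks  # wrap around
        self.count += 1
        return True

    def dequeue(self):
        """Return the whole head block as bytes, or None if empty."""
        if self.count == 0:
            return None
        start = self.head * BLOCK_SIZE
        block = bytes(self.region[start:start + BLOCK_SIZE])
        self.head = (self.head + 1) % self.num_blocks  # wrap around
        self.count -= 1
        return block
```

In this model a sender would call `enqueue` with a message of at most one block and a receiver would call `dequeue` to read whole blocks in FIFO order. With a single block, the structure reduces to something like one cachable device register.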


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
 
3
4
5
6
 
7
B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus. Charmm: A program for macromolecular energy, minimization, and dynamics calculation. Journal of Computational Chemistry, 4(187), 1983.
 
8
Doug Burger and Sanjay Mehta. Parallelizing Appbt for a Shared-Memory Multiprocessor. Technical Report 1286, Computer Sciences Department, University of Wisconsin-Madison, September 1995.
9
 
10
 
11
 
12
Fred Chong, Shamik Sharma, Eric Brewer, and Joel Saltz. Multiprocessor Runtime Support for Irregular DAGs. In R. Kalia and P. Vashishta, editors, Toward Teraflop Computing and New Grand Challenge Applications. Nova Science Publishers, Inc., 1995.
13
14
 
15
William J. Dally, Andrew Chien, Stuart Fiske, Waldemar Horwat, John Keen, Michael Larivee, Rich Nuth, Scott Wills, Paul Carrick, and Greg Flyer. The J-Machine: A Fine-Grain Concurrent Computer. In G. X. Ritter, editor, Proc. Information Processing 89. Elsevier North-Holland, Inc., 1989.
 
16
17
 
18
PCI Special Interest Group. PCI Local Bus Specification, Revision 2.1, 1995.
19
20
21
 
22
 
23
MIPS Technologies Inc. MIPS R10000 Microprocessor User's Manual, 1995.
 
24
Sun Microsystems Inc. SPARC MBus Interface Specification, April 1991.
25
26
27
28
29
 
30
 
31
Lok Tin Liu and David E. Culler. Evaluation of the Intel Paragon on Active Message Communication. In Proceedings of Intel Supercomputer Users Group Conference, June 1995.
 
32
 
33
Meiko World Inc. Computing Surface 2: Overview Documentation Set, 1993.
34
35
 
36
Robert W. Pfile. Typhoon-Zero Implementation: The Vortex Module. Technical report, Computer Sciences Department, University of Wisconsin-Madison, 1995.
 
37
Steven K. Reinhardt. Tempest Interface Specification (Revision 1.2.1). Technical Report 1267, Computer Sciences Department, University of Wisconsin-Madison, February 1995.
38
 
39
Steven K. Reinhardt, Robert W. Pfile, and David Wood. Typhoon-0: Hardware Support for Distributed Shared Memory on a Network of Workstations. In Workshop on Scalable Shared-Memory Multiprocessors, 1995.
40
41
 
42
SPARC Technology Business. UltraSPARC-I User's Manual, Revision 1.0, September 1995.
43
 
44
Thinking Machines Corporation. The Connection Machine CM-5 Technical Summary, 1991.
45
 
46
 
47
 
48
