High-bandwidth network memory system through virtual pipelines
Source: IEEE/ACM Transactions on Networking (TON)
Volume 17, Issue 4 (August 2009)
Pages: 1029-1041
Year of Publication: 2009
ISSN: 1063-6692
Authors
Banit Agrawal  Computer Science Department, University of California, Santa Barbara, Santa Barbara, CA
Timothy Sherwood  Computer Science Department, University of California, Santa Barbara, Santa Barbara, CA
Publisher
IEEE Press  Piscataway, NJ, USA
DOI Bookmark: 10.1109/TNET.2008.2008646

ABSTRACT

As network bandwidth increases, designing an effective memory system for network processors becomes a significant challenge. The size of the routing tables, the complexity of the packet classification rules, and the amount of packet buffering required all continue to grow rapidly. Relying on large, fast SRAMs alone is unlikely to be scalable or cost-effective. Instead, trends point to the use of low-cost commodity DRAM devices as a means to deliver the worst-case memory performance that network data-plane algorithms demand. While DRAMs can deliver a great deal of throughput, memory banking significantly complicates the worst-case analysis, and specialized algorithms are needed to ensure that specific types of access patterns are conflict-free.

We introduce virtually pipelined memory, an architectural technique that efficiently supports high-bandwidth, uniform-latency memory accesses and high-confidence throughput even under adversarial conditions. Virtual pipelining provides the simple-to-analyze programming model of a deep pipeline (deterministic latencies) on top of a completely different physical implementation (a memory system with banks and probabilistic mapping). This allows designers to decouple the analysis of their algorithms and data structures from the analysis of the memory buses and banks. Unlike specialized hardware customized for a specific data-plane algorithm, our system makes no assumption about the memory access patterns. We present a mathematical argument for our system's ability to provably provide bandwidth with high confidence and demonstrate its functionality and area overhead through a synthesizable design. We further show that, even though our scheme is general-purpose and can support new applications such as packet reassembly, it outperforms state-of-the-art specialized packet buffering architectures.
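
To make the idea of a fixed-latency pipeline over banked memory concrete, here is a minimal toy simulation in Python. It is not the design described in the paper. The bank count, bank busy time, pipeline depth, the choice of a Carter-Wegman style universal hash, and the names make_universal_hash and count_missed_deadlines are all assumptions made for illustration. The model issues one request per cycle, maps each address to a bank, and counts requests that could not finish within the promised pipeline depth.

```python
import random

MERSENNE_PRIME = (1 << 61) - 1  # modulus for the illustrative hash (an assumption)


def make_universal_hash(num_banks, seed):
    """Return a randomly chosen map h(x) = ((a*x + b) mod p) mod num_banks."""
    rng = random.Random(seed)
    a = rng.randrange(1, MERSENNE_PRIME)
    b = rng.randrange(0, MERSENNE_PRIME)

    def bank_of(address):
        return ((a * address + b) % MERSENNE_PRIME) % num_banks

    return bank_of


def count_missed_deadlines(addresses, bank_of, num_banks, bank_busy, pipeline_depth):
    """Issue one request per cycle; count requests not done within pipeline_depth cycles."""
    bank_free_at = [0] * num_banks  # first cycle at which each bank is idle
    missed = 0
    for cycle, address in enumerate(addresses):
        bank = bank_of(address)
        start = max(cycle, bank_free_at[bank])  # wait if the bank is still busy
        finish = start + bank_busy
        bank_free_at[bank] = finish
        if finish > cycle + pipeline_depth:
            missed += 1  # a real design would have to stall here
    return missed


NUM_BANKS = 32   # hypothetical parameters, not taken from the paper
BANK_BUSY = 4
PIPELINE_DEPTH = 64

# Strided pattern: every address lands in bank 0 under plain interleaving.
strided = [i * NUM_BANKS for i in range(100)]

plain = count_missed_deadlines(
    strided, lambda addr: addr % NUM_BANKS, NUM_BANKS, BANK_BUSY, PIPELINE_DEPTH)
hashed = count_missed_deadlines(
    strided, make_universal_hash(NUM_BANKS, seed=1), NUM_BANKS, BANK_BUSY, PIPELINE_DEPTH)

print("plain interleaving misses:", plain)
print("randomized mapping misses:", hashed)
```

Under plain interleaving the strided pattern serializes on a single bank, so the backlog grows until requests exceed the pipeline depth. A randomized mapping makes it hard for a sender who does not know the hash parameters to build such a pattern, which is the intuition behind analyzing the memory as a deep pipeline with a small, quantifiable probability of stalling.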

