ABSTRACT
As network bandwidth increases, designing an effective memory system for network processors becomes a significant challenge. The size of the routing tables, the complexity of the packet classification rules, and the amount of packet buffering required all continue to grow at a staggering rate. Relying on large, fast SRAMs alone is unlikely to be scalable or cost-effective. Instead, trends point to the use of low-cost commodity DRAM devices as a means to deliver the worst-case memory performance that network data-plane algorithms demand. While DRAMs can deliver a great deal of throughput, memory banking significantly complicates the worst-case analysis, and specialized algorithms are needed to ensure that specific types of access patterns are conflict-free. We introduce virtually pipelined memory, an architectural technique that efficiently supports high-bandwidth, uniform-latency memory accesses and high-confidence throughput even under adversarial conditions. Virtual pipelining provides a simple-to-analyze programming model of a deep pipeline (deterministic latencies) with a completely different physical implementation (a memory system with banks and probabilistic mapping). This allows designers to effectively decouple the analysis of their algorithms and data structures from the analysis of the memory buses and banks. Unlike specialized hardware customized for a specific data-plane algorithm, our system makes no assumption about the memory access patterns. We present a mathematical argument for our system's ability to provably provide bandwidth with high confidence, and we demonstrate its functionality and area overhead through a synthesizable design. We further show that, even though our scheme is general-purpose enough to support new applications such as packet reassembly, it outperforms the state of the art in specialized packet buffering architectures.
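The idea of a fixed-latency pipeline model layered over banked memory with a randomized address-to-bank mapping can be illustrated with a toy simulation. The sketch below is not the design described in the paper; it only assumes one request per cycle, a Carter-Wegman style hash for bank selection, and a simple rule that counts a stall whenever a request cannot finish within the virtual pipeline depth. The names `make_bank_hash`, `simulate`, `bank_busy`, and `pipeline_depth` are hypothetical and chosen for illustration.

```python
import random

# A large prime for the hash family (assumption: addresses are smaller than P).
P = (1 << 61) - 1


def make_bank_hash(num_banks, rng):
    """Pick one member of a Carter-Wegman style hash family at random."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)

    def bank_of(addr):
        return ((a * addr + b) % P) % num_banks

    return bank_of


def simulate(addresses, num_banks=32, bank_busy=4, pipeline_depth=64, seed=0):
    """Count requests that miss the fixed deadline of the virtual pipeline.

    One request is issued per cycle. Each access occupies its bank for
    `bank_busy` cycles, and every reply is due exactly `pipeline_depth`
    cycles after issue. A request that cannot finish in time is counted
    as a stall and dropped (a simplifying assumption of this sketch).
    """
    rng = random.Random(seed)
    bank_of = make_bank_hash(num_banks, rng)
    free_at = [0] * num_banks  # cycle at which each bank next becomes idle
    stalls = 0
    for cycle, addr in enumerate(addresses):
        bank = bank_of(addr)
        start = max(cycle, free_at[bank])  # wait if the bank is still busy
        done = start + bank_busy
        if done > cycle + pipeline_depth:
            stalls += 1  # deadline missed: the pipeline would have to stall
            continue
        free_at[bank] = done
    return stalls


if __name__ == "__main__":
    # Example run: a strided access pattern across 32 banks.
    print(simulate(range(0, 100_000, 8)))
```

Because the hash parameters are drawn at random and hidden from the requester, the hope in a scheme of this kind is that no fixed access pattern can reliably target a single bank; varying `num_banks`, `bank_busy`, and `pipeline_depth` in the sketch shows how the stall count responds to each.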
INDEX TERMS
Primary Classification:
B. Hardware
  B.3 MEMORY STRUCTURES
    B.3.1 Semiconductor Memories
      Subjects: Dynamic memory (DRAM)
Additional Classification:
C. Computer Systems Organization
  C.2 COMPUTER-COMMUNICATION NETWORKS
    C.2.0 General
      Subjects: Data communications
    C.2.m Miscellaneous
D. Software
  D.4 OPERATING SYSTEMS
    D.4.2 Storage Management
      Subjects: Allocation/deallocation strategies; Virtual memory
General Terms: Algorithms, Design, Management
Keywords: DRAM, MTS, VPNM, bank conflicts, mean time to stall, memory, memory controller, network, packet buffering, packet reassembly, universal hashing, virtual pipeline