ABSTRACT
Programming network processors is challenging. To sustain high line rates, network processors have extremely tight memory access and instruction budgets. Achieving desired performance has traditionally required hand-coded assembly. Researchers have recently proposed high-level programming languages for packet processing, but the challenges of compiling these languages into code that is competitive with hand-tuned assembly remain unanswered.

This paper describes the Shangri-La compiler, which accepts a packet program written in a C-like high-level language and applies scalar and specialized optimizations to generate a highly optimized binary. Hot code paths identified by profiling are mapped across processing elements to maximize processor utilization. Since our compilation target has no hardware caches, software-controlled caches are generated for frequently accessed application data structures. Packet handling optimizations significantly reduce per-packet memory access and instruction counts. Finally, a custom stack model maps stack frames to the fastest levels of the target processor's heterogeneous memory hierarchy.

Binaries generated by the compiler were evaluated on the Intel IXP2400 network processor with eight packet processing cores and eight threads per core. Our results show the importance of both traditional and specialized optimization techniques for achieving the maximum forwarding rates on three network applications, L3-Switch, MPLS and Firewall.
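
As a rough illustration of what a software-controlled cache for a frequently accessed data structure might look like, the sketch below places a small direct-mapped cache in front of a slow lookup table. It is not taken from the paper: the class name `SoftwareCache`, the `routes` table, the line count, and the direct-mapped placement policy are all assumptions made for this example, and it ignores writes and invalidation entirely.

```python
class SoftwareCache:
    """Direct-mapped, software-managed cache in front of a slow lookup table.

    Illustrative only: names and policy are assumptions, not the compiler's design.
    """

    def __init__(self, backing_table, num_lines=4):
        self.backing_table = backing_table   # stands in for slow off-chip memory
        self.num_lines = num_lines
        self.tags = [None] * num_lines       # key currently held by each line
        self.values = [None] * num_lines
        self.slow_reads = 0                  # counts accesses to the backing table

    def lookup(self, key):
        line = key % self.num_lines          # direct-mapped placement
        if self.tags[line] == key:           # hit: served from fast memory
            return self.values[line]
        self.slow_reads += 1                 # miss: fetch from slow memory, fill line
        value = self.backing_table[key]
        self.tags[line] = key
        self.values[line] = value
        return value


# Hypothetical next-hop table keyed by a small integer route index.
routes = {i: f"port{i % 3}" for i in range(16)}
cache = SoftwareCache(routes, num_lines=4)
results = [cache.lookup(k) for k in (5, 5, 5, 9, 5)]
```

Here keys 5 and 9 map to the same line, so the final lookup of 5 misses again after 9 evicts it. Repeated lookups of a hot key avoid the slow table, which is the effect the abstract attributes to caching frequently accessed structures.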