Profiling and mapping of parallel workloads on network processors
Source: Symposium on Applied Computing
Proceedings of the 2005 ACM Symposium on Applied Computing
Santa Fe, New Mexico
SESSION: Embedded systems: applications, solutions and techniques (EMBS)
Pages: 890 - 896  
Year of Publication: 2005
ISBN:1-58113-964-0
Authors
Ning Weng  University of Massachusetts, Amherst, MA
Tilman Wolf  University of Massachusetts, Amherst, MA
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1066677.1066879

ABSTRACT

Network processors are embedded system-on-a-chip multiprocessors that are optimized to perform simple packet processing tasks at data rates of several gigabits per second. To meet the performance demands of increasing link speeds and more complex network applications, network processors are implemented with several dozen processor cores and execute multiple packet processing applications in parallel. The complexity of such systems makes it increasingly difficult for application developers to map applications to the various system resources and achieve optimal performance. We propose an automated profiling and mapping methodology for these highly parallel, embedded systems that starts out with a simple uniprocessor implementation of the networking application. An architecture-independent representation of the runtime behavior of the application is used to map and schedule different processing steps to the underlying hardware. An analytic performance model is used in the process to estimate system performance and to find a near-optimal solution through iteration.
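
The abstract does not specify the mapping algorithm or the performance model, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical profile that gives an instruction count per processing step, a toy analytic model in which the busiest processor bounds throughput, and a simple hill-climbing loop that starts from the uniprocessor mapping. The names `estimate_throughput` and `map_tasks`, the step names, and all cost values are invented for illustration; dependencies between steps and communication costs are ignored.

```python
def estimate_throughput(mapping, costs, num_procs, clock_hz=1.0):
    """Toy analytic model (an assumption): the most heavily loaded
    processor limits how many packets per second the system handles."""
    load = [0] * num_procs
    for task, proc in mapping.items():
        load[proc] += costs[task]
    return clock_hz / max(load)


def map_tasks(costs, num_procs):
    """Iteratively move single processing steps between processors,
    keeping a move only if the estimated throughput strictly improves."""
    # Start from the uniprocessor implementation: every step on processor 0.
    mapping = {task: 0 for task in costs}
    best = estimate_throughput(mapping, costs, num_procs)
    improved = True
    while improved:
        improved = False
        for task in costs:
            for proc in range(num_procs):
                if proc == mapping[task]:
                    continue
                previous = mapping[task]
                mapping[task] = proc
                candidate = estimate_throughput(mapping, costs, num_procs)
                if candidate > best:
                    best = candidate
                    improved = True
                else:
                    mapping[task] = previous  # undo a move that did not help
    return mapping, best


# Hypothetical per-step instruction counts from a profiling run.
profile = {"parse": 40, "lookup": 120, "classify": 80,
           "encrypt": 200, "forward": 60}
mapping, throughput = map_tasks(profile, num_procs=3)
print(mapping, throughput)
```

Hill climbing of this kind can stop at a local optimum, which is one reason a result found through iteration is described as near-optimal rather than optimal.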

