ABSTRACT
This paper describes a rapid design methodology for creating a pipeline of processors to execute streaming applications. The methodology has two phases. The first phase uses a heuristic to rapidly search a large number of processor configurations (configurations differ in the base processor, the additional instructions, and the cache sizes) to find the near-Pareto front. The second phase uses either the same heuristic or an ILP (Integer Linear Programming) formulation to search a smaller design space for an appropriate final implementation. Running the fast heuristic under differing runtime constraints in the first phase rapidly yields the near-Pareto front; the second phase then provides either an optimal or a near-optimal solution. Both the ILP formulation and the heuristic find a system with the smallest area within a designer-specified runtime constraint. The methodology has efficiently explored design spaces with over 10^12 design points. We integrated this design methodology into a commercial design flow and evaluated our approach with three benchmarks (JPEG Encoder, JPEG Decoder, and MP3 Encoder). For each benchmark, the near-Pareto front was found in a few hours using the heuristic, whereas the ILP took several days. The results show that the average area error of the heuristic is within 2.5% of the optimal design points (obtained using ILP) for all benchmarks.
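To make the two objectives concrete, the following minimal sketch shows what a Pareto front over (area, runtime) design points is, and what "smallest area within a runtime constraint" selects. It is an illustration only, not the paper's heuristic or ILP formulation: the `DesignPoint` tuple, the function names, and the sample numbers are all hypothetical, and an exhaustive filter like this would not scale to the design-space sizes discussed above.

```python
from typing import List, Optional, Tuple

# Hypothetical design point: (area, runtime), both in arbitrary units.
DesignPoint = Tuple[float, float]


def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
    """Return the points not dominated in area and runtime (lower is better)."""
    front: List[DesignPoint] = []
    best_runtime = float("inf")
    # Sweep in order of increasing area; a point joins the front only if
    # it is strictly faster than every smaller-area point seen so far.
    for area, runtime in sorted(set(points)):
        if runtime < best_runtime:
            front.append((area, runtime))
            best_runtime = runtime
    return front


def smallest_area_within(
    points: List[DesignPoint], runtime_limit: float
) -> Optional[DesignPoint]:
    """Return the smallest-area point meeting the runtime limit, if any."""
    feasible = [p for p in points if p[1] <= runtime_limit]
    # Tuples compare by area first, so min() picks the smallest area.
    return min(feasible, default=None)


# Made-up sample data for illustration.
sample = [(10, 100), (12, 80), (11, 120), (15, 80), (20, 50)]
front = pareto_front(sample)
choice = smallest_area_within(sample, runtime_limit=90)
```

Sweeping the runtime limit across a range of values and collecting the selected points is one simple way to trace out such a front, which mirrors the idea of running a search under differing runtime constraints.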