ACM Home Page
Please provide us with feedback. Feedback
Fast enumeration of maximal valid subgraphs for custom-instruction identification
Full text PdfPdf (615 KB)
Source
International Conference on Compilers, Architecture and Synthesis for Embedded Systems archive
Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems table of contents
Grenoble, France
SESSION: Compiler techniques for performance table of contents
Pages 29-36  
Year of Publication: 2009
ISBN:978-1-60558-626-7
Authors
Tao Li  National University of Defense Technology, Changsha, Hunan, China
Zhigang Sun  National University of Defense Technology, Changsha, Hunan, China
Wu Jigang  Nanyang Technological University, Singapore, Singapore
Xicheng Lu  National University of Defense Technology, Changsha, Hunan, China
Sponsors
SIGDA: ACM Special Interest Group on Design Automation
ACM: Association for Computing Machinery
SIGBED: ACM Special Interest Group on Embedded Systems
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 10,   Citation Count: 0
Additional Information:

abstract   references   index terms  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1629395.1629402
What is a DOI?

ABSTRACT

Extensible processors are increasingly becoming popular as they allow for incorporating custom instructions to meet design constraints. However, identifying custom instructions under architectural input/output ports constraint is a time consuming process particularly when large applications are considered. To rapidly identify the most profitable custom instructions with large inputs and outputs, this paper proposes a novel identification algorithm for enumerating maximal convex subgraphs containing no invalid node (i.e., maximal valid subgraphs). The proposed enumerating strategy is based on divide-and-conquer with a top-down manner, rather than the bottom-up manner utilized in the state-of-the-art. The division operation only considers invalid inner nodes of the given DFG, rather than taking all the invalid nodes into account, and thus accelerates enumeration of the maximal valid subgraphs. Experimental results show that, the improvement over the latest work is more than 90% for 60% DFG instances of the acknowledged benchmarks.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
N. Clark, H. Zhong, and S. Mahlke. Processor acceleration through automated instruction set customization. in MICRO, pages 129--140, 2003.
 
2
B. Kastrup, A. Bink, and J. Hoogerbrugge. Concise: A compiler-driven cpld-based instruction set accelerator. Field-Programmable Custom Computing Machines, Annual IEEE Symposium on, pages 92--101, 1999.
 
3
R. E. Gonzalez. Xtensa: A configurable and extensible processor. IEEE Micro, 20(2):60--70, 2000.
 
4
T.R. Halfhill. ARC cores encourages "plug-ins". Microprocessor Report, Jun.19, 2000.
 
5
Altera. Nios embedded processor system development. http://www.altera.com/.
 
6
P. Ienne and R. Leupers. Customizable Embedded Processors-Design Technologies and Applications. Morgan Kaufmann, San Mateo, Calif., 2006.
 
7
N. Clark, A. Hormati, S. Mahlke, and S. Yehia. Scalable subgraph mapping for acyclic computation accelerators. In Proceedings of the International Conference on Compilers, Architectures, and Synthesis for Embedded Systems, Seoul, South Korea, 2006.
 
8
P. Yu. Scalable custom instructions identification for instruction-set extensible processors. In CASES, pages 69--78. ACM Press, 2004.
 
9
L. Pozzi, K. Atasu, and P. Ienne. Exact and approximate algorithms for the extension of embedded processor instruction sets. IEEE Trans. on CAD of Integrated Circuits and Systems, 25(7):1209--1229, 2006.
 
10
X. Chen, D. L. Maskell, and Y. Sun. Fast identification of custom instructions for extensible processors. IEEE Trans. on CAD of Integrated Circuits and Systems, 26(2):359--368, 2007.
 
11
O. M. W. L. C. -- O. K. Atasu, R. G. Dimond and G. Dundar. Optimizing instruction-set extensible processors under data bandwidth constraints. In Proceedings of the Design, Automation and Test in Europe Conference, 2007, pages 588--593, 2007.
 
12
A. K. Verma, P. Brisk, and P. Ienne. Rethinking custom ISE identification: a new processor-agnostic method. In CASES '07: Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems, pages 125--134, New York, NY, USA, 2007. ACM.
 
13
J. Cong, Y. Fan, G. Han, A. Jagannathan, G. Reinman, and Z. Zhang. Instruction set extension with shadow registers for configurable processors. In FPGA '05: Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays, pages 99--106, New York, NY, USA, 2005. ACM.
 
14
L. Pozzi and P. Ienne. Exploiting pipelining to relax register-file port constraints of instruction-set extensions. In CASES '05: Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems, pages 2--10, New York, NY, USA, 2005. ACM.
 
15
N. Pothineni, A. Kumar, and K. Paul. Application specific datapath extension with distributed i/o functional units. In VLSID '07: Proceedings of the 20th International Conference on VLSI Design held jointly with 6th International Conference, pages 551--558, Washington, DC, USA, 2007. IEEE Computer Society.
 
16
K. Atasu, O. Mencer, W. Luk, C. Ozturan, and G. Dundar. Fast custom instruction identification by convex subgraph enumeration. In ASAP '08: Proceedings of the 2008 International Conference on Application-Specific Systems, Architectures and Processors, pages 1--6, Washington, DC, USA, 2008. IEEE Computer Society.
 
17
C. Lee, M. Potkonjak, and W. H. Mangione-smith. Mediabench: A tool for evaluating and synthesizing multimedia and communications systems. In International Symposium on Microarchitecture, pages 330--335, 1997.
 
18
M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown. Mibench: A free, commercially representative embedded benchmark suite. In WWC '01: Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop, pages 3--14, Washington, DC, USA, 2001. IEEE Computer Society.
 
19
L. N. Chakrapani, J. Gyllenhaal, W. mei W. Hwu, S. A. Mahlke, K. V. Palem, and R. M. Rabbah. Trimaran: An infrastructure for research in instruction-level parallelism. In Instruction-level Parallelism. LNCS, page 32--41, 2004.
 
20
C. Bron and J. Kerbosch. Algorithm 457: finding all cliques of an undirected graph. Commun. ACM, 16(9):575--577, 1973.