ACM Home Page
Please provide us with feedback. Feedback
Leveraging protocol knowledge in slack matching
Full text PdfPdf (251 KB)
Source International Conference on Computer Aided Design archive
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design table of contents
San Jose, California
SESSION: Specification and architecture challenges in high-level synthesis table of contents
Pages: 724 - 729  
Year of Publication: 2006
ISBN ~ ISSN:1092-3152 , 1-59593-389-1
Authors
Girish Venkataramani  Carnegie Mellon University, Pittsburgh, PA
Seth C. Goldstein  Carnegie Mellon University, Pittsburgh, PA
Sponsors
IEEE-CS : Computer Society
IEEE-CAS : Circuits & Systems
SIGDA: ACM Special Interest Group on Design Automation
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 16,   Citation Count: 4
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1233501.1233650
What is a DOI?

ABSTRACT

Stalls, due to mis-matches in communication rates, are a major performance obstacle in pipelined circuits. If the rate of data production is faster than the rate of consumption, the resulting design performs slower than when the communication rate is matched. This can be remedied by inserting pipeline buffers (to temporarily hold data), allowing the producer to proceed if the consumer is not ready to accept data. The problem of deciding which channels need these buffers (and how many) for an arbitrary communication profile is called the slack matching problem; the optimal solution to this problem has been shown to be NP-complete.

In this paper, we present a heuristic that uses knowledge of the communication protocol to explicitly model these bottlenecks, and an iterative algorithm to progressively remove these bottlenecks by inserting buffers. We apply this algorithm to asynchronous circuits, and show that it naturally handles large designs with arbitrarily cyclic and acyclic topologies, which exhibit various types of control choice. The heuristic is efficient, achieving linear time complexity in practice, and produces solutions that (a) achieve up to 60% performance speedup on large media processing kernels, and (b) can either be verified to be optimal, or the approximation margin can be bounded.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
3
 
4
 
5
 
6
7
 
8
9
 
10
B. Taskin and I. Kourtev. Delay insertion method in clock skew scheduling. IEEE TCAD, 25--4:651--663, 2006.
 
11
G. Venkataramani, M. Budiu, et al. C to asynchronous dataflow circuits: An end-to-end toolflow. In IWLS, June 2004.
 
12
G. Venkataramani, T. Chelcea, et al. Modeling the global critical path in concurrent systems. Technical Report CMU-CS-06-144, Carnegie Mellon University, August 2006.
 
13


Collaborative Colleagues:
Girish Venkataramani: colleagues
Seth C. Goldstein: colleagues