Using Interaction Costs for Microarchitectural Bottleneck Analysis
Source International Symposium on Microarchitecture
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Page: 228  
Year of Publication: 2003
ISBN:0-7695-2043-X
Authors
Brian A. Fields  University of California-Berkeley
Rastislav Bodík  University of California-Berkeley
Mark D. Hill  University of Wisconsin-Madison
Chris J. Newburn  Intel Corporation
Sponsor
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
Publisher
IEEE Computer Society  Washington, DC, USA
ABSTRACT

Attacking bottlenecks in modern processors is difficult because many microarchitectural events overlap with each other. This parallelism makes it difficult to both (a) assign a cost to an event (e.g., to one of two overlapping cache misses) and (b) assign blame for each cycle (e.g., for a cycle where many, overlapping resources are active). This paper introduces a new model for understanding event costs to facilitate processor design and optimization.

First, we observe that everything in a machine (instructions, hardware structures, events) can interact in only one of two ways (in parallel or serially). We quantify these interactions by defining interaction cost, which can be zero (independent, no interaction), positive (parallel), or negative (serial).

Second, we illustrate the value of using interaction costs in processor design and optimization.

Finally, we propose performance-monitoring hardware for measuring interaction costs that is suitable for modern processors.
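The three signs of interaction cost can be illustrated with a small sketch. The abstract does not state the formula, so the code below assumes one common formulation: the cost of a set of events is the execution time saved by idealizing them (setting their latency to zero), and the interaction cost of two events is the cost of idealizing both minus the sum of their individual costs. The toy machine model (execution time as the longest of several dependence paths) and all names (`exec_time`, `cost`, `interaction_cost`, `classify`, the example paths) are hypothetical and chosen only for illustration.

```python
# Toy model (an assumption, not taken from the paper): a program is a set of
# dependence paths, each a list of (event_name, latency_in_cycles) pairs.
# Execution time is the length of the longest path.

def exec_time(paths, idealized=frozenset()):
    """Longest path length, treating idealized events as zero latency."""
    return max(
        sum(0 if name in idealized else latency for name, latency in path)
        for path in paths
    )

def cost(paths, events):
    """Cycles saved by idealizing every event in `events` together."""
    return exec_time(paths) - exec_time(paths, frozenset(events))

def interaction_cost(paths, a, b):
    """Assumed definition: cost({a, b}) - cost({a}) - cost({b})."""
    return cost(paths, {a, b}) - cost(paths, {a}) - cost(paths, {b})

def classify(icost):
    """Map the sign of an interaction cost to the abstract's three cases."""
    if icost > 0:
        return "parallel"
    if icost < 0:
        return "serial"
    return "independent"

# Two cache misses that overlap completely: removing only one saves nothing.
PARALLEL = [[("miss1", 100)], [("miss2", 100)]]

# Two misses in series, overlapped by a 100-cycle chain of other work:
# removing either one exposes the other path, so the savings do not add up.
SERIAL = [[("miss1", 100), ("miss2", 100)], [("other", 100)]]

# Two misses in series with nothing else overlapping: savings simply add.
INDEPENDENT = [[("miss1", 100), ("miss2", 100)]]

for label, paths in [("parallel", PARALLEL), ("serial", SERIAL),
                     ("independent", INDEPENDENT)]:
    ic = interaction_cost(paths, "miss1", "miss2")
    print(label, ic, classify(ic))
```

In this sketch the parallel case yields a positive value because neither miss alone is worth removing while the pair is; the serial case yields a negative value because each miss alone already recovers everything the pair can.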


