ACM Home Page
Please provide us with feedback. Feedback
Phantom-BTB: a virtualized branch target buffer design
Full text PdfPdf (431 KB)
Source
Architectural Support for Programming Languages and Operating Systems archive
Proceeding of the 14th international conference on Architectural support for programming languages and operating systems table of contents
Washington, DC, USA
SESSION: Architectures table of contents
Pages 313-324  
Year of Publication: 2009
ISBN:978-1-60558-406-5
Also published in ...
Authors
Ioana Burcea  University of Toronto, Toronto, Canada
Andreas Moshovos  Unversity of Toronto, Toronto, Canada
Sponsors
SIGPLAN: ACM Special Interest Group on Programming Languages
SIGOPS: ACM Special Interest Group on Operating Systems
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 35,   Downloads (12 Months): 165,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1508244.1508281
What is a DOI?

ABSTRACT

Modern processors use branch target buffers (BTBs) to predict the target address of branches such that they can fetch ahead in the instruction stream increasing concurrency and performance. Ideally, BTBs would be sufficiently large to capture the entire working set of the application and sufficiently small for fast access and practical on-chip dedicated storage. Depending on the application, these requirements are at odds.

This work introduces a BTB design that accommodates large instruction footprints without dedicating expensive onchip resources. In the proposed Phantom-BTB (PBTB) design, a conventional BTB is augmented with a virtual table that collects branch target information as the application runs. The virtual table does not have fixed dedicated storage. Instead, it is transparently allocated, on demand, in the on-chip caches, at cache line granularity. The entries in the virtual table are proactively prefetched and installed in the dedicated conventional BTB, thus, increasing its perceived capacity. Experimental results with commercial workloads under full-system simulation demonstrate that PBTB improves IPC performance over a 1K-entry BTB by 6.9% on average and up to 12.7%, with a storage overhead of only 8%. Overall, the virtualized design performs within 1% of a conventional 4K-entry, single-cycle access BTB, while the dedicated storage is 3.6 times smaller.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
First the tick, now the tock: Next generation Intel microarchitecture (Nehalem). White Paper, Intel Co., 2008.
 
2
 
3
 
4
5
 
6
Ioana Burcea and Andreas Moshovos. Virtualizing branch target buffers. Technical report, University of Toronto, 2008. www.eecg.toronto.edu/~ioana/papers/pbtb tech rep.pdf.
7
8
 
9
 
10
Philip G. Emma, Allan M. Hartstein, Brian R. Prasky, Thomas R. Puzak, Moinuddin K. A. Qureshi, and Vijayalakshmi Srinivasan. Context lookahead storage structures. IBM, U.S. Patent 7337271 B2, 2008.
 
11
12
 
13
R. B. Hilgendorf, G. J. Heim, and W. Rosenstiel. Evaluation of branch-prediction methods on traces from commercial applications. IBM Journal of Research and Development, 43(4), 1999.
 
14
15
16
17
18
 
19
 
20
Ryotaro Kobayashi, Yuji Yamada, Hideki Ando, and Toshio Shimada. A cost-effective branch target buffer with a two-level table organization. In Proceedings of the 2nd International Symposium of Low-Power and High-Speed Chips (COOL Chips II), 1999.
 
21
22
23
 
24
 
25
26
27
 
28
Ed H. Sussenguth. Instruction sequence control. IBM, U.S. Patent 3559183, 1971.
 
29
30

Collaborative Colleagues:
Ioana Burcea: colleagues
Andreas Moshovos: colleagues