ABSTRACT
In modern processors, the dynamic translation of virtual addresses to support virtual memory is done before or in parallel with the first-level cache access. As processor technology improves at a rapid pace and the working sets of new applications grow insatiably, the latency and bandwidth demands on the TLB (Translation Lookaside Buffer) are becoming increasingly difficult to meet. The situation is worse in multiprocessor systems, which run larger applications and are plagued by the TLB consistency problem.

We evaluate and compare five options for virtual address translation in the context of COMAs (Cache Only Memory Architectures). The dynamic address translation mechanism can be located after the cache access, provided the cache is virtual. In a particular design, which we call V-COMA for Virtual COMA, the physical address concept and the traditional TLB are eliminated. While still supporting virtual memory, V-COMA reduces the address translation overhead to a minimum.

V-COMA scales well and works better in systems with a large number of processors. As a machine running on virtual addresses, V-COMA provides a simple and consistent hardware model to the operating system and the compiler, in which further optimization opportunities are possible.
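The idea of placing address translation after the cache access can be illustrated with a toy model. The sketch below is not the V-COMA design (which removes physical addresses altogether) and is not taken from the paper; it only shows, under assumed page and line sizes, how a virtually addressed cache needs a translation on misses only. The class name, its fields, and the dictionary-based page table and memory are all hypothetical.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size in bytes (illustrative)
LINE_SIZE = 64    # assumed cache line size in bytes (illustrative)


@dataclass
class VirtualCacheSketch:
    """Toy virtually addressed cache; translation happens only on a miss."""
    page_table: dict                            # hypothetical: virtual page -> physical frame
    lines: dict = field(default_factory=dict)   # virtual line address -> cached value
    translations: int = 0                       # number of translations performed

    def translate(self, vaddr: int) -> int:
        # Page-table lookup standing in for the translation mechanism.
        self.translations += 1
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        return self.page_table[vpn] * PAGE_SIZE + offset

    def load(self, vaddr: int, memory: dict) -> int:
        # The cache is looked up with the virtual line address directly.
        line = vaddr - vaddr % LINE_SIZE
        if line not in self.lines:
            # Miss: only now is the virtual address translated.
            paddr = self.translate(line)
            self.lines[line] = memory.get(paddr, 0)
        return self.lines[line]


memory = {2 * PAGE_SIZE: 7}                      # one value at physical frame 2, offset 0
cache = VirtualCacheSketch(page_table={0: 2})    # virtual page 0 maps to frame 2
first = cache.load(8, memory)                    # miss: one translation
second = cache.load(16, memory)                  # hit on the same line: no translation
```

In this model, hits never touch the page table, which is the sense in which moving translation behind a virtual cache takes it off the critical path of every access.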