ABSTRACT
Despite the fact that large-scale shared-memory multiprocessors have been commercially available for several years, system software that fully utilizes all their features is still not available, mostly due to the complexity and cost of making the required changes to the operating system. A recently proposed approach, called Disco, substantially reduces this development cost by using a virtual machine monitor that leverages the existing operating system technology.

In this paper we present a system called Cellular Disco that extends the Disco work to provide all the advantages of the hardware partitioning and scalable operating system approaches. We argue that Cellular Disco can achieve these benefits at only a small fraction of the development cost of modifying the operating system. Cellular Disco effectively turns a large-scale shared-memory multiprocessor into a virtual cluster that supports fault containment and heterogeneity, while avoiding operating system scalability bottlenecks. Yet at the same time, Cellular Disco preserves the benefits of a shared-memory multiprocessor by implementing dynamic, fine-grained resource sharing, and by allowing users to overcommit resources such as processors and memory. This hybrid approach requires a scalable resource manager that makes local decisions with limited information while still providing good global performance and fault containment.

In this paper we describe our experience with a Cellular Disco prototype on a 32-processor SGI Origin 2000 system. We show that the execution time penalty for this approach is low, typically within 10% of the best available commercial operating system for most workloads, and that it can manage the CPU and memory resources of the machine significantly better than the hardware partitioning approach.