ABSTRACT
Threads are the vehicle for concurrency in many approaches to parallel programming. Threads separate the notion of a sequential execution stream from the other aspects of traditional UNIX-like processes, such as address spaces and I/O descriptors. The objective of this separation is to make the expression and control of parallelism sufficiently cheap that the programmer or compiler can exploit even fine-grained parallelism with acceptable overhead.

Threads can be supported either by the operating system kernel or by user-level library code in the application address space, but neither approach has been fully satisfactory. This paper addresses this dilemma. First, we argue that the performance of kernel threads is inherently worse than that of user-level threads, rather than this being an artifact of existing implementations; we thus argue that managing parallelism at the user level is essential to high-performance parallel computing. Next, we argue that the lack of system integration exhibited by user-level threads is a consequence of the lack of kernel support for user-level threads provided by contemporary multiprocessor operating systems; we thus argue that kernel threads or processes, as currently conceived, are the wrong abstraction on which to support user-level management of parallelism. Finally, we describe the design, implementation, and performance of a new kernel interface and user-level thread package that together provide the same functionality as kernel threads without compromising the performance and flexibility advantages of user-level management of parallelism.