| Dynamic run-time architecture techniques for enabling continuous optimization |
| Source
|
Conference On Computing Frontiers
Proceedings of the 2nd conference on Computing frontiers
Ischia, Italy
SESSION: Track 7: compilers and operating systems
Pages: 211 - 220
Year of Publication: 2005
ISBN:1-59593-019-1
|
|
Authors
Tipp Moseley, University of Colorado, Boulder, CO
Alex Shye, University of Colorado, Boulder, CO
Vijay Janapa Reddi, University of Colorado, Boulder, CO
Matthew Iyer, University of Colorado, Boulder, CO
Dan Fay, University of Colorado, Boulder, CO
David Hodgdon, University of Colorado, Boulder, CO
Joshua L. Kihm, University of Colorado, Boulder, CO
Alex Settle, University of Colorado, Boulder, CO
Dirk Grunwald, University of Colorado, Boulder, CO
Daniel A. Connors, University of Colorado, Boulder, CO
| Bibliometrics |
Downloads (6 Weeks): 7, Downloads (12 Months): 38, Citation Count: 2
|
|
|
ABSTRACT
Future computer systems will integrate tens of multithreaded processor cores on a single chip die, resulting in hundreds of concurrent program threads sharing system resources. These designs will be the cornerstone of improving throughput in high-performance computing and server environments. However, to date, appropriate systems software (operating system, run-time system, and compiler) technologies for these emerging machines have not been adequately explored. Future processors will require sophisticated hardware monitoring units that continuously feed resource utilization information back both to the operating system, so that it can make optimal thread co-scheduling decisions, and to software that continuously optimizes the program itself. Nevertheless, in order to continually and automatically adapt system resources to program behaviors and application needs, specific run-time information must be collected to adequately enable dynamic code optimization and operating system scheduling. Generally, run-time optimization is limited by the time required to collect profiles, the time required to perform optimization, and the inherent benefits of any optimization or decisions. Initial techniques for effectively utilizing run-time information for dynamic optimization and informed thread scheduling in future multithreaded architectures are presented.
CITED BY 2
Angela C. Sodan, Garima Gupta, Lin Han, Lun Liu, Benjamin Lafreniere, Time and space adaptation for computational grids with the ATOP-Grid middleware, Future Generation Computer Systems, v.24 n.6, p.561-581, June 2008