ABSTRACT
Computer systems are resource constrained. Application adaptation is a useful way to optimize system resource usage while satisfying application performance constraints. Previous application adaptation efforts, however, were ad hoc, time-consuming, and highly application-specific, with limited portability between computer systems. In this work, our goal is to provide a development platform for systematically exploring and rigorously applying portable, application-specific run-time optimization. We present OCCAM, a software platform for developing adaptive multicore applications. OCCAM's design-time platform consists of APIs and data structures that allow application developers to specify performance constraints and application-specific optimization techniques. OCCAM's run-time system dynamically manages application behavior and optimizes system resource usage. OCCAM targets emerging Recognition, Mining, and Synthesis (RMS) applications. Using a set of RMS benchmarks, our experimental study demonstrates that OCCAM can successfully optimize resource usage under application performance constraints across a wide range of computer platforms, with an average of 38% energy savings on an Intel Atom-based, energy-constrained portable system and an average of 24% energy savings on a high-performance, dual-core computer platform. These savings are accomplished with low overhead. We have also successfully extended OCCAM applications to run on a 16-core setup.