ABSTRACT
In this paper, we (i) quantify the impact of compiler-directed I/O prefetching on shared caches at I/O nodes; the experimental data show that, while I/O prefetching brings some benefits, its effectiveness drops significantly as the number of clients (compute nodes) increases; (ii) identify inter-client misses caused by harmful I/O prefetches as one of the main sources of this performance reduction; and (iii) propose and experimentally evaluate prefetch throttling and data pinning schemes to improve the performance of I/O prefetching. Prefetch throttling prevents one or more clients from issuing further prefetches if those prefetches are predicted to be harmful, i.e., to evict from the memory cache useful data accessed by other clients. Data pinning, on the other hand, makes selected data blocks immune to harmful prefetches by pinning them in the memory cache. We show that the two schemes can be applied in isolation or in combination, at either a coarse or a fine granularity. Our experiments with these two optimizations on four disk-intensive applications reveal that, with 8 clients, they improve performance by 9.7% on average over standard compiler-directed I/O prefetching and by 15.1% on average over the no-prefetch case.
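The two mechanisms named above can be illustrated with a small sketch. This is not the paper's implementation: the class `SharedCache`, the LRU replacement policy, the per-client harm counter, and the `harm_threshold` parameter are all assumptions made for illustration. The sketch treats a prefetch as harmful when it evicts a block owned by another client and that block is later missed, throttles a client once its harmful-prefetch count reaches the threshold, and lets pinned blocks survive prefetch-induced evictions.

```python
from collections import OrderedDict


class SharedCache:
    """Toy LRU cache shared by several clients at an I/O node (illustrative only)."""

    def __init__(self, capacity, harm_threshold=2):
        self.capacity = capacity
        self.harm_threshold = harm_threshold  # hypothetical throttling knob
        self.blocks = OrderedDict()           # block id -> owning client, in LRU order
        self.pinned = set()                   # blocks immune to prefetch evictions
        self.evicted_by = {}                  # evicted block -> client whose prefetch evicted it
        self.harm = {}                        # client -> number of harmful prefetches

    def pin(self, block):
        """Data pinning: protect a cached block from prefetch-induced eviction."""
        if block in self.blocks:
            self.pinned.add(block)

    def throttled(self, client):
        """Prefetch throttling: True once a client has caused too many harmful prefetches."""
        return self.harm.get(client, 0) >= self.harm_threshold

    def _insert(self, block, client, is_prefetch):
        self.evicted_by.pop(block, None)      # block is cached again; stale blame is dropped
        if len(self.blocks) >= self.capacity:
            # Pick the least recently used block; prefetches must skip pinned blocks.
            victim = next((b for b in self.blocks
                           if not (is_prefetch and b in self.pinned)), None)
            if victim is None:
                return False                  # every block is pinned: drop the prefetch
            owner = self.blocks.pop(victim)
            self.pinned.discard(victim)
            if is_prefetch and owner != client:
                self.evicted_by[victim] = client  # candidate harmful prefetch
        self.blocks[block] = client
        return True

    def prefetch(self, client, block):
        """Issue a prefetch; returns False if it was throttled, redundant, or dropped."""
        if self.throttled(client) or block in self.blocks:
            return False
        return self._insert(block, client, is_prefetch=True)

    def access(self, client, block):
        """Demand access; returns True on a hit and False on a miss."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        culprit = self.evicted_by.pop(block, None)
        if culprit is not None and culprit != client:
            # Inter-client miss: another client's prefetch evicted this block.
            self.harm[culprit] = self.harm.get(culprit, 0) + 1
        self._insert(block, client, is_prefetch=False)
        return False
```

In this sketch, throttling reacts to misses that have already been observed, whereas the abstract describes throttling prefetches that are predicted to be harmful; a predictor could replace the simple counter. Granularity is likewise simplified here to per-client throttling and per-block pinning.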