ABSTRACT
We develop and evaluate a system for load management in shared-disk file systems built on clusters of heterogeneous computers. The system generalizes load balancing and server provisioning. It balances file metadata workload by moving file sets among cluster server nodes. It also responds to changes in server resources that arise from failure and recovery and from the dynamic addition or removal of servers. The system is adaptive and self-managing. It operates without any a priori knowledge of workload properties or server capabilities. Instead, it continuously tunes load placement using a technique called adaptive, non-uniform (ANU) randomization. ANU randomization realizes the scalability and metadata-reduction benefits of hash-based, randomized placement techniques while avoiding hashing's drawbacks: load skew, inability to cope with heterogeneity, and lack of tunability. Simulation results show that our load-management algorithm performs comparably to a prescient algorithm.
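The abstract does not spell out how ANU randomization works, so the following is only an illustrative sketch of the general idea it names: hash-based placement whose per-server shares are non-uniform and are retuned from observed behavior. Every name here (`unit_hash`, `place`, `retune`, the `step` parameter, and the use of latency as the feedback signal) is an assumption made for illustration, not the authors' algorithm.

```python
import hashlib
from typing import Dict


def unit_hash(name: str) -> float:
    """Map a name to a reproducible point in [0, 1)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def place(file_set: str, shares: Dict[str, float]) -> str:
    """Return the server whose slice of the hash range contains the file set.

    `shares` maps server name to a positive weight; slices are laid out in
    sorted server order, so placement needs no per-file-set lookup table.
    """
    point = unit_hash(file_set) * sum(shares.values())
    upper = 0.0
    for server in sorted(shares):
        upper += shares[server]
        if point < upper:
            return server
    return max(shares)  # guard against floating-point rounding at the top edge


def retune(shares: Dict[str, float], latency: Dict[str, float],
           step: float = 0.5) -> Dict[str, float]:
    """Shrink the share of slower-than-average servers and grow faster ones.

    Assumed feedback rule (hypothetical): scale each share by
    (mean latency / own latency) ** step, then renormalize to sum to 1.
    Latencies must be positive.
    """
    mean = sum(latency.values()) / len(latency)
    raw = {s: shares[s] * (mean / latency[s]) ** step for s in shares}
    total = sum(raw.values())
    return {s: w / total for s, w in raw.items()}
```

In this sketch, a failed or removed server would simply be dropped from `shares`, and a new server would be added with a small initial weight. A plain interval layout like this can move more file sets than strictly necessary when shares change; a production design would likely take more care to limit that movement.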
CITED BY
Yongwei Wu, Likun Liu, Jiayin Mao, Guangwen Yang, Weimin Zheng, An analytical model for performance evaluation in a computational grid, Proceedings of the 2007 Asian technology information program's (ATIP's) 3rd workshop on High performance computing in China: solution approaches to impediments for high performance computing, November 11, 2007, Reno, Nevada