ABSTRACT
Modern storage systems are required to scale to large storage capacities and high I/O throughput in a cost-effective manner. For this reason, they are increasingly being built out of commodity components, mainly PCs equipped with large numbers of disks and interconnected by high-performance system area networks. A main issue in these efforts is achieving high I/O throughput over commodity, low-cost system area networks and commodity operating systems. In this work, we examine in detail the performance of remote block-level storage I/O over commodity, RDMA-capable network interfaces and networks. We examine the support that is required from the network interface to achieve high throughput. We also examine in detail the overheads associated with kernel-level protocols for networked storage access. We find that base system performance is limited by (a) interrupt cost, (b) request size, and (c) protocol message size. We examine the impact of techniques to alleviate these factors and find that our techniques combined can improve throughput by up to 100% over a simpler, unoptimized configuration. Our current prototype achieves a throughput of about 200 MBytes/s over a network that is capable of delivering about 500 MBytes/s. We identify the major limiting factors, which lie mostly on the I/O target side.