|
ABSTRACT
We report the performance of NOW-Sort, a collection of sorting implementations on a Network of Workstations (NOW). We find that parallel sorting on a NOW is competitive to sorting on the large-scale SMPs that have traditionally held the performance records. On a 64-node cluster, we sort 6.0 GB in just under one minute, while a 32-node cluster finishes the Datamation benchmark in 2.41 seconds.
Our implementations can be applied to a variety of disk, memory, and processor configurations; we highlight salient issues for tuning each component of the system. We evaluate the use of commodity operating systems and hardware for parallel sorting. We find existing OS primitives for memory management and file access adequate. Due to aggregate communication and disk bandwidth requirements, the bottleneck of our system is the workstation I/O bus.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
|
 |
3
|
Remzi H. Arpaci , Andrea C. Dusseau , Amin M. Vahdat , Lok T. Liu , Thomas E. Anderson , David A. Patterson, The interaction of parallel and sequential workloads on a network of workstations, Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems, p.267-278, May 15-19, 1995, Ottawa, Ontario, Canada
|
 |
4
|
|
 |
5
|
T. von Eicken , A. Basu , V. Buch , W. Vogels, U-Net: a user-level network interface for parallel and distributed computing (includes URL), Proceedings of the fifteenth ACM symposium on Operating systems principles, p.40-53, December 03-06, 1995, Copper Mountain, Colorado, United States
|
| |
6
|
|
| |
7
|
|
 |
8
|
Guy E. Blelloch , Charles E. Leiserson , Bruce M. Maggs , C. Greg Plaxton , Stephen J. Smith , Marco Zagha, A comparison of sorting algorithms for the connection machine CM-2, Proceedings of the third annual ACM symposium on Parallel algorithms and architectures, p.3-16, July 21-24, 1991, Hilton Head, South Carolina, United States
[doi> 10.1145/113379.113380]
|
 |
9
|
M. A. Blumrich , K. Li , R. Alpert , C. Dubnicki , E. W. Felten , J. Sandberg, Virtual memory mapped network interface for the SHRIMP multicomputer, Proceedings of the 21ST annual international symposium on Computer architecture, p.142-153, April 18-21, 1994, Chicago, Illinois, United States
|
| |
10
|
Nanette J. Boden , Danny Cohen , Robert E. Felderman , Alan E. Kulawik , Charles L. Seitz , Jakov N. Seizovic , Wen-King Su, Myrinet: A Gigabit-per-Second Local Area Network, IEEE Micro, v.15 n.1, p.29-36, February 1995
[doi> 10.1109/40.342015]
|
| |
11
|
H. Boral , W. Alexander , L. Clay , G. Copeland , S. Danforth , M. Franklin , B. Hart , M. Smith , P. Valduriez, Prototyping Bubba, A Highly Parallel Database System, IEEE Transactions on Knowledge and Data Engineering, v.2 n.1, p.4-24, March 1990
[doi> 10.1109/69.50903]
|
 |
12
|
A. Krishnamurthy , D. E. Culler , A. Dusseau , S. C. Goldstein , S. Lumetta , T. von Eicken , K. Yelick, Parallel programming in Split-C, Proceedings of the 1993 ACM/IEEE conference on Supercomputing, p.262-273, December 1993, Portland, Oregon, United States
[doi> 10.1145/169627.169724]
|
| |
13
|
|
| |
14
|
D. J. Dewitt , S. Ghandeharizadeh , D. A. Schneider , A. Bricker , H. -I. Hsiao , R. Rasmussen, The Gamma Database Machine Project, IEEE Transactions on Knowledge and Data Engineering, v.2 n.1, p.44-62, March 1990
[doi> 10.1109/69.50905]
|
| |
15
|
|
| |
16
|
|
 |
17
|
|
 |
18
|
|
| |
19
|
D. P. Ghormley, D. Petrou, A. M. Vahdat, and T. E. Anderson. GLUnix: A Global Layer Unix for NOW. http://now.cs.berkeley.edu/Glunix/glunix.html.
|
| |
20
|
G. Graefe. Volcano: An Extensible and Parallel Dataflow Query Processing System. Technical report, Oregon Graudate Center, June 1989.
|
| |
21
|
G. Graefe. Parallel External Sorting in Volcano. Technical Report CU-CS-459, Computer Science, University of Colorado at Boulder, June 1990.
|
 |
22
|
|
| |
23
|
C. A. R. Hoare. Quicksort. Computer Journal, 5( 1): 10-15,1962.
|
| |
24
|
Steven Kleiman , Jim Voll , Joe Eykholt , Anil Shivalingiah , Dock Williams, Symmetric multiprocessing in Solaris 2.0, Proceedings of the thirty-seventh international conference on COMPCON, p.181-186, January 1992, San Francisco, California, United States
|
 |
25
|
S. J. Smith , X. Li , G. Linoff , C. Stanfill , K. Thearling, A practical external sort for shared disk MPP's, Proceedings of the 1993 ACM/IEEE conference on Supercomputing, p.666-675, December 1993, Portland, Oregon, United States
[doi> 10.1145/169627.169815]
|
 |
26
|
Chris Nyberg , Tom Barclay , Zarka Cvetanovic , Jim Gray , Dave Lomet, AlphaSort: a RISC machine sort, Proceedings of the 1994 ACM SIGMOD international conference on Management of data, p.233-242, May 24-27, 1994, Minneapolis, Minnesota, United States
|
 |
27
|
Betty Salzberg , Alex Tsukerman , Jim Gray , Michael Stuewart , Susan Uren , Bonnie Vaughan, FastSort: a distributed single-input single-output external sort, ACM SIGMOD Record, v.19 n.2, p.94-101, Jun. 1990
|
 |
28
|
|
| |
29
|
M. Stonebraker. The Case for Shared Nothing. Database Engineering, 9(1 ), 1986.
|
| |
30
|
A. Sweeney, D. Doucette, W. Hu, C. Anderson, M. Nishimoto, and G. Peck. Scalability in the XFS File System. in Proceedings of the USENIX 1996 Annual Technical Conference, Jan. 1996.
|
 |
31
|
|
| |
32
|
Teradata Corporation. DBC/IO12 Data Base Computer System Manual, release 2.0 edition, Nov. 1985. Document Number c 10- 0001-02.
|
 |
33
|
Thorsten von Eicken , David E. Culler , Seth Copen Goldstein , Klaus Erik Schauser, Active messages: a mechanism for integrated communication and computation, Proceedings of the 19th annual international symposium on Computer architecture, p.256-266, May 19-21, 1992, Queensland, Australia
|
| |
34
|
|
 |
35
|
|
| |
36
|
|
CITED BY 29
|
|
|
|
|
Andrea C. Arpaci-Dusseau , Remzi H. Arpaci-Dusseau , David E. Culler , Joseph M. Hellerstein , David A. Patterson, Searching for the sorting record: experiences in tuning NOW-Sort, Proceedings of the SIGMETRICS symposium on Parallel and distributed tools, p.124-133, August 03-04, 1998, Welches, Oregon, United States
|
|
|
|
|
|
|
|
|
|
|
|
Yossi Matias , Eran Segal , Jeffrey Scott Vitter, Efficient bundle sorting, Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, p.839-848, January 09-11, 2000, San Francisco, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
Richard P. Martin , Amin M. Vahdat , David E. Culler , Thomas E. Anderson, Effects of communication latency, overhead, and bandwidth in a cluster architecture, ACM SIGARCH Computer Architecture News, v.25 n.2, p.85-97, May 1997
|
|
|
Remzi H. Arpaci-Dusseau , Eric Anderson , Noah Treuhaft , David E. Culler , Joseph M. Hellerstein , David Patterson , Kathy Yelick, Cluster I/O with River: making the fast case common, Proceedings of the sixth workshop on I/O in parallel and distributed systems, p.10-22, May 05-05, 1999, Atlanta, Georgia, United States
|
|
|
Nathan Thomas , Gabriel Tanase , Olga Tkachyshyn , Jack Perdue , Nancy M. Amato , Lawrence Rauchwerger, A framework for adaptive algorithm selection in STAPL, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, June 15-17, 2005, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Naga Govindaraju , Jim Gray , Ritesh Kumar , Dinesh Manocha, GPUTeraSort: high performance graphics co-processor sorting for large database management, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|