ACM Home Page
Please provide us with feedback. Feedback
A super scalar sort algorithm for RISC processors
Full text PdfPdf (806 KB)
Source International Conference on Management of Data archive
Proceedings of the 1996 ACM SIGMOD international conference on Management of data table of contents
Montreal, Quebec, Canada
Pages: 240 - 246  
Year of Publication: 1996
ISBN:0-89791-794-4
Also published in ...
Author
Ramesh C. Agarwal  IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY
Sponsors
SIGART: ACM Special Interest Group on Artificial Intelligence
SIGMOD: ACM Special Interest Group on Management of Data
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 39,   Citation Count: 15
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/233269.233336
What is a DOI?

ABSTRACT

The compare and branch sequences required in a traditional sort algorithm can not efficiently exploit multiple execution units present in currently available high performance RISC processors. This is because of the long latency of the compare instructions and the sequential algorithm used in sorting. With the increased level of integration on a chip, this trend is expected to continue. We have developed new sort algorithms which eliminate almost all the compares, provide functional parallelism which can be exploited by multiple execution units, significantly reduce the number of passes through keys, and improve data locality. These new algorithms outperform traditional sort algorithms by a large factor.For the Datamation disk to disk sort benchmark (one million 100-byte records), at SIGMOD'94, Chris Nyberg et al presented several new performance records using DEC alpha processor based systems.We have implemented the Datamation sort benchmark using our new sort algorithm on a desktop IBM RS/6000 model 39H (66.6 MHz) with 8 IBM SSA 7133 disk drives (total cost $73K). The total elapsed time for the 100 MB sort was 5.1 seconds (vs the old uni-processor record of 9.1 seconds). We have also established a new price performance record (0.2¢ vs the old record of 0.9¢, as the cost of the sort). The entire sort processing was overlapped with I/O. During the read phase, we achieved a sustained BW of 47 MB/sec and during the write phase, we achieved a sustained BW of 39 MB/sec. Key extraction and sorting of one million 10-byte keys took only 0.6 second of CPU time. The rest of the CPU time was used in moving records, servicing I/O, and other overheads.Algorithmic details leading to this level of performance are described in this paper. A detailed analysis of the CPU time spent during various phases of the sort algorithm and I/O is also provided.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
4
 
5
 
6
 
7
 
8
Tsukerman, A., "FastSort - An External Sort Using Parallel Processing", Proc. SIGMOD 1990, pp. 88- 101.
9
 
10
 
11
SGI Media Brief dated May 22, 1995.
 
12
Barley, D., Barszcz, E., Barton, J. Browning, D., Carter, R., Dagum, L., Fatoohi, R., Fineberg, S., Frederickson, P., Lasinski, T., Schereiber, P~., Simon, H., Venkatakrishnan, V., Weerantunga, S., The NAS Parallel Benchmarks, Technical Report RNR-94-007, NASA Ames Research Center, March, 1994.
 
13
Agarwal R. C., Gustavson F. G., Zubair M., "A Scalable Parallel Implementation of the NAS Integer Sort Benchmark", Proc. int. Workshop on Parallel Processing, Bangalore, India, pp. 463-477, December, 1994.
 
14
 
15

CITED BY  15