| Efficient sorting using registers and caches |
| Full text |
Pdf
(265 KB),
Ps
(269 KB),
Tar
(307 KB)
|
| Source
|
Journal of Experimental Algorithmics (JEA)
archive
Volume 7 , (2002)
table of contents
Page: 9
Year of Publication: 2002
ISSN:1084-6654
|
|
Authors
|
|
Rajiv Wickremesinghe
|
Department of Computer Science, Duke University, Box 90129, Durham, NC
|
|
Lars Arge
|
Department of Computer Science, Duke University, Box 90129, Durham, NC
|
|
Jeffrey S. Chase
|
Department of Computer Science, Duke University, Box 90129, Durham, NC
|
|
Jeffrey Scott Vitter
|
Department of Computer Science, Duke University, Box 90129, Durham, NC
|
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 7, Downloads (12 Months): 61, Citation Count: 4
|
|
|
ABSTRACT
Modern computer systems have increasingly complex memory systems. Common machine models for algorithm analysis do not reflect many of the features of these systems, e.g., large register sets, lockup-free caches, cache hierarchies, associativity, cache line fetching, and streaming behavior. Inadequate models lead to poor algorithmic choices and an incomplete understanding of algorithm behavior on real machines.A key step toward developing better models is to quantify the performance effects of features not reflected in the models. This paper explores the effect of memory system features on sorting performance. We introduce a new cache-conscious sorting algorithm, R-MERGE, which achieves better performance in practice over algorithms that are superior in the theoretical models. R-MERGE is designed to minimize memory stall cycles rather than cache misses by considering features common to many system designs.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
AGGARWAL, A., ALPERN, B., CHANDRA, A. K., AND SNIR, M. 1987. A model for hierarchical memory. Volume 19 (New York, 1987), pp. 305-314.
|
| |
2
|
AGGARWAL, A., CHANDRA, A., AND SNIR, M. 1987. Hierarchical memory with block transfer. Volume 28 (Los Angeles, 1987), pp. 204-216.
|
 |
3
|
|
| |
4
|
ALPERN, B., CARTER, L., FEIG, E., AND SELKER, T. 1994. The uniform memory hierarchy model of computation. 12, 2-3, 72-109.
|
 |
5
|
Jennifer M. Anderson , Lance M. Berc , Jeffrey Dean , Sanjay Ghemawat , Monika R. Henzinger , Shun-Tak A. Leung , Richard L. Sites , Mark T. Vandevoorde , Carl A. Waldspurger , William E. Weihl, Continuous profiling: where have all the cycles gone?, Proceedings of the sixteenth ACM symposium on Operating systems principles, p.1-14, October 05-08, 1997, Saint Malo, France
|
| |
6
|
|
 |
7
|
|
| |
8
|
John H. Edmondson , Paul I. Rubinfeld , Peter J. Bannon , Bradley J. Benschneider , Debra Bernstein , Ruben W. Castelino , Elizabeth M. Cooper , Daniel E. Dever , Dale R. Donchin , Timothy C. Fischer , Anil K. Jain , Shekhar Mehta , Jeanne E. Meyer , Ronald P. Preston , Vidya Rajagopalan , Chandrasekhara Somanathan , Scott A. Taylor , Gilbert M. Wolrich, Internal organization of the Alpha 21164, a 300-MHz 64-bit quad-issue CMOS RISC microprocessor, Digital Technical Journal, v.7 n.1, p.119-135, Jan. 1995
|
| |
9
|
|
| |
10
|
|
| |
11
|
|
| |
12
|
Richard E. Ladner , James D. Fix , Anthony LaMarca, Cache performance analysis of traversals and random accesses, Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, p.613-622, January 17-19, 1999, Baltimore, Maryland, United States
|
 |
13
|
Monica D. Lam , Edward E. Rothberg , Michael E. Wolf, The cache performance and optimizations of blocked algorithms, Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, p.63-74, April 08-11, 1991, Santa Clara, California, United States
|
| |
14
|
LAMARCA, A. AND LADNER, R. E. 1997. The influence of caches on the performance of sorting. Volume 7 (January 1997), pp. 370-379.
|
| |
15
|
|
| |
16
|
RAHMAN, N. AND RAMAN, R. 2000. Adapting radix sort to the memory hierarchy. In ALENEX, Workshop on Algorithm Engineering and Experimentation (2000).
|
| |
17
|
|
| |
18
|
|
| |
19
|
|
| |
20
|
|
 |
21
|
|
| |
22
|
VITTER, J. S. AND SHRIVER, E.A.M. 1994. Algorithms for parallel memory I: Two-level memories. 12, 2-3, 110-147.
|
|