An experimental study of sorting and branch prediction
Source: Journal of Experimental Algorithmics (JEA), Volume 12 (June 2008)
Section 1: Regular Papers
Article No. 1.8
Year of Publication: 2008
ISSN: 1084-6654
Authors
Paul Biggar  Trinity College Dublin, Ireland
Nicholas Nash  Trinity College Dublin, Ireland
Kevin Williams  Trinity College Dublin, Ireland
David Gregg  Trinity College Dublin, Ireland
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 30,   Downloads (12 Months): 258,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1227161.1370599

ABSTRACT

Sorting is one of the most important and well-studied problems in computer science. Many good algorithms are known that offer various trade-offs in efficiency, simplicity, memory use, and other factors. However, these algorithms do not take into account features of modern computer architectures that significantly influence performance. Caches and branch predictors are two such features and, while there has been a significant amount of research into the cache performance of general-purpose sorting algorithms, there has been little research on their branch prediction properties. In this paper, we empirically examine the behavior of the branches in all the most common sorting algorithms. We also consider the effect of cache optimization on the predictability of the branches in these algorithms. We find that insertion sort has the fewest branch mispredictions of any comparison-based sorting algorithm, that bubble and shaker sort operate in a fashion that makes their branches highly unpredictable, that the unpredictability of shellsort's branches improves its caching behavior, and that several cache optimizations have little effect on mergesort's branch mispredictions. We also find that optimizations to quicksort, for example the choice of pivot, have a strong influence on the predictability of its branches. We point out a simple way of removing branch instructions from a classic heapsort implementation and also show that unrolling a loop in a cache-optimized heapsort implementation improves the predictability of its branches. Finally, we note that when sorting random data, two-level adaptive branch predictors are usually no better than simpler bimodal predictors. This is despite the fact that, in general, two-level adaptive predictors are almost always superior to bimodal predictors.
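To make the comparison between insertion sort and bubble sort more concrete, the following is a minimal sketch of how comparison-branch mispredictions could be counted under a toy bimodal predictor. It is an illustration under stated assumptions, not the measurement setup used in the paper: the predictor is modelled as one 2-bit saturating counter per branch site with no table aliasing, only the data-dependent comparison branch is recorded (loop-control branches are ignored), and all names (`BimodalPredictor`, `count_mispredictions`, `make_random_data`) are hypothetical.

```python
import random


class BimodalPredictor:
    """Toy bimodal predictor: one 2-bit saturating counter per branch site.

    Counter values 0-1 predict "not taken", 2-3 predict "taken".
    Assumption: no aliasing between sites, unlike a real finite table.
    """

    def __init__(self):
        self.counters = {}
        self.executed = 0
        self.mispredicted = 0

    def branch(self, site, taken):
        counter = self.counters.get(site, 1)  # start at "weakly not taken"
        if (counter >= 2) != taken:
            self.mispredicted += 1
        self.executed += 1
        # Saturating update towards the actual outcome.
        self.counters[site] = min(3, counter + 1) if taken else max(0, counter - 1)
        return taken


def insertion_sort(a, bp):
    for i in range(1, len(a)):
        item = a[i]
        j = i - 1
        # Only the comparison a[j] > item is recorded; the bounds check is not.
        while j >= 0 and bp.branch("insertion.compare", a[j] > item):
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = item


def bubble_sort(a, bp):
    for end in range(len(a) - 1, 0, -1):
        for j in range(end):
            if bp.branch("bubble.compare", a[j] > a[j + 1]):
                a[j], a[j + 1] = a[j + 1], a[j]


def make_random_data(n, seed):
    rng = random.Random(seed)
    data = list(range(n))
    rng.shuffle(data)
    return data


def count_mispredictions(sort_fn, data):
    """Sort a copy of data; return (sorted copy, mispredictions, branches)."""
    bp = BimodalPredictor()
    copy = list(data)
    sort_fn(copy, bp)
    return copy, bp.mispredicted, bp.executed


if __name__ == "__main__":
    data = make_random_data(500, seed=1)
    for name, fn in (("insertion", insertion_sort), ("bubble", bubble_sort)):
        _, missed, total = count_mispredictions(fn, data)
        print(f"{name}: {missed} mispredictions in {total} comparison branches")
```

In this simplified model, insertion sort's inner-loop comparison is taken repeatedly and then fails once per outer iteration, so the counter mispredicts only a few times per element, whereas bubble sort's comparison outcome depends on the values of neighbouring elements in a partially sorted array and is much harder for a per-site counter to track.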


