ACM Home Page
Please provide us with feedback. Feedback
Fundamental parallel algorithms for private-cache chip multiprocessors
Full text PdfPdf (385 KB)
Source
ACM Symposium on Parallel Algorithms and Architectures archive
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures table of contents
Munich, Germany
SESSION: Special track -- multicore algorithms table of contents
Pages 197-206  
Year of Publication: 2008
ISBN:978-1-59593-973-9
Authors
Lars Arge  University of Aarhus, Aarhus, Denmark
Michael T. Goodrich  University of California, Irvine, Irvine, CA, USA
Michael Nelson  University of California, Irvine, Irvine, CA, USA
Nodari Sitchinava  University of California, Irvine, Irvine, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 18,   Downloads (12 Months): 180,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1378533.1378573
What is a DOI?

ABSTRACT

In this paper, we study parallel algorithms for private-cache chip multiprocessors (CMPs), focusing on methods for foundational problems that are scalable with the number of cores. By focusing on private-cache CMPs, we show that we can design efficient algorithms that need no additional assumptions about the way cores are interconnected, for we assume that all inter-processor communication occurs through the memory hierarchy. We study several fundamental problems, including prefix sums, selection, and sorting, which often form the building blocks of other parallel algorithms. Indeed, we present two sorting algorithms, a distribution sort and a mergesort. Our algorithms are asymptotically optimal in terms of parallel cache accesses and space complexity under reasonable assumptions about the relationships between the number of processors, the size of memory, and the size of cache blocks. In addition, we study sorting lower bounds in a computational model, which we call the parallel external-memory (PEM) model, that formalizes the essential properties of our algorithms for private-cache CMPs.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
 
3
4
 
5
 
6
 
7
8
9
 
10
 
11
F. Dehne, W. Dittrich, D. Hutchinson, and A. Maheshwari. Bulk synchronous parallel algorithms for the external memory model. Theory of Computing Systems, 35(6):567--598, 2002.
 
12
13
 
14
15
 
16
 
17
 
18
19
 
20
G. Lowney. Why Intel is designing multi-core processors. https://conferences.umiacs.umd.edu/paa/lowney.pdf.
21
22
 
23
 
24
25
 
26
 
27
 
28
29
 
30
J. S. Vitter and E. A. M. Shriver. Algorithms for parallel memory I: Two-level memories. Algorithmica, 12(2--3):110--147, 1994.


Collaborative Colleagues:
Lars Arge: colleagues
Michael T. Goodrich: colleagues
Michael Nelson: colleagues
Nodari Sitchinava: colleagues