How GPUs can outperform ASICs for fast LDPC decoding
Source
International Conference on Supercomputing
Proceedings of the 23rd International Conference on Supercomputing
Yorktown Heights, NY, USA
SESSION: Accelerating applications with GPUs II
Pages 390-399  
Year of Publication: 2009
ISBN: 978-1-60558-498-0
Authors
Gabriel Falcão  University of Coimbra, Coimbra, Portugal
Vitor Silva  University of Coimbra, Coimbra, Portugal
Leonel Sousa  Technical University of Lisbon, Lisboa, Portugal
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1542275.1542330

ABSTRACT

Because of their huge computational requirements, powerful Low-Density Parity-Check (LDPC) error-correcting codes, discovered in the early 1960s, have only recently been adopted by emerging communication standards. LDPC decoders are supported by VLSI technology, which delivers good parallel computational power and excellent throughput, but at significant cost.

In this work, we propose an alternative, flexible LDPC decoder that exploits data parallelism for simultaneous multicodeword decoding, supported by multithreading on CUDA-based graphics processing units (GPUs). The efficient min-sum LDPC decoding algorithm we use has a low ratio of arithmetic operations per memory access, which creates a bottleneck due to memory latency and data collisions. We propose runtime data realignment so that distinct threads inside the same warp can perform coalesced parallel memory accesses. Because the memory access patterns of LDPC codes are random, coalescing cannot be used simultaneously for both the read and the write operations of the decoding process. To overcome this problem, we have developed a data mapping transformation that allows new addresses to be accessed contiguously for one of these two memory access types. Our implementation achieves throughputs above 100 Mbps and BER curves that compare well with ASIC solutions.
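
To illustrate the min-sum decoding step named in the abstract, here is a minimal Python sketch of the standard min-sum check-node update for a single check node. It is not the authors' CUDA implementation: the function name `check_node_update` and the plain list-of-floats message format are assumptions made for illustration only. Each outgoing message takes the product of the signs and the minimum of the magnitudes of all the other incoming messages.

```python
def check_node_update(llrs):
    """Min-sum update for one check node (illustrative sketch).

    llrs: incoming variable-to-check messages (log-likelihood ratios),
          one per edge of the check node.
    Returns the outgoing check-to-variable messages, one per edge.
    """
    if len(llrs) < 2:
        raise ValueError("a check node needs at least two edges")

    # Track the two smallest magnitudes and the parity of negative signs.
    min1 = min2 = float("inf")
    min1_idx = -1
    negative_parity = False
    for i, v in enumerate(llrs):
        mag = abs(v)
        if mag < min1:
            min1, min2, min1_idx = mag, min1, i
        elif mag < min2:
            min2 = mag
        if v < 0:
            negative_parity = not negative_parity

    out = []
    for i, v in enumerate(llrs):
        # Exclude this edge's own magnitude: use the second minimum
        # when this edge holds the overall minimum.
        mag = min2 if i == min1_idx else min1
        # Exclude this edge's own sign from the overall sign product.
        negative = negative_parity != (v < 0)
        out.append(-mag if negative else mag)
    return out


print(check_node_update([2.0, -0.5, 1.5, -3.0]))
# → [0.5, -1.5, 0.5, -0.5]
```

The update needs only comparisons and sign flips per message, each paired with a memory read and a write, which is consistent with the low ratio of arithmetic operations per memory access that the abstract identifies as the bottleneck.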


