Please use this identifier to cite or link to this item: http://localhost/handle/Hannan/601994
Title: Implementation of a Fully-Parallel Turbo Decoder on a General-Purpose Graphics Processing Unit
Authors: An Li;Robert G. Maunder;Bashir M. Al-Hashimi;Lajos Hanzo
subject: GPGPU computing|cloud radio access network|parallel processing|Fully-parallel turbo decoder|software defined radio
Year: 2016
Publisher: IEEE
Abstract: Turbo codes comprising a parallel concatenation of upper and lower convolutional codes are widely employed in the state-of-the-art wireless communication standards, since they facilitate transmission throughputs that closely approach the channel capacity. However, this necessitates high processing throughputs in order for the turbo code to support real-time communications. In the state-of-the-art turbo code implementations, the processing throughput is typically limited by the data dependences that occur within the forward and backward recursions of the Log-BCJR algorithm, which is employed during turbo decoding. In contrast to the highly serial Log-BCJR turbo decoder, we have recently proposed a novel fully parallel turbo decoder (FPTD) algorithm, which can eliminate the data dependences and perform fully parallel processing. In this paper, we propose an optimized FPTD algorithm, which reformulates the operation of the FPTD algorithm so that the upper and lower decoders have identical operation, in order to support single instruction multiple data operation. This allows us to develop a novel general purpose graphics processing unit (GPGPU) implementation of the FPTD, which has application in software-defined radios and virtualized cloud-radio access networks. As a benefit of its higher degree of parallelism, we show that our FPTD improves the processing throughput of the Log-BCJR turbo decoder by between 2.3 and 9.2 times, when employing a high-specification GPGPU. However, this is achieved at the cost of a moderate increase of the overall complexity by between 1.7 and 3.3 times.
URI: http://localhost/handle/Hannan/138542
http://localhost/handle/Hannan/601994
ISSN: 2169-3536
volume: 4
Appears in Collections: 2016

Files in This Item:
File: 7501831.pdf
Size: 7.84 MB
Format: Adobe PDF