Please use this identifier to cite or link to this item: http://localhost/handle/Hannan/625174
Title: Implementation of a Fully-Parallel Turbo Decoder Designed for Mission-Critical Machine-Type Communication Applications
Authors: An Li; Peter Hailes; Robert G. Maunder; Bashir M. Al-Hashimi; Lajos Hanzo
Subject: Turbo decoding; Fully-parallel turbo decoder; FPGA; LTE
Year: 2016
Publisher: IEEE
Abstract: In wireless communication schemes, turbo codes facilitate near-capacity transmission throughputs by achieving reliable forward error correction. However, owing to the serial data dependencies imposed by the underlying logarithmic Bahl-Cocke-Jelinek-Raviv (Log-BCJR) algorithm, the limited processing throughputs of conventional turbo decoder implementations impose a severe bottleneck upon the overall throughputs of real-time wireless communication schemes. Motivated by this, we recently proposed a fully parallel turbo decoder (FPTD) algorithm, which eliminates these serial data dependencies, allowing parallel processing and hence offering a significantly higher processing throughput. In this paper, we propose a novel resource-efficient version of the FPTD algorithm, which reduces its computational resource requirement by 50%, hence enhancing its suitability for field-programmable gate array (FPGA) implementations. We propose a model FPGA implementation. When using a Stratix IV FPGA, the proposed FPTD FPGA implementation achieves an average throughput of 1.53 Gb/s and an average latency of 0.56 μs when decoding frames comprising N = 720 b. These are, respectively, 13.2 times and 11.1 times superior to those of the state-of-the-art FPGA implementation of the Log-BCJR long-term evolution (LTE) turbo decoder, when decoding frames of the same length at the same error correction capability. Furthermore, our proposed FPTD FPGA implementation achieves a normalized resource usage of 0.42 kALUTs/(Mb/s), which is 5.2 times superior to that of the benchmarker decoder. Finally, when decoding the shortest N = 40-b LTE frames, the proposed FPTD FPGA implementation achieves an average throughput of 442 Mb/s and an average latency of 0.18 μs, which are, respectively, 21.1 times and 10.6 times superior to those of the benchmarker decoder. In this case, the normalized resource usage of 0.08 kALUTs/(Mb/s) is 146.4 times superior to that of the benchmarker decoder.
Description: 
URI: http://localhost/handle/Hannan/161271
http://localhost/handle/Hannan/625174
ISSN: 2169-3536
Volume: 4
Appears in Collections: 2016

Files in This Item:
File: 7539551.pdf
Size: 13.66 MB
Format: Adobe PDF