
LOW POWER AND LOW COMPLEX IMPLEMENTATION OF TURBO CODEC FOR SATELLITE AND WIRELESS MOBILE COMMUNICATIONS

Mr. M. Kannnan, Senior Lecturer, MIT, Anna University; Sathish Santhanam, Venkatesan Sankaran, Shankar Myilswamy. Department of Electronics and Communication Engineering, MIT, Anna University, Chennai, Tamil Nadu.

Abstract: Turbo codes were first proposed by Berrou and Glavieux in 1993 [1] and have attracted many researchers since then, as they are one of the two code families capable of approaching the Shannon limit of error correction capability. Satellite and mobile communications need a high bit rate and a low BER for good performance, so modern satellite and wireless applications, such as those based on OFDM, employ turbo codecs. Power plays a vital role in modern communications. Many algorithms have been proposed to reduce memory access and improve the BER, and reducing memory access saves a large amount of power. In this paper we analyze the different algorithms that have been proposed and develop a novel method that reduces memory access by reducing the number of iterations. This makes it possible to produce a low-cost chip that can be used even by rural people.

Introduction: In many communication applications, power and bandwidth are in high demand because real-world resources are limited. Turbo codes are decoded iteratively. The turbo decoder is the most complex block of the codec and is hard to implement in real-time applications. Because of this complexity it consumes a large amount of power. Here we combine existing methods with our own idea to reduce that power considerably.

Turbo Encoder:

The turbo encoder consists of two recursive systematic encoders operating in parallel. Recursive systematic encoders are used because they help convert low-weight code words into high-weight code words.

The input bit frame is passed to the two encoders. The first encoder receives the input bit stream unaltered, and the second encoder receives the interleaved version of the input bit stream. The input bits and the parity bits are multiplexed and sent through the channel. For a single input bit m1, the output consists of three bits (m1, p1, p2), where p1 and p2 are the parity bits from the two encoders. The channel used is the additive white Gaussian noise (AWGN) channel.

Interleaver and Deinterleaver:

The interleaver plays a vital role in the turbo codec because it protects against burst errors. The interleaver used here is a random interleaver, in which the inputs are scrambled according to a predefined random index, and the stream can be deinterleaved using the same index at the decoder side. Example: m1, m2, m3, ..., mn are the message bits, and the index is chosen so that there is a minimum spacing of 9 index positions between the original and the interleaved version. The interleaved version may then look like m11, m56, m89, ..., m7.
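The encoder and interleaver described above can be sketched as follows. This is only an illustration under stated assumptions: the paper does not give the generator polynomials, so a 4-state recursive systematic encoder with feedback 1 + D + D^2 and feedforward 1 + D^2 is assumed, and the spacing rule is read as "every bit moves at least 9 positions". The names `make_interleaver`, `rsc_parity` and `turbo_encode` are hypothetical.

```python
import random

def make_interleaver(n, min_shift=9, seed=0):
    """Random permutation perm (output i takes input perm[i]) in which
    every bit moves at least min_shift positions (assumed spacing rule)."""
    assert n >= 4 * min_shift, "frame too short for the repair step below"
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)

    def ok(pos, src):
        return abs(src - pos) >= min_shift

    bad = [i for i in range(n) if not ok(i, perm[i])]
    while bad:
        i = bad.pop()
        if ok(i, perm[i]):
            continue  # already repaired by an earlier swap
        while True:
            j = rng.randrange(n)
            # swap only if both positions satisfy the rule afterwards
            if ok(i, perm[j]) and ok(j, perm[i]):
                perm[i], perm[j] = perm[j], perm[i]
                break
    return perm

def interleave(bits, perm):
    return [bits[p] for p in perm]

def deinterleave(bits, perm):
    out = [0] * len(bits)
    for p, b in zip(perm, bits):
        out[p] = b
    return out

def rsc_parity(bits):
    """Parity stream of the assumed 4-state recursive systematic encoder."""
    s1 = s2 = 0
    parity = []
    for d in bits:
        fb = d ^ s1 ^ s2          # feedback polynomial 1 + D + D^2
        parity.append(fb ^ s2)    # feedforward polynomial 1 + D^2
        s1, s2 = fb, s1
    return parity

def turbo_encode(bits, perm):
    """Multiplex (m, p1, p2) for every message bit m."""
    p1 = rsc_parity(bits)
    p2 = rsc_parity(interleave(bits, perm))
    coded = []
    for m, a, b in zip(bits, p1, p2):
        coded.extend((m, a, b))
    return coded
```

The same index list `perm` serves both directions, which mirrors the statement that the decoder deinterleaves with the index used at the encoder.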

Low Complex Architecture:

Turbo Decoder:

The decoder consists of two soft-input soft-output (SISO) decoding modules, separated by a pseudo-random interleaver/deinterleaver, that operate iteratively on each other's output. A typical turbo decoder is shown in Figure 2. The multiplexed output from the encoder is demultiplexed at the decoder. The input to the first decoder consists of the message bit, the parity bit from the first encoder and the extrinsic information from the second decoder block. The extrinsic information from the second block strengthens the detection of each bit so that the correct decision can be made.

In the LLR expression of the MAP algorithm, s_k denotes the present state and s_k+1 denotes the next state. The γ value denotes the branch metric, and the α and β values denote the forward and backward path metrics. s0 and s1 denote all possible state transitions for the message bits 0 and 1, and s_k → s_k+1 denotes the transition from the present state to the next state. Multiplying the branch and path metrics is complex, so the computation is moved to the logarithmic domain, where the multiplications reduce to additions. Even so, the cost of adding the terms is considerable. This increases the latency of the circuit, which is highly undesirable in high data rate coding schemes such as turbo codes.
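A short sketch of why the logarithmic domain helps: a product of metrics becomes a sum of their logarithms, and a sum of metrics becomes the Jacobian logarithm (often written max*). The function names are illustrative and are not taken from the paper. Dropping the correction term gives the max-log-MAP approximation mentioned in [10].

```python
import math

def max_star(a, b):
    """Jacobian logarithm: ln(e^a + e^b) as a max plus a correction term."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_log(a, b):
    """Max-log-MAP approximation: the correction term is dropped."""
    return max(a, b)

# A product in the probability domain is a sum in the log domain.
alpha, gamma = 0.25, 0.5
log_product = math.log(alpha) + math.log(gamma)  # equals ln(alpha * gamma)
```

The correction term is bounded by ln 2, so hardware implementations commonly approximate it with a small lookup table; whether the authors did so is not stated in the text.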

Sliding Window: The sliding window approach is used because the number of values stored in the turbo decoder is large, and so is the time taken to retrieve those values for the LLR calculation. An input frame of 400 bits is used for verification. The sliding window size is 32 bits, because after 32 bits the calculated alpha values can be treated as true values [13]. The sliding window concept we used is shown in the figure below. The 400 bits are split into 12 windows of 32 bits each and 1 window of 16 bits.
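The window split quoted above (12 windows of 32 bits plus one of 16 bits) follows directly from the frame and window sizes. A minimal sketch, with `split_windows` as a hypothetical helper name:

```python
def split_windows(frame_len=400, window=32):
    """Return (start, end) index pairs, end exclusive, covering the frame."""
    return [(start, min(start + window, frame_len))
            for start in range(0, frame_len, window)]

windows = split_windows()
sizes = [end - start for start, end in windows]
```

Here `sizes` holds twelve entries of 32 followed by a final entry of 16, since 12 * 32 + 16 = 400.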

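Calculation of the State metrics and the branch metrics: The alpha, beta and gamma values are calculated using the following formulae, respectively:

$\alpha_k^{m} = \sum_j \alpha_{k-1}^{b(j,m)} \, \gamma_{k-1}^{j,\,b(j,m)}$

$\beta_k^{m} = \sum_j \beta_{k+1}^{f(j,m)} \, \gamma_k^{j,\,m}$

$\gamma = 0.5 \, \bigl( d \, (y_s + L_a) + y_p \, C \bigr)$

where

$\alpha_k^{m}$ - forward state metric at time k and state m
$\alpha_{k-1}^{b(j,m)}$ - forward state metric at time k-1 and state b(j,m), where b(j,m) is the state from which input bit j leads to state m
$\gamma_{k-1}^{j,\,b(j,m)}$ - branch metric at time k-1 for input bit j leaving state b(j,m)
$\beta_k^{m}$ - backward state metric at time k and state m
$\beta_{k+1}^{f(j,m)}$ - backward state metric at time k+1 and state f(j,m), where f(j,m) is the state reached from state m with input bit j
$d$ - message bit
$y_p$, $y_s$ - received parity and systematic bits
$L_a$ - a priori information

Turbo Decoder Log-MAP decoding algorithm: The decoder uses the MAP decoding algorithm for the decoding process. The output is the log likelihood ratio (LLR), calculated using the formula

$L(d_k) = \ln \dfrac{\sum_{s1} \alpha_k^{s_k} \, \gamma_k^{s_k \to s_{k+1}} \, \beta_{k+1}^{s_{k+1}}}{\sum_{s0} \alpha_k^{s_k} \, \gamma_k^{s_k \to s_{k+1}} \, \beta_{k+1}^{s_{k+1}}}$

where the numerator sums over the transitions caused by message bit 1 and the denominator over those caused by message bit 0. In terms of the recursions above, $s_k$ corresponds to state m and $s_{k+1}$ to state f(j,m).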
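The recursions and the LLR can be sketched in the log domain as below. This is a full-frame illustration without the sliding window, not the authors' architecture. Several things are assumptions: the 4-state trellis in `step` (feedback 1 + D + D^2, feedforward 1 + D^2), the mapping of bits 0 and 1 to -1 and +1 in the gamma formula, received values that are already scaled by the channel reliability, and an unterminated trellis. All function names are hypothetical.

```python
import math

NEG = -1e9  # stands in for ln(0)

def max_star(a, b):
    """ln(e^a + e^b), the log-domain replacement for addition."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def step(state, d):
    """Assumed trellis: returns (next state f(d, state), parity bit)."""
    s1, s2 = state >> 1, state & 1
    fb = d ^ s1 ^ s2
    return (fb << 1) | s1, fb ^ s2

def gamma(d, c, ys, yp, la):
    """Branch metric 0.5 * (d (ys + La) + yp C) with bits mapped to -1/+1."""
    return 0.5 * ((2 * d - 1) * (ys + la) + yp * (2 * c - 1))

def log_map(ys, yp, la):
    n, num_states = len(ys), 4
    # g[k][m][j]: branch metric at time k, state m, input bit j
    g = [[[gamma(j, step(m, j)[1], ys[k], yp[k], la[k]) for j in (0, 1)]
          for m in range(num_states)] for k in range(n)]

    # Forward recursion (alpha); the encoder starts in state 0.
    alpha = [[NEG] * num_states for _ in range(n + 1)]
    alpha[0][0] = 0.0
    for k in range(n):
        for m in range(num_states):
            for j in (0, 1):
                nxt = step(m, j)[0]
                alpha[k + 1][nxt] = max_star(alpha[k + 1][nxt],
                                             alpha[k][m] + g[k][m][j])

    # Backward recursion (beta); all end states are taken as equally likely.
    beta = [[NEG] * num_states for _ in range(n + 1)]
    beta[n] = [0.0] * num_states
    for k in range(n - 1, -1, -1):
        for m in range(num_states):
            for j in (0, 1):
                nxt = step(m, j)[0]
                beta[k][m] = max_star(beta[k][m],
                                      beta[k + 1][nxt] + g[k][m][j])

    # LLR: transitions caused by bit 1 against those caused by bit 0.
    llr = []
    for k in range(n):
        acc = [NEG, NEG]
        for m in range(num_states):
            for j in (0, 1):
                nxt = step(m, j)[0]
                acc[j] = max_star(acc[j],
                                  alpha[k][m] + g[k][m][j] + beta[k + 1][nxt])
        llr.append(acc[1] - acc[0])
    return llr
```

A positive LLR is read as message bit 1 and a negative LLR as message bit 0. Storing the whole `alpha` and `beta` tables, as done here for clarity, is exactly the memory cost that the sliding window and reverse calculation aim to reduce.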

In our calculation, the gamma values are first calculated for the bit positions from 400 to 336. At time t = 0, the following operations are performed:
1. The original beta values are calculated. The beta values obtained here are true values because the calculation starts from the end of the block.
2. The dummy alpha values are calculated, because the true values of alpha can only be obtained when the calculation starts from the start of the block.
3. The gamma values are calculated for the next bits, ranging from 336 to 304.
Because these operations are carried out in parallel, a high degree of parallelism is obtained.

The calculation of these terms requires 3 gamma memories, 2 beta memories, one LLR memory and 2 alpha calculation blocks. In the subsequent time intervals the calculation proceeds as follows:
1. The gamma values are calculated in cyclic order.
2. The beta values are calculated and stored in memory only if the reverse calculation algorithm fails. This saves a large amount of memory access.
3. The alpha values calculated after the 32 dummy values become original values, and these are used for the calculation of the LLR values. The alpha values are always calculated and never stored in memory. This saves a large amount of memory access and hence reduces power consumption.
4. The LLR values are calculated simultaneously and stored in memory.

Block Diagram:

The block diagram consists of three gamma storage units, a beta storage unit, a beta calculation unit, an alpha calculation unit, an approximation checking unit and an LLR calculation unit. The gamma units are accessed in a cyclic manner under a control unit, which selects the appropriate gamma values for the alpha, beta and LLR units. The same gamma values are used for both the alpha calculation and the LLR calculation. If the reverse approximation check succeeds, the reverse calculation flag is set and the beta value is not stored. During LLR calculation, this beta value is calculated again and used. The above procedure is iterated until the desired BER is achieved. A simple stopping criterion is followed, as specified in [7].

BER Plot:

An SNR of 1.6 dB is used for the calculation.

The plot shows the BER against the number of iterations. A BER of 0 can be achieved within a maximum of 5 iterations. Reducing the number of iterations therefore reduces power consumption, which is desirable for modern communication systems.
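How an early stop limits the number of iterations can be sketched as follows. The exact rule of [7] is not reproduced in the text, so this example substitutes a generic hard-decision check as an assumption: decoding stops once two successive iterations give the same bit decisions. The names and the LLR values are made up for illustration only.

```python
def hard_decisions(llrs):
    """Map each LLR to a bit: positive means 1, otherwise 0."""
    return [1 if value > 0 else 0 for value in llrs]

def should_stop(prev_llrs, curr_llrs):
    """Stop when no bit decision changed between two successive iterations."""
    return hard_decisions(prev_llrs) == hard_decisions(curr_llrs)

def iterations_used(llr_history, max_iters=5):
    """Number of iterations run before the check above ends decoding.
    llr_history[i] holds the LLRs produced by iteration i + 1."""
    limit = min(len(llr_history), max_iters)
    for it in range(1, limit):
        if should_stop(llr_history[it - 1], llr_history[it]):
            return it + 1
    return limit

# Illustrative LLRs for a 3-bit frame over four iterations.
history = [
    [-0.2, 0.5, 0.1],
    [-1.0, 1.2, -0.3],
    [-2.0, 2.5, -1.1],
    [-4.0, 5.0, -3.0],
]
used = iterations_used(history)
```

In this made-up run the decisions stabilise at the third iteration, so the fourth one, and the memory accesses it would cost, are skipped.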

CONCLUSION: The results of this work show that the reverse calculation techniques used in this paper greatly reduce power consumption. The reverse calculation of the beta values and the non-storage of the alpha values reduce memory access by nearly 50 percent and thus contribute to power saving. The reverse calculation requires only a small addition to the circuit implementation but provides considerable power saving.

REFERENCES:

[1] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit error-correcting coding and decoding: Turbo codes," in Proc. ICC, pp. 1064-1070, May 1993.
[2] Y. Wu, W. J. Ebel, and B. D. Woerner, "Forward computation of backward path metrics for MAP decoders," in Proc. of VTC, pp. 2257-2261, 2000.
[3] Dong-Soo Lee and In-Cheol Park, "Low-power Log-MAP decoding based on reduced memory access," IEEE Transactions on Circuits and Systems, Vol. 53, No. 6, pp. 1244-1252, June 2006.
[4] Z. Wang, "High performance, low complexity VLSI design of turbo decoders," Ph.D. dissertation, ECE Dept., Univ. of Minnesota, Twin Cities, 2000.
[5] Patrick Robertson, Emmanuelle Villebrun, and Peter Hoeher, "A comparison of optimal and sub-optimal MAP decoding algorithms operating in the log domain."
[6] C. Schurgers, F. Shanbhan, and A. C. Singer, "Memory optimization of MAP turbo decoder algorithms," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., Vol. 9, No. 2, pp. 305-312, Apr. 2001.
[7] Yufei Wu and Brian D. Woerner, "A Simple Stopping Criterion for Turbo Decoding," IEEE Communications Letters, Vol. 4, No. 8, August 2000.
[8] Curt Schurgers, Francky Catthoor, and Marc Engels, "Memory Optimization of MAP Turbo Decoder Algorithms," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 9, No. 2, April 2001.
[9] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. New York: Wiley, 1999.
[10] J. Kwak, S. M. Park, and K. Lee, "Reverse tracing of forward state metric in log-MAP and MAX-log-MAP decoders," in Proc. IEEE Int. Symp. Circuits Syst., 2003, Vol. 2, pp. 25-28.
[11] Technical Specification Group Radio Access Network, The 3rd Generation Partnership Project (3GPP), 2005 [online]. Available: www.3gpp.org
[12] Zhiyong He, Paul Fortier, and Sébastien Roy, "Highly-Parallel Decoding Architectures for Convolutional Turbo Codes," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 14, No. 10, October 2006.
[13] Seok-Jun Lee, Naresh R. Shanbhag, and Andrew C. Singer, "A low-power VLSI Architecture for turbo decoding," ISLPED'03, August 25-27, 2003, Seoul, Korea.