VHDL Design and FPGA Implementation of a Fully Parallel Architecture for Iterative Decoder of Majority Logic Codes for High Data Rate Applications

In this work, we propose the design and FPGA (Field Programmable Gate Array) implementation of three low-complexity parallel architectures for majority logic decoding in high data rate applications. These architectures are the hard-decision decoder (Hard In - Hard Out, HIHO), the SIHO threshold decoder (Soft In - Hard Out) and the SISO threshold decoder (Soft In - Soft Out). The chosen code is the Difference Set Cyclic (DSC) code. The VHDL (Very high speed integrated circuit Hardware Description Language) design and the synthesis of these architectures show that such decoders can achieve a high data rate with low complexity. In our case, the iterative decoder built from fully parallel SISO threshold decoders requires 2 clock cycles for two iterations, with a complexity of 7350 LEs.


Introduction
Complexity and latency, both of which tend to be high in the decoding process, are the major drawbacks of decoding in current digital systems. The decoding data rate depends on the parallelism of the chosen architecture, i.e., its ability to process multiple data simultaneously: the higher the level of parallelism, the higher the achievable data rate. In electronic design, the complexity of the circuit (i.e., its area) is often considered a critical parameter. In this context of very high data rates, the operating speed must be maximized while limiting the complexity induced by the parallelism of the computations. We study in particular the decoding of codes belonging to the OSMLD (One-Step Majority Logic Decodable) family [3][4][5], used in an iterative decoding process. Majority logic decoding uses a linear combination of a reduced set of syndromes represented by orthogonal equations. Hence, with a pipelined and/or parallelized architecture, it allows a less complex decoding circuit with a high data rate.
The principle of majority logic decoding and the proposed VHDL design and FPGA implementation of the hard-decision majority decoder are described in Section 1. Section 2 presents the VHDL design and FPGA implementation of an architecture for Massey's APP (A Posteriori Probability) algorithm [3]. Section 3 presents the VHDL design and FPGA implementation of a fully parallel architecture for SISO threshold decoding, which jointly exploits symbol-level and sub-block-level parallelism to achieve a high data rate. In Section 4, we describe the iterative algorithm used to decode OSMLD codes. Before concluding, we present the results of our simulations.

A. Simple majority logic code (MLC)
Consider a linear cyclic code C(n, k) with parity-check matrix H. The space generated by H is the cyclic code (n, n-k), denoted $C^{\perp}$, which is the dual code of C, or the null space of C. For any vector v in C and any vector w in $C^{\perp}$, the inner product of v and w is zero. Now suppose that a code vector v in C is transmitted over a binary symmetric channel. Let $e = (e_1, e_2, \dots, e_n)$ and $R = (r_1, r_2, \dots, r_n)$ be the error vector and the received binary vector, respectively. Then R = v + e. For any vector w in the dual code $C^{\perp}$, we can form the following linear sum of the received digits:
$$A = \langle w, R \rangle = w_1 r_1 + w_2 r_2 + \dots + w_n r_n \qquad (1)$$
This is called a parity-check sum. Using the fact that $\langle w, v \rangle = 0$, we obtain the following relationship between the check sum A and the error digits in e:
$$A = \langle w, e \rangle = w_1 e_1 + w_2 e_2 + \dots + w_n e_n$$
Suppose that there exist J vectors $w_1, \dots, w_J$ in the dual code $C^{\perp}$ with the following properties:
1. The n-th component of each vector $w_j$ is 1.
2. For i ≠ n, there is at most one vector whose i-th component is 1.
Applying the parity-check equation (1) to these J vectors gives J check sums $A_1, \dots, A_J$. These parity-check equations, orthogonal on $e_n$, can be used to estimate $e_n$ and hence to decode the received bit $r_n$. The value of $e_n$ is determined as follows:
• If more than half of the check sums $A_j$ are equal to 1 (i.e., at least $\lfloor J/2 \rfloor + 1$ of them), the decision is $e_n = 1$.
• Otherwise, the decision is $e_n = 0$.
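The decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the perfect difference set `P = (0, 2, 7, 8, 11)` modulo 21 is a textbook choice for a length-21 DSC code and is not given in this paper, and the names `check_sets` and `majority_decode` are hypothetical. The sketch works on the all-zero word (a codeword of any linear code), so it illustrates the orthogonality property and the majority vote rather than the authors' exact parity-check vectors.

```python
N = 21                   # code length of DSC(21, 11)
P = (0, 2, 7, 8, 11)     # ASSUMPTION: a perfect difference set modulo 21
J = len(P)               # number of orthogonal check sums per position


def check_sets(k):
    """Return J index sets whose check sums are orthogonal on position k."""
    return [[(k - l + p) % N for p in P] for l in P]


def majority_decode(r):
    """One-step majority-logic decoding of all N positions of r (list of 0/1)."""
    decoded = []
    for k in range(N):
        # A_j is the modulo-2 sum of the received digits in the j-th set.
        sums = [sum(r[i] for i in s) % 2 for s in check_sets(k)]
        e_k = 1 if sum(sums) > J // 2 else 0   # majority vote on e_k
        decoded.append(r[k] ^ e_k)             # correct the received bit
    return decoded


# Two errors on the all-zero word.
received = [0] * N
received[3] = received[17] = 1
corrected = majority_decode(received)
```

Every position is decoded from the received word alone, without feedback, which is what allows all n decisions to be computed in parallel.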

C. Implementation of the parallel Majority logic decoder
To implement this decoder, we transcribed the decoding steps into simple blocks described in VHDL and assembled them. The decoding results are obtained on the first rising edge of the clock (see Figure 2). The proposed architecture is described in VHDL and implemented on an FPGA using Altera's Quartus II tool. The target FPGA is the Altera EP1S10F484C5, which provides 10570 LEs and 336 input/output pins.
The complexity of our decoder is 174 LEs with 43 inputs/outputs. Figure 3 shows the inputs/outputs of the VHDL circuit of our decoder. Figure 4 shows the simulation in the C language and the hardware simulation.

A. Threshold decoding
In this section, we present the threshold decoding algorithm [3]. We keep the notation of the previous section and assume the existence of J equations orthogonal on each position.
We consider the transmission of a codeword $c = (c_1, c_2, \dots, c_n)$ using BPSK modulation over a channel with additive white Gaussian noise (AWGN). Let $y = (y_1, y_2, \dots, y_n)$ be the received word. For each received symbol $y_k$, the decoder computes the log-likelihood ratio (LLR) of equation (2), where $c_k$ is the k-th bit of the transmitted codeword. For a One-Step Majority Logic Decodable (OSMLD) code C, there are J orthogonal equations $A_j$, j in {1, ..., J}, for each position k, and equation (2) becomes equation (3), written in terms of $B_0, \dots, B_J$. The terms $B_i$ are obtained from the equations orthogonal on bit $c_k$ as follows: $B_0 = y_k$, and each $B_i$, i in {1, ..., J}, is calculated by eliminating the term $y_k$ from the i-th orthogonal equation.
By applying Bayes' rule, (3) becomes (4). Since the parity-check equations are orthogonal on position k, the probabilities $P(B_i = 1 \text{ or } 0)$ are independent, and (4) can be written as (5). Since the symbols 1 and 0 are transmitted with equal probability, (5) becomes (6). This likelihood ratio can in turn be written as (7), in which the factor $(1 - 2B_i)$ equals +1 or -1 and $w_i$ is a quantity proportional to the reliability of the i-th parity-check equation. It can then be shown that relation (8) holds, where $n_k$ is the total number of terms in the k-th orthogonal parity equation.
The term obtained from the orthogonal equations represents their estimate of the symbol $y_k$; with it, (8) becomes (9). Threshold decoding then proceeds position by position: for each k, the decoder evaluates the J orthogonal equations with $y_k$ removed, combines them with $y_k$ itself, and takes a decision.
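As a sketch of the first steps of this derivation, the chain from the LLR definition to the form obtained for equiprobable symbols can be written as below. The orientation of the ratio (bit 1 in the numerator) and the attachment of the tags (2) to (6) are assumptions. Only the steps named in the text are shown: conditioning on the $B_i$, Bayes' rule, independence, and equal priors.

```latex
\begin{align}
\mathrm{LLR}_k &= \ln\frac{P(c_k = 1 \mid y)}{P(c_k = 0 \mid y)} \tag{2}\\
\mathrm{LLR}_k &= \ln\frac{P(c_k = 1 \mid B_0, \dots, B_J)}{P(c_k = 0 \mid B_0, \dots, B_J)} \tag{3}\\
\mathrm{LLR}_k &= \ln\frac{P(B_0, \dots, B_J \mid c_k = 1)\, P(c_k = 1)}{P(B_0, \dots, B_J \mid c_k = 0)\, P(c_k = 0)} \tag{4}\\
\mathrm{LLR}_k &= \ln\frac{P(c_k = 1)\, \prod_{i=0}^{J} P(B_i \mid c_k = 1)}{P(c_k = 0)\, \prod_{i=0}^{J} P(B_i \mid c_k = 0)} \tag{5}\\
\mathrm{LLR}_k &= \sum_{i=0}^{J} \ln\frac{P(B_i \mid c_k = 1)}{P(B_i \mid c_k = 0)} \tag{6}
\end{align}
```

The step from (5) to (6) uses $P(c_k = 1) = P(c_k = 0)$, so the prior ratio cancels and the logarithm of the product becomes a sum of per-equation terms.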

B. Architecture of the threshold decoder
The input consists of analog samples (called symbols) quantized on q = 4 or 5 bits: 3 or 4 bits for the reliability and 1 bit for the sign. The sign bit represents the binary value of the symbol ('0' or '1'), and the absolute value of a received symbol represents its reliability (a high reliability means that the bit is more likely to be correct).
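The sign-plus-reliability format described above can be illustrated with a small Python sketch. The uniform step size, the saturation level `y_max`, and the convention that a sign bit of 0 denotes a non-negative sample are assumptions made for illustration; the function name `quantize` is hypothetical.

```python
def quantize(y, q=4, y_max=1.0):
    """Map a real sample y to (sign_bit, magnitude) on q bits.

    ASSUMPTIONS: uniform steps, saturation at y_max, and sign bit 0 for
    non-negative samples; none of these is specified in the text.
    """
    levels = (1 << (q - 1)) - 1                  # largest magnitude: 7 or 15
    magnitude = min(levels, int(abs(y) / y_max * levels + 0.5))
    sign_bit = 1 if y < 0 else 0
    return sign_bit, magnitude


strong = quantize(1.0, q=4)      # reliable sample, full-scale magnitude
saturated = quantize(-2.0, q=4)  # out-of-range sample clips to the maximum
weak = quantize(-0.4, q=5)       # less reliable sample on 5 bits
```

With q = 4 the magnitude uses 3 bits (values 0 to 7), and with q = 5 it uses 4 bits (values 0 to 15), matching the split between reliability bits and sign bit given in the text.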
We can simplify the implementation of the log-likelihood ratio by using the ADD_MIN operator [6], denoted ◊. This operator finds the minimum absolute value among its n inputs and sets the sign of the result according to the product of the input signs. With this operator, the expression of the decoder output simplifies. Figure 5 presents the decoding steps, after simplification, as simple blocks described in VHDL and assembled. The architecture of the threshold decoder has three fundamental units:
• The shift register holds the quantized symbols received in parallel. Loading of the register is controlled by the Ld input on the first rising edge of the clock (see Figure 6), and on each subsequent rising edge the shifted symbols are available at the register output.
• In the implementation of the add-min operator and the adder, the number of input ports and the number of bits per port must be configurable, so that insufficient resolution of the operators within the architecture does not degrade the error-correction performance.
• The results of the add-min operation, the addition and the decision are available at the output on each rising edge of the clock.
FPGAs have seen great improvements in size and speed, and they provide a well-suited platform for implementing error-detecting and error-correcting codes. Several studies on threshold decoding have already been carried out [7][8]. The VHDL description of the threshold decoder is written so that each unit of the proposed architecture is described in an independent entity. The final architecture of the parallel threshold decoder is given in Figure 5. Figure 6 shows the inputs/outputs of the VHDL circuit of our decoder. The proposed architecture has been described in VHDL and implemented on an FPGA using Altera's Quartus II tool. The target FPGA is the Altera EP1S10F484C5, which provides 10570 LEs and 336 input/output pins.
For the proposed threshold decoder architecture for DSC (21, 11), the operations require a latency of 22 clock cycles. Table 1 compares the complexity (in LEs) of our architecture for quantization on q = 4 and q = 5 bits. Figure 7 shows an example of functional simulation in which the information presented at the input of the decoder circuit is decoded in 22 clock cycles. To validate our decoder circuit, we simulated and verified the decoding of 20000 words for each SNR (Signal to Noise Ratio). Figure 8 compares two decoders using the threshold decoding algorithm for DSC (21, 11). The red curve shows the performance of threshold decoding simulated in the C language (real-valued data). The black curves show the performance of the decoder in hardware-level functional simulation for quantization on 4 bits and 5 bits, respectively.
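The ADD_MIN operator used in this architecture can be sketched in Python as follows. The function `threshold_decision` is a hypothetical illustration that assumes the simplified decoder output is the received symbol plus the sum of the J add-min terms, and that a non-negative total maps to bit 0. The text states that the output expression is simplified with this operator but does not show it, so this form is an assumption.

```python
def add_min(values):
    """ADD_MIN: minimum absolute value, signed by the product of input signs."""
    negative = False
    for v in values:
        if v < 0:
            negative = not negative      # each negative input flips the sign
    magnitude = min(abs(v) for v in values)
    return -magnitude if negative else magnitude


def threshold_decision(y_k, checks):
    """Hard decision for one symbol.

    checks holds J lists, each with the other symbols of one orthogonal
    equation (y_k removed). ASSUMPTION: output = y_k + sum of add-min terms,
    and a non-negative output is decided as bit 0.
    """
    total = y_k + sum(add_min(c) for c in checks)
    return 0 if total >= 0 else 1


example = add_min([3, -2, 5])                        # one negative input
decision = threshold_decision(-1, [[3, 4], [2, 5], [-1, 6]])
```

Because the operator needs only comparisons and a sign parity, it maps to a small comparator tree in hardware, which is consistent with the low LE counts reported for the decoder.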

A. Design of SISO threshold decoding
The basic idea is to modify the conventional threshold algorithm by associating a reliability with each decision, to be exploited in iterative decoding [1][2][11][12][13].

B. Implementation of the parallel SISO threshold decoder of DSC code
Several studies on SISO decoders have already been carried out [1][2][4][5][13]. In this paper, the VHDL description of the SISO threshold decoder is written so that each unit of the proposed architecture is described in an independent entity. The final architecture of the parallel SISO threshold decoder is given in Figure 10. Figure 11 shows the VHDL circuit of our decoder and its inputs/outputs. The proposed architecture has been described in VHDL and implemented on an FPGA using Altera's Quartus II tool. The FPGA circuit used is similar to the one used to implement the threshold decoder. For the SISO threshold decoder, the operations are obtained on the first rising edge of the clock, and the occupied area is 3672 LEs (logic elements) with 190 inputs/outputs for the DSC (21, 11) code. Figure 12 gives an example of functional simulation showing that the information presented at the input of the decoder circuit is decoded on the first rising edge of the clock. To validate our decoder circuit, we simulated and verified the decoding of 20000 words for each SNR (Signal to Noise Ratio). We found the same results as in Figure 8.

A. Iterative threshold decoding
Iterative decoding of OSMLD codes was first introduced by Lucas [14]. Two years later, Lucas, with Fossorier et al. [15], proposed the belief propagation algorithm for decoding OSMLD codes. In [16], they showed that iterative Hartmann-Rudolph-Lucas decoding, originally used for decoding product codes, can be used to decode OSMLD codes. In the iterative decoding process, each decoder takes advantage of the extrinsic information produced by the other decoder in the previous step. How the extrinsic information is transmitted, and how the decoders exploit it to make their decisions, remain open questions. The iterative decoding process that we use follows Pyndiah's scheme [9][10] (Figure 13). The soft input and the soft output of the q-th step (iteration) of the iterative decoding are expressed in terms of R, the quantized received word, and W(q), the extrinsic information calculated by the previous decoder.
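The iteration loop can be sketched in Python as follows. Several elements are assumptions made for illustration and are not taken from this paper: the soft-input update R(q) = R + α(q)·W(q), which is the usual form of Pyndiah's scheme; the fixed value `alpha = 0.5`; the difference set `P` used to build the orthogonal equations; the add-min form of the extrinsic information; and the convention that positive values denote bit 0. All function names are hypothetical.

```python
N = 21
P = (0, 2, 7, 8, 11)     # ASSUMPTION: a perfect difference set modulo 21


def add_min(values):
    """Minimum absolute value, signed by the product of the input signs."""
    negative = sum(1 for v in values if v < 0) % 2 == 1
    magnitude = min(abs(v) for v in values)
    return -magnitude if negative else magnitude


def extrinsic(soft_in):
    """Extrinsic value per position: sum of add-min terms over the J equations."""
    w = []
    for k in range(N):
        total = 0.0
        for l in P:
            # Symbols of one orthogonal equation, with position k removed.
            others = [soft_in[(k - l + p) % N] for p in P if p != l]
            total += add_min(others)
        w.append(total)
    return w


def iterative_decode(r, n_ite=2, alpha=0.5):
    """Run n_ite SISO passes; return (hard decisions, soft outputs)."""
    w = [0.0] * N
    soft_out = list(r)
    for _ in range(n_ite):
        soft_in = [r[k] + alpha * w[k] for k in range(N)]   # assumed update
        w = extrinsic(soft_in)
        soft_out = [soft_in[k] + w[k] for k in range(N)]
    bits = [0 if v >= 0 else 1 for v in soft_out]
    return bits, soft_out


# All-zero word with two weakly wrong symbols.
r = [1.0] * N
r[4], r[10] = -0.3, -0.2
bits, soft_out = iterative_decode(r, n_ite=2)
```

In a fully parallel hardware realization each pass of the loop corresponds to one SISO stage, so the number of stages equals the number of iterations.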

B. Implementation of parallel iterative threshold decoding
The final design of the parallel iterative decoder is described in VHDL and implemented on an FPGA using Altera's Quartus II tool. The complexity of the global entity, i.e., the area occupied by the fully parallel iterative decoder, is shown in Table 2 for the two codes DSC (21, 11) and DSC (73, 45) with 4-bit and 5-bit quantization. Figure 14 gives an example of functional simulation showing that the information presented at the input of the decoder circuit is decoded on two rising edges of the clock for two iterations; for a number of iterations n_ite, the latency is n_ite clock cycles.
We simulated and verified the decoding of 2000 words for each SNR following a simulation protocol. Figure 15 (resp. Figure 16) shows the performance of our iterative decoder using 5-bit quantized data and a fixed α(q) for the DSC(21, 11) (resp. DSC(73, 45)) code.

Conclusions
The VHDL design and FPGA implementation of an iterative decoder for OSMLD codes have been proposed in this paper. This design achieves a very high data rate while minimizing complexity. By adopting a combinational architecture operating in pipelined and/or parallelized mode, the SISO threshold decoder responds on the first active edge of the clock, with a complexity of 3672 LEs for the DSC(21, 11) code and q = 4 bits. Furthermore, the iterative decoder built from fully parallel SISO threshold decoders achieves high data rates (2 clock cycles for two iterations) with a complexity of 7350 LEs for the DSC(21, 11) code and q = 4 bits. As future work, we plan to study the DSC turbo decoder.