IJRIT International Journal of Research in Information Technology, Volume 1, Issue 4, April 2013, Pg. 171-179

International Journal of Research in Information Technology (IJRIT) www.ijrit.com

ISSN 2001-5569

FPGA Based Implementation of Compact Genetic Algorithm

1 Krupesh P. Patel, 2 Mahesh T. Parmar, 3 Markand Raval

1 PG Student, Department of Electronics and Communication, Gujarat Technological University, Chandkheda, Gujarat, India
2 PG Student, Department of Electronics and Communication, Gujarat Technological University, Chandkheda, Gujarat, India
3 PG Student, Department of Electronics and Communication, Gujarat Technological University, Chandkheda, Gujarat, India

1 [email protected], 2 [email protected], 3 [email protected]

Abstract

This paper presents an implementation of the compact genetic algorithm (CGA) on an FPGA. The CGA is a probability-vector-based genetic algorithm that requires less memory, less processing power, and much simpler hardware than a traditional genetic algorithm. Software implementations are limited by the host computer system and often cannot meet the demands of real-time applications. This paper introduces a hardware structure for the CGA. The design is realized in Verilog HDL, simulated with Xilinx ISE 10.1, and implemented on a Virtex-4 FPGA.

Keywords: Compact genetic algorithm (CGA), Hardware implementation of CGA.

1. Introduction

The Genetic Algorithm (GA) is a powerful optimization algorithm inspired by natural evolution [1]. Optimization is performed by creating a population of solutions. In the Simple Genetic Algorithm (SGA), offspring are produced by the standard genetic operators: reproduction, crossover, and mutation. In each generation, a selection scheme chooses the survivors for the next generation according to their user-defined fitness values. Through this artificial evolution, the solutions gradually improve from generation to generation. The GA process starts with a random population and iterates until a termination condition is met (the optimal solution is found, or the maximum number of generations is reached).

Over the past years, GAs have been successfully applied to many hard optimization problems. However, because of their population-based nature (they handle a set of potential solutions instead of only one), GAs must evaluate the objective function many times, so one of their main disadvantages is the long computing time required to solve complex problems. For many real-world applications, a GA can run for days, even on a high-performance workstation. Because of this extensive computation, many hardware-based GAs have been put forward [6], [7], [8]. Most of the cited works present hardware implementations of the SGA, but such implementations are quite complex and expensive and require a large memory.

The CGA, on the other hand, belongs to the probabilistic model-building genetic algorithms, also known as Estimation of Distribution Algorithms (EDAs) [2]. The CGA replaces the variation operators (crossover and mutation) with a probability vector that describes the distribution of a hypothetical population of solutions. It is known that the CGA obtains solutions of the same quality as the SGA with uniform crossover, with the advantage of a substantial reduction in memory requirements: it only needs to store the probability vector instead of the entire population [3]. Therefore, the CGA may be more useful than the SGA in memory-constrained applications [4], and its hardware implementation is much simpler than that of the SGA because the variation operators are absent.

2. The compact genetic algorithm

The CGA, originally proposed in [4], is the simplest algorithm in the Estimation of Distribution Algorithms family [7]; its main purpose is to simplify the Genetic Algorithm. It generates the offspring population from an estimated probabilistic model of the entire population instead of using the traditional recombination and mutation operations. The Probability Vector (PV) is this model: it generates the offspring according to the estimated distribution of the parent population. Each value of the PV, pi ∈ [0, 1], ∀i = 1 to L, where L is the chromosome length, measures the proportion of '1' alleles at the ith locus of the simulated population [8]. The CGA initializes each element of the PV to 0.5. Two solutions are generated using these probabilities. The fitness value of each generated individual is computed, and the PV is then updated based on these solutions. This process is repeated until the PV converges (each element reaches 0 or 1). The converged PV is the solution found by the CGA.

The algorithm is as follows. The CGA initializes the PV to 0.5, that is, pi = 0.5, ∀i = 1 to L. Next, individuals a and b are generated according to the probabilities in the PV. The fitness values of a and b are compared; the individual with the better fitness is named the winner and the other the loser. Then, if winner[i] ≠ loser[i], PV[i] is updated as follows: if winner[i] = 1, PV[i] is increased by 1/n; otherwise, PV[i] is decreased by 1/n. Here n is the population size that the CGA emulates. Note that if winner[i] = loser[i], PV[i] is not updated. The loop is repeated until every PV[i] has converged, i.e., the PV contains only zeros and ones. Finally, the PV represents the final solution. The pseudo-code of the CGA is shown in Figure 1 [5].

2.1 Pseudo code of the compact genetic algorithm (CGA)


1) Initialize probability vector
   for i = 1 to L do p[i] = 0.5;
2) Generate two individuals from the vector
   a = generate(p);
   b = generate(p);
3) Let them compete
   winner, loser = evaluate(a, b);
4) Update the probability vector toward the better one
   for i = 1 to L do
      if winner[i] != loser[i] then
         if winner[i] = 1 then p[i] = p[i] + 1/n
         else p[i] = p[i] - 1/n;
5) Check if the vector has converged
   for i = 1 to L do
      if p[i] > 0 and p[i] < 1 then return to step 2;
6) p represents the final solution

CGA parameters
   n: population size.
   L: chromosome length.

Fig.1 Pseudo code of CGA
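The pseudo-code of Fig. 1 can be sketched in software as follows. This is a minimal Python illustration, not the authors' Verilog design: the function names (`onemax`, `generate`, `compact_ga`), the `max_iters` safety limit, the tie-breaking rule (individual a wins ties), and the choice to store each p[i] as an integer count k[i] with p[i] = k[i]/n are all assumptions made for the example.

```python
import random


def onemax(bits):
    """Fitness for the one-max problem: the number of 1 bits."""
    return sum(bits)


def generate(p, rng):
    """Sample one individual: bit i is 1 with probability p[i]."""
    return [1 if rng.random() < pi else 0 for pi in p]


def compact_ga(L, n, fitness, rng, max_iters=1_000_000):
    """Run the CGA of Fig. 1 and return the final probability vector."""
    # Assumption: keep p[i] as an integer count k[i] so that p[i] = k[i] / n
    # exactly and the +-1/n updates never accumulate rounding error.
    k = [n // 2] * L                                   # step 1: p[i] = 0.5
    for _ in range(max_iters):                         # hypothetical safety cap
        p = [ki / n for ki in k]
        a, b = generate(p, rng), generate(p, rng)      # step 2
        # step 3: assumption, individual a wins ties
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        for i in range(L):                             # step 4
            if winner[i] != loser[i]:
                k[i] += 1 if winner[i] == 1 else -1
        if all(ki == 0 or ki == n for ki in k):        # step 5
            break
    return [ki / n for ki in k]                        # step 6


if __name__ == "__main__":
    result = compact_ga(L=8, n=256, fitness=onemax, rng=random.Random(0))
    print(result)
```

With the paper's parameters (n = 256, L = 8) the vector typically converges to all ones for one-max, since the individual carrying a 1 at a locus tends to win the comparison.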

3. Previous related work

This section presents some of the most representative works, including those related to Evolvable Hardware, a research area that covers both the design and the implementation of Evolutionary Algorithms on hardware platforms to execute specific tasks.

In 2001, Aporntewan and Chongstitvatana [9] proposed a CGA implementation on an FPGA using the Verilog hardware description language. The authors showed that their design runs 1000 times faster than its software version executed on a workstation. The design is composed of five modules: random number generator, probability register, comparator, buffer, and fitness function evaluator. It is based on three basic operations: addition, subtraction, and comparison. The probability vector is updated in parallel. This CGA executes one generation every three clock cycles for the one-max problem; for more complicated problems, one generation takes 3+e clocks, where e is the number of clocks used to evaluate the fitness function.

In [3], a Cellular CGA implemented on an FPGA was proposed. It consists of a set of identical CGAs. Each CGA is called a cell and interacts only with its four neighbors. Each cell exchanges probability vectors with its neighbors in an asynchronous scheme. The probability vectors are combined using an equation proposed by the authors. They argued that parallelizing the Cellular CGA is straightforward and well suited to FPGA implementation. They experimented with the one-max problem and two numerical optimization problems, demonstrating that their proposal outperforms the simple CGA.

4. Hardware Design

The hardware design is presented as a block diagram in Fig. 2. This design is based on the one presented in [9]. The components are listed below.

RNG: Random number generator
PRB: Probability vector register buffer
CMP: Comparator
BUF: Buffer
FEV: Fitness Evaluator

Fig.2 Hardware organization (population size = 256, chromosome length = 8)

4.1 Random number generator (RNG)

This component generates the pseudo-random numbers that are essential for generating individuals. The choice of random number generator affects the overall CGA design because it is the component that consumes the most FPGA resources. Random numbers can be generated by a Linear Feedback Shift Register (LFSR). An LFSR is a shift register that, when clocked, advances the signal through the register from one bit to the next most significant bit. Some of the outputs are combined in an exclusive-OR configuration to form a feedback mechanism: an LFSR is formed by XORing the outputs of two or more of the flip-flops and feeding the result back into the input of one of the flip-flops. A bit (Xn-1) is XORed with the last bit (Xn). So, at time (t+1), bit X0 is given by equation (1), where ⊕ denotes exclusive-OR.

\[ X_0(t+1) = X_n(t) \oplus X_0(t) \tag{1} \]

Figure 3 shows the LFSR used for generating random numbers. It is also possible to generate random numbers using cellular automata.

Fig 3. LFSR
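The shift-and-feedback behaviour described above can be illustrated with a small software model. The sketch below is an assumption-laden example rather than the paper's circuit: the paper does not state its register width or tap positions, so an 8-bit register with taps at bit positions 7, 5, 4 and 3 (a commonly tabulated maximal-length choice) is used, and the names `lfsr8_step` and `lfsr8_period` are hypothetical.

```python
def lfsr8_step(state):
    """Advance an 8-bit LFSR by one clock and return the new state.

    Assumption: taps at bits 7, 5, 4 and 3 (0-indexed); the paper does not
    specify which taps its LFSR uses.
    """
    # XOR the tapped bits to form the feedback bit.
    feedback = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
    # Shift every bit toward the most significant end; feedback enters bit 0.
    return ((state << 1) | feedback) & 0xFF


def lfsr8_period(seed):
    """Count the clocks until the register returns to its seed state."""
    state, steps = lfsr8_step(seed), 1
    while state != seed:
        state = lfsr8_step(state)
        steps += 1
    return steps


if __name__ == "__main__":
    print(lfsr8_period(0x01))
```

Two properties of this kind of generator matter for the design: the all-zero state maps to itself, so the seed loaded at reset must be non-zero, and with a maximal-length tap set every other 8-bit value is visited once per period.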

4.2 Probability register buffer (PRB)

This component stores the initial value of the probability vector, updates it once the winner and loser are determined, and stores the updated value in each iteration. The probability PV[i] is nominally a floating-point number. However, it can be replaced by an integer representation because the only operations performed on PV[i] are adding and subtracting 1/n. Suppose n = 256. Then an 8-bit integer is sufficient for PV[i], and the operations on that integer reduce to increment and decrement. For this reason, n must be a power of two.
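The integer representation can be sketched as follows. This is only an illustration under stated assumptions: the constant and function names are hypothetical, 0.5 is encoded as n/2 = 128, and the register is assumed to saturate at 0x00 and 0xFF, which matches the "00" / "FF" convergence values mentioned in the performance evaluation but is not spelled out in the paper.

```python
N = 256                        # emulated population size (value used in the paper)
PV_BITS = 8                    # register width, log2(N)
PV_MAX = (1 << PV_BITS) - 1    # 0xFF, assumed to stand for probability ~1
PV_INIT = N // 2               # 0x80, stands for probability 0.5


def update_pv(pv, winner_bit, loser_bit):
    """Return the next register value for one locus."""
    if winner_bit == loser_bit:
        return pv                      # bits agree: no update
    if winner_bit == 1:
        return min(pv + 1, PV_MAX)     # +1/n, assumed to saturate at 0xFF
    return max(pv - 1, 0)              # -1/n, assumed to saturate at 0x00


if __name__ == "__main__":
    print(update_pv(PV_INIT, 1, 0), update_pv(PV_INIT, 0, 1))
```

Because each update is a single increment or decrement, every PV[i] register needs only an up/down counter rather than an adder with a fractional operand.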

4.3 Comparator (CMP)

The CMP is a combinational circuit that compares the generated random number with the stored value of the probability vector. If the random number is greater than the stored value of the PV, the output is "1"; otherwise, the output is "0".

4.4 Buffer (BUF)

The buffer is a sequential circuit that determines the ith bits of the individuals "a" and "b". The buffers hold the individuals while they are being evaluated.
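How the comparator and buffer together produce an individual can be sketched in a few lines. The example assumes the bit convention given in the data-flow description of Section 5 (the stored bit is "0" when the random number exceeds PV[i], otherwise "1") and 8-bit integer values for both the random numbers and the PV registers; the function names are hypothetical.

```python
def buffered_bit(random_value, pv):
    """Bit stored in the buffer for one locus.

    Convention taken from Section 5: 0 if the random number is greater
    than PV[i], otherwise 1.
    """
    return 0 if random_value > pv else 1


def generate_individual(random_values, pv_registers):
    """Build one individual, one bit per locus, as the buffer would hold it."""
    return [buffered_bit(r, pv) for r, pv in zip(random_values, pv_registers)]


if __name__ == "__main__":
    # Four loci, all at probability 0.5 (register value 128).
    print(generate_individual([10, 200, 128, 129], [128] * 4))
```

A larger register value makes the comparison "random number greater than PV[i]" less likely, so the locus yields a 1 more often, which is the intended meaning of the probability vector.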

4.5 Fitness Evaluator (FEV)

Two fitness evaluators compute the fitness of the individuals "a" and "b" in parallel. For the one-max problem, the fitness evaluator simply counts the number of "1" bits in a binary string. The number of clocks spent in fitness evaluation varies tremendously from toy problems to hard optimization problems. To minimize it, the FEV is designed using a Read Only Memory (ROM) that stores a lookup table of all possible inputs and their corresponding outputs, so the fitness can be evaluated within one clock cycle. This drastically reduces the number of clock cycles spent in the FEV and therefore improves the overall speed.
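The ROM-based evaluator amounts to a precomputed table indexed by the chromosome. A minimal sketch for the one-max problem follows, assuming the chromosome is packed into an integer address; the names `FITNESS_ROM` and `fitness_lookup` are hypothetical.

```python
L = 8  # chromosome length used in the paper

# ROM contents: address = chromosome, data = number of 1 bits (one-max fitness).
FITNESS_ROM = [bin(address).count("1") for address in range(1 << L)]


def fitness_lookup(chromosome):
    """Single table read, standing in for a one-clock ROM access."""
    return FITNESS_ROM[chromosome]


if __name__ == "__main__":
    print(fitness_lookup(0b10110001))
```

The table has 2^L entries (256 for L = 8), so this approach trades memory for speed and grows exponentially with the chromosome length.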

5. Data Flow

The hardware CGA performs operations on a PV. Every element PV[i] is updated in parallel. The RNG, PRB and CMP units are used to generate two individuals and store them in the BUF. The FEV units evaluate the fitness of the two individuals. The CMP unit determines the winner and loser and updates the probability vector in the PRBs.

The hardware CGA works as follows. When the reset signal is received, the random number generators are seeded, the probability registers are set to 0.5, and the buffers are reset to the start state. Next, the following steps are repeated until all probability registers are zero or one.

(1) The result of the fitness evaluations determines whether an increment or decrement operation is performed on each probability register. Next, the random numbers and the probability registers are compared.
(2) The buffers store the comparison result. If the random number is greater than PV[i], the ith bit of individual "a" is set to "0"; otherwise, it is set to "1". While the buffers are clocked, new random numbers are produced simultaneously.
(3) The buffers perform the same operation as in step 2 for individual "b". In this step, the individuals are forwarded to the fitness evaluators, which are combinational circuits. The comparison of the fitness values is used to update the probability registers in step 1.

In [9], each step is executed in one clock, so the CGA executes one generation every three clock cycles for the one-max problem. In the proposed technique, all three steps are executed in one clock cycle for the one-max problem. Hence the proposed CGA is almost 3 times faster than the CGA proposed in [9]. The design is realized in the Verilog hardware description language. The population size (n) and the chromosome length (L) are set to 256 and 8, respectively. At the final stage, the design is implemented on a Xilinx Virtex-4 FPGA. The synthesis result for the one-max problem is given in Figure 4.
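The clock-count argument can be written as a small timing model. The sketch below only restates the arithmetic in the text: 3 + e clocks per generation for the design of [9] and 1 clock per generation for the proposed design. The function names are hypothetical, and the generation count in the usage example is an arbitrary illustrative value, not a measurement from the paper.

```python
def clocks_per_generation(fitness_clocks=0, single_cycle=False):
    """Clock cycles needed for one generation.

    The three-step design of [9] needs 3 + e clocks, where e is the number
    of clocks spent in fitness evaluation (e = 0 for one-max). The proposed
    design is modeled as 1 clock per generation.
    """
    return 1 if single_cycle else 3 + fitness_clocks


def run_time_seconds(generations, clock_hz, **kwargs):
    """Total run time for a given number of generations at a given clock."""
    return generations * clocks_per_generation(**kwargs) / clock_hz


if __name__ == "__main__":
    gens = 1000          # hypothetical generation count
    clock = 20e6         # 20 MHz, the operating frequency stated in the conclusion
    three_step = run_time_seconds(gens, clock)
    one_step = run_time_seconds(gens, clock, single_cycle=True)
    print(three_step / one_step)
```

The factor of three holds only when both designs run at the same clock frequency and need the same number of generations.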


6. Performance Evaluation

We chose the one-max problem to evaluate the system performance. The population size (n) and the chromosome length (L) are set to 256 and 8, respectively. Figure 5 shows that the proposed CGA requires only one clock cycle per iteration to solve the one-max problem. The previously proposed technique [9] requires three clock cycles per iteration. Hence the proposed technique is three times faster than the earlier hardware implementation. In Figure 5, an element of the PV has converged when it reaches "00" or "FF".

Target information:
   Vendor:  Xilinx
   Family:  Virtex4
   Device:  V1000FG680
   Speed:   -6

Design Summary:
   Number of Slices:             230 out of 6144     3%
   Number of Slice Flip Flops:   103 out of 12288    0%
   Number of 4 input LUTs:       428 out of 12288    3%
   Number of Bonded IOBs:         83 out of 240     34%
   Number of GCLKs:                2 out of 32       6%

Design statistics:
   Minimum period:     42.423 ns
   Maximum frequency:  23.572 MHz
   Maximum net delay:  10.537 ns

Fig 4: Synthesis result for one-max problem
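As a quick consistency check on the design statistics in Fig. 4, the maximum frequency is the reciprocal of the minimum period:

```latex
\[
f_{\max} = \frac{1}{T_{\min}} = \frac{1}{42.423\ \text{ns}} \approx 23.572\ \text{MHz}
\]
```

The 20 MHz operating frequency reported in the conclusion lies below this limit.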


Fig 5: Simulation result for one-max problem

A comparison between the software, the traditional hardware, and the proposed hardware is presented in Table 1. The software version is written in C and compiled with the gcc compiler. The software executes on a 200 MHz Ultra Sparc II running SunOS [9]. The results show that the proposed hardware is 3,000 times faster than the software and 3 times faster than the traditional hardware.

Software (200 MHz Ultra Sparc II)    Traditional Hardware    Proposed Hardware    Speed up
2:30 min.                            0.15 sec.               0.05 sec.            3,000

Table 1: A performance comparison
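The speedup figures follow directly from the run times in Table 1. The short calculation below reproduces them; the variable names are illustrative only.

```python
# Run times from Table 1, converted to seconds.
software_s = 2 * 60 + 30       # 2:30 min
traditional_hw_s = 0.15
proposed_hw_s = 0.05

speedup_vs_software = software_s / proposed_hw_s           # about 3,000
speedup_vs_traditional = traditional_hw_s / proposed_hw_s  # about 3
traditional_vs_software = software_s / traditional_hw_s    # about 1,000

if __name__ == "__main__":
    print(round(speedup_vs_software),
          round(speedup_vs_traditional),
          round(traditional_vs_software))
```

The last ratio agrees with the 1000-fold speedup that [9] reports for the traditional hardware over software.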

7. Conclusion

This paper presented a hardware implementation of the CGA. The hardware CGA is simple but effective. The operating clock frequency on the FPGA is 20 MHz. For the 8-bit one-max problem, a 3,000X speedup over a software version is achieved, and the proposed hardware CGA is 3 times faster than the traditional hardware CGA.


8. References

[1] Goldberg, D. E. “Genetic Algorithm in search, optimization and machine learning,” Addison-Wesley, 1989.
[2] Y. Jewajinda and P. Chongstitvatana, “Fpga implementation of a cellular compact genetic algorithm,” in NASA/ESA Conference on Adaptive Hardware and Systems, 2008, pp. 385–390.
[3] G. R. Harik, F. G. Lobo, and D. E. Goldberg, “The compact genetic algorithm,” vol. 3, no. 4, pp. 287–297, 1999.
[4] F. Cupertino, E. Mininno, and D. Naso, “Elitist compact genetic algorithms for induction motor self-tuning control,” in IEEE Congress on Evolutionary Computation, CEC, 2006, pp. 3057–3063.
[5] Marco A. Moreno-Armendáriz, Nareli Cruz-Cortés, “A Novel Hardware Implementation of the Compact Genetic Algorithm,” in International Conference on Reconfigurable Computing, 2010, pp. 156–161.
[6] Scott, S. and Seth, A. “HGA: A Hardware-Based Genetic Algorithm,” in Proc. of the ACM/SIGDA Third Int. Symp. on Field-Programmable Gate Arrays, pp. 53-59, 1995.
[7] Graham, P. and Nelson, B. “A Hardware Genetic Algorithm for the Traveling Salesman Problem on SPLASH 2,” in Proc. of the 5th Int. Workshop on Field Programmable Logic and Applications, pp. 352-361, 1995.
[8] Sitkoff, N., Wazlowski, M., Smith, A., and Silverman, H. “Implementing a Genetic Algorithm on a Parallel Custom Computing Machine,” in Proc. of IEEE Symp. on FPGAs for Custom Computing Machines, pp. 180-187, 1995.
[9] C. Aporntewan and P. Chongstitvatana, “A hardware implementation of the compact genetic algorithm,” in Proc. 2001 IEEE Congress Evolutionary Computation, 2001, pp. 624–629.
[10] Tiago Carvalho Oliveira, Vaifredo Pilla Junior, “An implementation of Compact Genetic Algorithm on FPGA for extrinsic evolvable hardware,” 2009.
[11] Kathleen M. Timmerman, “A Hardware Compact Genetic Algorithm for Hover Improvement in an Insect-Scale Flapping-Wing Micro Air Vehicle,” 2010.

vide an aggregate bandwidth of 4×2.5 = 10 Gb/s. We ..... Table 1 shows the person-months spent in the .... packet from the NIC, namely 40 bytes, we spend 16.