Performance Evaluation of Best-Worst Selection Criteria for Genetic Algorithm

Abstract: The performance of a genetic algorithm depends on three major factors: the selection criteria, the crossover operator and the mutation operator. Each factor has its own significant role, but the selection criteria used to choose parents from the population play the key role in running the genetic algorithm. A number of selection schemes have been introduced in the literature, and each has its own advantages. Most selection criteria choose the parents that give highly optimal values, based on the theory that healthy parents produce healthy offspring. In the current study, we propose a new selection scheme that selects healthy parents as well as unhealthy parents. The novel selection scheme is simple to implement, and it has a notable ability to reduce the effect of premature convergence compared to other selection schemes. We apply this new technique along with some traditional selection schemes to six benchmark problems, and simulation studies show a remarkable performance of the proposed selection scheme.


Introduction
Genetic algorithms (GAs) are stochastic search methods that mimic natural biological evolutionary processes; they were originally proposed by John Holland in the 1960s. He published his results on GAs in the book "Adaptation in Natural and Artificial Systems" in 1975 [1]. Much further work on GAs, together with several applications, is covered in a frequently cited book by Goldberg [2]. GAs work on a population of chromosomes, which represent candidate solutions in some coded form. Selection, crossover and mutation operators are applied to successive populations of chromosomes to produce new offspring. A selection mechanism chooses the individuals that act as parents of the next generation. Once selected, these individuals produce new offspring through crossover and mutation. Finally, a replacement mechanism forms the next generation from the parents and offspring [3]. The process is repeated until a desired stopping condition is met. The flow chart of how GAs work is given in Figure 1.
Selection is the first operator in the reproduction phase of GAs. Its purpose is to select individuals from the population that will produce offspring for the next generation; the selected set is technically known as the mating pool. In simple words, selection is the process of choosing breeding stock from the population. Without a selection operator, GAs are just simple random methods that give different values every time [4]. The key idea of selection is to assess individuals and choose among them on the basis of fitness. Selection prefers fitter individuals, in analogy to Darwin's theory of evolution and the survival of the fittest, and fitness is evaluated through an adaptation function. A selection scheme that is too strong lets sub-optimal but highly fit individuals dominate the population, which reduces diversity, whereas a selection that is too weak leads to slow evolution. Typically, selection schemes are divided into two distinct types: proportionate and ordinal-based. Proportionate schemes select individuals based on their fitness values relative to the fitness of the other individuals in the population, whereas ordinal-based schemes select individuals based on their rank within the population and not on their raw fitness. A strong selection pressure may cause the GA to converge to a local optimum. On the other hand, a low selection pressure may lead to essentially random results that differ from one run to another [5]. The purpose of the selection operator is to exploit the best characteristics of good candidate solutions and improve the solutions from generation to generation, which drives the GA toward a desirable solution [2]. The selection operator is the most important factor that may affect the performance of GAs [6].
Several selection criteria exist in the literature, for example the roulette wheel scheme, stochastic universal sampling, tournament selection and Boltzmann selection, along with other schemes. However, there are no specific guidelines or theoretical support for choosing a proper selection method for a given problem. This can be a serious problem, because an improper selection method can lead to poor performance of GAs in terms of both speed and reliability. Goldberg and Deb [7] presented a comparative analysis of different selection methods used in GAs. Another comparison of selection schemes was conducted by Blickle and Thiele [8]. Kumar [9] introduced a blended selection operator that is more exploratory in the initial iterations and gradually shifts towards exploitation over time. Jebari and Madiafi [10] proposed a technique that helps reduce the dependence of the next generation on the current one and illustrated its efficiency numerically. Anand et al. [11] estimated the influence of various selection methods on the performance of genetic algorithms to assist in choosing a selection method. Recently, Pandey [12] carried out a comprehensive study of different selection techniques in GAs.
The theoretical background of traditional selection operators is discussed in Section 2, the proposed approach is given in Section 3, the benchmark functions are described in Section 4, the experimental results are presented and discussed in Section 5, and the conclusion is given in Section 6.

Some Traditional Selection Operators
There are several selection operators in the GA literature; we provide a brief description of the most commonly used ones.

Roulette Wheel Selection
Roulette wheel selection (RWS) is the traditional and simplest stochastic approach, proposed by John Holland [1]. The main concept of this method is that each individual i in the population is given a probability p(i) of being selected that is proportional to its fitness f(i): $p(i) = f(i) / \sum_{j=1}^{n} f(j)$, where n is the population size. A well-known inherent flaw of RWS is the risk of premature convergence of the GA to a local optimum. This is mainly due to the possible existence of a dominant individual that is selected as a parent most of the time.
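The proportional rule above can be sketched in a few lines of Python. This is only a minimal illustration, assuming non-negative fitness values and a maximization problem; the function name `roulette_wheel_select` and its parameters are hypothetical and do not come from the study's MATLAB setup.

```python
import bisect
import random
from itertools import accumulate


def roulette_wheel_select(fitness, k, rng=random):
    """Draw k indices, each with probability f(i) / sum_j f(j)."""
    if not fitness or any(f < 0 for f in fitness):
        raise ValueError("fitness must be a non-empty list of non-negative values")
    # cumulative[i] is the right edge of individual i's slice of the wheel
    cumulative = list(accumulate(fitness))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("at least one fitness value must be positive")
    picks = []
    for _ in range(k):
        r = rng.random() * total  # spin the wheel
        idx = bisect.bisect_right(cumulative, r)
        picks.append(min(idx, len(fitness) - 1))  # guard against float round-off
    return picks


if __name__ == "__main__":
    # Individual 2 holds 3/4 of the wheel, so it should dominate the picks.
    print(roulette_wheel_select([1.0, 0.0, 3.0], 10, random.Random(0)))
```

The dominance of a single high-fitness individual in such draws is the source of the premature convergence risk noted for RWS.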

Stochastic Universal Sampling
Stochastic universal sampling (SUS) was developed by Baker [13]. It is a variant of RWS aimed at reducing the risk of premature convergence; it is a single-phase sampling method with minimum spread and no bias. SUS can make any number of selections and is suited to situations where more than one sample must be drawn from the distribution. Figure 2 illustrates SUS with individuals mapped to contiguous segments of a line, such that each individual's segment is equal in size to its fitness, as in RWS. For N individuals to be selected, N equally spaced pointers are placed over the line. For 6 individuals (N = 6), the distance between the pointers is 1/6. The position of the first pointer is set by a single random number in the range [0, 1/6], here 0.1. After selection, the mating population consists of the individuals 1, 2, 3, 4, 6 and 8. SUS thus ensures a selection of offspring that is closer to what is deserved than RWS.
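The pointer placement can be illustrated with a short Python sketch. The function name, the optional `start` argument and the fitness values below are assumptions made for illustration; they are not the data behind Figure 2. Here the line has length equal to the total fitness rather than being normalized to 1, so the pointer spacing is total/N.

```python
import bisect
import random
from itertools import accumulate


def stochastic_universal_sampling(fitness, n_select, start=None, rng=random):
    """Select n_select indices using equally spaced pointers and one random draw."""
    cumulative = list(accumulate(fitness))  # right edges of the segments
    total = cumulative[-1]
    step = total / n_select                 # distance between pointers
    if start is None:
        start = rng.random() * step         # the single random number
    elif not 0 <= start < step:
        raise ValueError("start must lie in [0, step)")
    pointers = [start + k * step for k in range(n_select)]
    last = len(fitness) - 1
    return [min(bisect.bisect_right(cumulative, p), last) for p in pointers]


if __name__ == "__main__":
    # Hypothetical fitness values; pointers fall at 0.5, 2.5, 4.5, 6.5, 8.5.
    print(stochastic_universal_sampling([3, 1, 2, 4], 5, start=0.5))
    # prints [0, 0, 2, 3, 3]
```

In this example each individual is picked either the floor or the ceiling of its expected number of copies (1.5, 0.5, 1 and 2), which is the minimum-spread property.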

Stochastic Remainder Selection
Brindle [14] developed stochastic remainder selection (SRS). It operates on the concept that each individual's share of the mating pool should reflect its relative fitness. Every chromosome in the population has a selection probability based on its relative fitness. In SRS, the relative fitness of individual i is its fitness divided by the average fitness of the population: $P_{\mathrm{select},i} = f(i) / \bar{f}$. Parents are first selected deterministically according to the integer part of $P_{\mathrm{select},i}$, and RWS is then applied to the remaining fractional parts to fill the free places in the mating pool. SRS gives the highest probability of selection to the fittest parents of the population. Also, the average occurrence in the mating pool is computed as:
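The two phases of SRS can be sketched as follows. This is a minimal Python illustration under stated assumptions: fitness values are non-negative, the mating pool has the same size as the population, and the fractional parts are sampled with replacement. The function name is hypothetical.

```python
import math
import random


def stochastic_remainder_select(fitness, rng=random):
    """Fill a mating pool of size len(fitness) by stochastic remainder selection."""
    n = len(fitness)
    avg = sum(fitness) / n
    if avg <= 0 or any(f < 0 for f in fitness):
        raise ValueError("fitness must be non-negative with a positive mean")
    expected = [f / avg for f in fitness]  # relative fitness of each individual
    pool = []
    # Phase 1: deterministic copies from the integer parts
    for i, e in enumerate(expected):
        pool.extend([i] * math.floor(e))
    # Phase 2: roulette wheel on the fractional parts for the free places
    remaining = n - len(pool)
    if remaining > 0:
        fractions = [e - math.floor(e) for e in expected]
        weights = fractions if any(fractions) else None  # None means uniform
        pool.extend(rng.choices(range(n), weights=weights, k=remaining))
    return pool[:n]


if __name__ == "__main__":
    # Relative fitness is [2, 1, 0.5, 0.5]: two copies of 0, one of 1,
    # and the last place goes to individual 2 or 3.
    print(stochastic_remainder_select([4, 2, 1, 1], random.Random(3)))
```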

Linear Rank Selection
Baker [15] also proposed linear rank selection (LRS). LRS was introduced as a variant of RWS to avoid the premature convergence of GAs to a local optimum [7]. Here selection is based not on fitness but on the ranks of the individuals. The worst individual is given rank 1, while the best is given rank n. The selection probability is assigned linearly to the individuals according to their rank: $p(i) = p_{\mathrm{worst}} + (p_{\mathrm{best}} - p_{\mathrm{worst}}) \frac{i-1}{n-1}$ for $i = 1, \dots, n$. Here $p_{\mathrm{worst}}$ and $p_{\mathrm{best}}$ are the probabilities of the worst and best individuals being selected, respectively.
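A small Python sketch may help make the linear assignment concrete. It assumes a maximization problem and that the probabilities vary linearly from the worst-ranked to the best-ranked individual; requiring them to sum to one then forces the two end probabilities to add up to 2/n, so only the worst-individual probability is passed in. The function and parameter names are hypothetical.

```python
import random


def linear_rank_probabilities(n, p_worst):
    """Selection probability for ranks 1 (worst) to n (best), linear in rank."""
    if n < 2 or not 0 <= p_worst <= 1 / n:
        raise ValueError("need n >= 2 and 0 <= p_worst <= 1/n")
    p_best = 2 / n - p_worst  # forced by the probabilities summing to one
    return [p_worst + (p_best - p_worst) * (i - 1) / (n - 1)
            for i in range(1, n + 1)]


def linear_rank_select(fitness, k, p_worst, rng=random):
    """Draw k indices by rank rather than by raw fitness."""
    n = len(fitness)
    # order[0] is the worst individual (rank 1), order[-1] the best (rank n)
    order = sorted(range(n), key=lambda i: fitness[i])
    probs = linear_rank_probabilities(n, p_worst)
    return rng.choices(order, weights=probs, k=k)


if __name__ == "__main__":
    print(linear_rank_probabilities(5, 0.1))  # rises linearly from about 0.1 to 0.3
    print(linear_rank_select([10, 30, 20, 50, 40], 5, 0.1, random.Random(1)))
```

Because only ranks enter the computation, a single individual with a very large raw fitness cannot dominate the draws the way it can under RWS.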

Random Selection
Random selection (RS) is a very simple technique that selects parents from the population at random. In terms of disruption of genetic codes, random selection is on average a little more disruptive than RWS. In this technique, all individuals are given equal reproduction opportunities [16]. For N individuals, the selection probability is: $p(i) = 1/N$.

Tournament Selection
A form of tournament selection (TS) attributed to unpublished work by Wetzel was studied in [14], and further studies using tournament schemes can be found in a number of works [17-19]. TS is a variant of rank-based selection techniques. In TS, q individuals are chosen at random from the population, and the best-fitted of them, designated the winner, is selected for the next generation. The process is repeated m times until the new population is complete. The parameter q is known as the tournament size and is usually fixed at q = 2 (binary tournament). As shown by Bäck [20], the selection probability for individual a_i under q-tournament selection is given by:
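The tournament procedure described above can be sketched in Python as follows. The sketch assumes a maximization problem and draws the q contestants with replacement, which is one common convention and not necessarily the one used in the study; the function name is hypothetical.

```python
import random


def tournament_select(fitness, m, q=2, rng=random):
    """Run m tournaments of size q and return the index of each winner."""
    n = len(fitness)
    if n == 0 or q < 1:
        raise ValueError("need a non-empty population and q >= 1")
    winners = []
    for _ in range(m):
        # q contestants chosen at random (with replacement, by assumption)
        contestants = [rng.randrange(n) for _ in range(q)]
        # the best-fitted contestant wins the tournament
        winners.append(max(contestants, key=lambda i: fitness[i]))
    return winners


if __name__ == "__main__":
    print(tournament_select([1.0, 5.0, 3.0], m=6, q=2, rng=random.Random(0)))
```

Increasing q raises the selection pressure, since the best individual is then more likely to appear in, and win, each tournament.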

Proposed Selection Operator
We observed that the commonly used selection mechanisms described above give only the best-fitted individuals the opportunity to join the mating pool. With such selection, a considerable amount of the "genetic material" contained in bad individuals is lost. The extent of this loss is measured by the "loss of diversity" p_d, which is the proportion of individuals of a population that are not selected during the selection phase. A high loss of diversity increases the risk of premature convergence, hence it should be as low as possible. To overcome premature convergence, we propose a new selection criterion that always takes the last 25 percent of the population (the bad individuals) into the mating process.
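The loss of diversity defined above is simple to measure for a single selection step. The following Python sketch assumes that the selection step returns a list of selected indices; the function name is hypothetical.

```python
def loss_of_diversity(population_size, selected_indices):
    """Proportion of the population that was never selected."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    distinct = len(set(selected_indices))  # individuals picked at least once
    return 1 - distinct / population_size


if __name__ == "__main__":
    # 8 individuals, only 3 distinct ones were selected
    print(loss_of_diversity(8, [0, 0, 1, 7]))  # prints 0.625
```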
Also, in real life it sometimes happens that two very healthy parents do not produce a healthy offspring, while a couple in which one partner is not healthy may still produce a healthy offspring. In GAs, chromosomes are formed by joining bits, and just a slight change in the bits can turn the worst result into the best. An example supports this justification: the offspring of the two best-fitted individuals may be no better than its parents, whereas the combination of the worst-fitted and the best-fitted individuals may yield a best-fitted offspring. First, we create an initial population of individuals represented as 5-bit strings. For simplicity, let us assume a population of size 4 in which each individual is randomly generated. We produce offspring from all combinations of parents, taking an arbitrary crossover point between the 3rd and 4th bits.
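The argument can be reproduced with a toy example in Python. The four strings and the fitness function (the integer value of the bit string, so that 11111 is the global optimum) are hypothetical choices made for illustration and are not the data of Figure 3; only the 5-bit encoding, the population size of 4 and the crossover point between the 3rd and 4th bits follow the text.

```python
def one_point_crossover(a, b, point):
    """Swap the tails of two bit strings after the given position."""
    return a[:point] + b[point:], b[:point] + a[point:]


def fitness(bits):
    """Assumed fitness: the integer value of the bit string."""
    return int(bits, 2)


population = ["11100", "11000", "10001", "00011"]  # hypothetical individuals
ranked = sorted(population, key=fitness, reverse=True)
best, second, worst = ranked[0], ranked[1], ranked[-1]

# Two best parents: the offspring simply reproduce the parents.
print(one_point_crossover(best, second, 3))  # prints ('11100', '11000')

# Best and worst parents: one offspring reaches the global optimum 11111.
print(one_point_crossover(best, worst, 3))   # prints ('11111', '00000')
```

Under these assumptions the worst individual carries exactly the tail bits that the best individual is missing, which is the kind of genetic material that selection on fitness alone would discard.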
Figure 3 shows that the two most-fitted parents do not produce an offspring better than themselves, whereas selecting the best and worst strings as parents produces a globally optimal offspring. Under all other selection schemes, there is only a rare chance of selecting a worst-fitted individual into the mating pool. To take care of this scenario, we propose a novel selection method named the "Best-worst" selection criterion. Its aim is to overcome this drawback of the other selection methods by giving bad individuals a half share of the role of mating candidate. In the proposed scheme, we rank all individuals according to their fitness and then divide the population into four equal parts. Figure 4 shows how only the first and last (fourth) portions of the population of size n are called into the mating pool. The scheme gives every individual i (after ranking) of the current population a probability p(i) of being selected from the population as: Here n is the population size, in number of individuals, and is always a multiple of four. We select for mating only those parents that meet the following constraints. The pseudo-code of the Best-worst selection (BWS) scheme is given in Figure 5.
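A rough Python sketch of the quartile idea follows. It is not a transcription of the pseudo-code in Figure 5: it assumes a maximization problem, and it assumes that each mating pair takes one parent uniformly at random from the best quarter and one from the worst quarter. The function name and the pairing rule are hypothetical.

```python
import random


def best_worst_select(fitness, n_pairs, rng=random):
    """Return n_pairs of (best-quarter index, worst-quarter index) parents."""
    n = len(fitness)
    if n == 0 or n % 4 != 0:
        raise ValueError("population size must be a positive multiple of four")
    # Rank the population from best to worst
    ranked = sorted(range(n), key=lambda i: fitness[i], reverse=True)
    quarter = n // 4
    best_part = ranked[:quarter]     # first portion of the ranked population
    worst_part = ranked[-quarter:]   # fourth (last) portion
    # Assumed pairing rule: one parent from each portion, chosen uniformly
    return [(rng.choice(best_part), rng.choice(worst_part))
            for _ in range(n_pairs)]


if __name__ == "__main__":
    fit = [5, 1, 9, 3, 7, 2, 8, 4]
    # Best quarter holds indices 2 and 6, worst quarter holds indices 5 and 1.
    print(best_worst_select(fit, 4, random.Random(0)))
```

The middle half of the ranked population never enters the mating pool in this sketch, which mirrors the description that only the first and fourth portions are used.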

Benchmark Functions
In this study, we use six popular benchmark functions, most of them multimodal, that have been used in several studies. The necessary information about these functions is as follows. The Rastrigin function is taken from De Jong's standard functions [21], with the addition of cosine modulation to produce frequent local minima. It is highly multimodal, and its optimum is difficult to find because its minima are regularly distributed. In its standard two-variable form, the optimization problem is stated as follows: minimize $f(x_1, x_2) = 20 + x_1^2 + x_2^2 - 10(\cos 2\pi x_1 + \cos 2\pi x_2)$. It has a number of local optima, and the global minimum value of the function is "0" at (0, 0).
The Rosenbrock function is one of De Jong's five standard functions [21]. It is a classic optimization problem, also known as the banana function because of its distinctive shape in a contour plot. In its standard two-variable form, it is stated as follows: $f(x_1, x_2) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$. The global optimum lies inside a long, narrow, parabolic-shaped flat valley. Finding the valley is trivial, but converging to the global optimum is difficult, which is why this problem has been repeatedly used as a benchmark to test the performance of new optimization techniques and algorithms. It has a global optimum value of "0" at (1, 1).
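The next one is the Colville function, taken from [4], which has four dimensions and a highly irregular pattern of convergence. In its standard form, it is stated as follows: $f(x) = 100(x_1^2 - x_2)^2 + (x_1 - 1)^2 + (x_3 - 1)^2 + 90(x_3^2 - x_4)^2 + 10.1((x_2 - 1)^2 + (x_4 - 1)^2) + 19.8(x_2 - 1)(x_4 - 1)$. This function is highly multimodal, and its optimum points are not easy to locate because of its higher dimensionality. It has a global minimum value of "0" at (1, 1, 1, 1).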
The 2-D six-hump camel back function is a global optimization test function. Within the bounded region it has six local minima, two of which are global minima. This function, taken from [4], also has a highly irregular pattern of convergence. In its standard form, it is stated as follows: $f(x_1, x_2) = (4 - 2.1 x_1^2 + x_1^4/3) x_1^2 + x_1 x_2 + (-4 + 4 x_2^2) x_2^2$. It has a global minimum value of "-1.0316" at two different points, (-0.0898, 0.7126) and (0.0898, -0.7126).
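The Goldstein-Price (Gold-Price) function is taken from [22]. In its standard form, it is stated as follows: $f(x_1, x_2) = [1 + (x_1 + x_2 + 1)^2 (19 - 14x_1 + 3x_1^2 - 14x_2 + 6x_1x_2 + 3x_2^2)] \times [30 + (2x_1 - 3x_2)^2 (18 - 32x_1 + 12x_1^2 + 48x_2 - 36x_1x_2 + 27x_2^2)]$. The function has a global minimum value of "3" at (0, -1).
The Easom function, taken from [23], is unimodal, and its global minimum occupies a small area relative to the search space. In its standard form, it is stated as follows: $f(x_1, x_2) = -\cos(x_1)\cos(x_2)\exp(-((x_1 - \pi)^2 + (x_2 - \pi)^2))$. The function has a global minimum value of "-1" at (π, π).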
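For reference, the six benchmark functions can be written out in Python and checked against the optima quoted above. The definitions below are the standard textbook forms of these functions; it is an assumption that they match the exact forms and search domains used in the study, which are not recoverable from the text.

```python
import math


def rastrigin(x1, x2):
    return 20 + x1**2 + x2**2 - 10 * (math.cos(2 * math.pi * x1)
                                      + math.cos(2 * math.pi * x2))


def rosenbrock(x1, x2):
    return 100 * (x2 - x1**2) ** 2 + (1 - x1) ** 2


def colville(x1, x2, x3, x4):
    return (100 * (x1**2 - x2) ** 2 + (x1 - 1) ** 2 + (x3 - 1) ** 2
            + 90 * (x3**2 - x4) ** 2
            + 10.1 * ((x2 - 1) ** 2 + (x4 - 1) ** 2)
            + 19.8 * (x2 - 1) * (x4 - 1))


def six_hump_camel(x1, x2):
    return ((4 - 2.1 * x1**2 + x1**4 / 3) * x1**2 + x1 * x2
            + (-4 + 4 * x2**2) * x2**2)


def goldstein_price(x1, x2):
    a = 1 + (x1 + x2 + 1) ** 2 * (19 - 14 * x1 + 3 * x1**2 - 14 * x2
                                  + 6 * x1 * x2 + 3 * x2**2)
    b = 30 + (2 * x1 - 3 * x2) ** 2 * (18 - 32 * x1 + 12 * x1**2 + 48 * x2
                                       - 36 * x1 * x2 + 27 * x2**2)
    return a * b


def easom(x1, x2):
    return (-math.cos(x1) * math.cos(x2)
            * math.exp(-((x1 - math.pi) ** 2 + (x2 - math.pi) ** 2)))


if __name__ == "__main__":
    # Evaluate each function at the optimum quoted in the text
    print(rastrigin(0, 0), rosenbrock(1, 1), colville(1, 1, 1, 1))
    print(round(six_hump_camel(0.0898, -0.7126), 4))
    print(goldstein_price(0, -1), easom(math.pi, math.pi))
```

Since all six are minimization problems, a fitness-maximizing GA would typically work with the negated or otherwise transformed function value.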

Experimental Results and Discussion
In this section, we compute and explore the performance of the proposed selection scheme. We use the genetic algorithm tool in MATLAB to compare the proposed selection scheme with some of the traditional criteria. For this comparison, we fixed some parameters of the genetic algorithm.
All parameters were fixed in our study and are given in Table 1. In Table 2, we provide the best results for all six functions under the various selection methods. We observed that BWS works better than RWS, SRS and RS for the Rastrigin function. For the Rosenbrock function, only SRS is slightly better than BWS. The proposed scheme performs best among all schemes for the Colville function, and all methods behave in a similar pattern for the Six-hump, Goldstein-Price and Easom functions. For easier comparison, we show these results in Figure 6.
Table 3 shows the average results of the proposed scheme versus the five traditional selection methods. On average, the novel scheme performs better than RS and TS for the Rastrigin function. For the Rosenbrock function, BWS shows the best average value of all schemes. For the Colville function, only TS performs slightly better than BWS, and for the Six-hump function RWS and SUS perform better than BWS on average. Two schemes, RWS and RS, are slightly better than BWS for the Goldstein-Price function, and BWS performs best among all selection criteria for the Easom function. Figure 7 displays the comparison of average values among all selection schemes.
For a closer comparison, we also present the worst results of all the selection schemes in Table 4. We see that BWS is better than TS for the Rastrigin function and better than both TS and RS for the Rosenbrock function. In the worst-value case, only TS performs slightly better than the proposed scheme for the Colville function. For the Six-hump function, two selection schemes, RWS and SUS, perform slightly better than the proposed one. Only RS gives a better worst value than BWS for the Goldstein-Price function. For the Easom function, BWS performs best of all selection schemes in the worst-value case. Figure 8 displays the comparison of worst values among all selection schemes.

Conclusion
In this paper, a novel and efficient selection scheme for GAs is introduced. Our selection scheme chooses the better chromosomes along with the weaker ones for the next generation for the purpose of optimization. It provides a better convergence rate compared to other selection methods. The performance of each selection method in terms of best, average and worst individual fitness was assessed through a program implemented in MATLAB. A set of experiments on a selected set of multimodal test functions of varying difficulty was described. The experimental results and performance evaluation provided evidence of the improved performance of the proposed technique relative to other selection methods of GAs as a whole. Finally, changing the mating procedure from pairing two fitter individuals to pairing a best-fitted with a worst-fitted individual would allow different strategies to be pursued. For example, by pairing the best and worst individuals from the population as parents, a more exploitative search could be produced. The novel scheme was successfully applied to the optimization of a set of well-known benchmark functions, which encourages further improvements of this idea.