Memetic algorithm local search and memetic algorithm example

multiobjective memetic algorithm with local aggregate meta-model and real-coded memetic algorithms with crossover hill-climbing
ZoeTabbot Profile Pic
Published Date:13-07-2017
Your Website URL(Optional)
Chapter 1 Memetic Algorithms 1.1 Introduction Back in the late 60s and early 70s, several researchers laid the founda- tions of what we now know as evolutionary algorithms 75, 108, 218, 227 (EAs). Inthesealmostfourdecades,anddespitesomehardbeginnings,most researchers interested in search or optimization both from the applied and the theoretical standpoints have grown to know and accept the existence andindeed the usefulnessof thesetechniques. Thishas been also the case for other related techniques, such as simulated annealing 122 (SA), tabu search 83 (TS), etc. The name metaheuristics is used to collectively term these techniques. It was in late 80s that the term `Memetic Algorithms' 178 (MAs) was given birth to denote a family of metaheuristics that tried to blend several conceptsfromtightlyseparatedatthattimefamiliessuchasEAsandSA. Theadjective`memetic'comesfromtheterm'meme', coinedbyR.Dawkins 62 to denote an analogous to the gene in the context of cultural evolution. Quoting Dawkins: \Examplesofmemesaretunes,ideas,catch-phrases,clothesfash- ions, ways of making pots or of building arches. Just as genes propagate themselves in the gene pool by leaping from body to body via sperms or eggs, so memes propagate themselves in the meme pool by leaping from brain to brain via a process which, in the broad sense, can be called imitation." The above quote illustrates the central philosophy of MAs: individual improvement plus populational cooperation. As it was the case for classical EAs, MAs had to su®er tough initial times, but they are now becoming increasingly popular, as the reader may check by taking a quick look at the review of current work in MAs done at the end of this chapter. It is often the case that MAs are used under a di®erent name (`hybrid EAs' and 12 'Lamarckian EAs' are two popular choices for this). Not quite surprisingly in a rapidly expanding ¯eld as this is, one can also ¯nd the term MA used in the context of particular algorithmic subclasses, arguably di®erent from those grasped in the initial de¯nition of MAs. This point will be tackled in next section; anticipating further de¯nitions, we can say that a MA is a search strategy in which a population of optimizing agents synergistically cooperate and compete 189. A more detailed description of the algorithm, as well as an functional template will be given in Section 1.2. As mentioned before, MAs are a hot topic nowadays, mainly due to their success in solving many hard optimization problems. A particular featureofMAsisgreatlyresponsibleforthis: unliketraditionalEvolutionary Computation(EC)methods,MAsareintrinsicallyconcernedwithexploiting all available knowledge about the problem under study; this is something thatwasneglectedinEAsforalongtime, despitesomecontraryvoicessuch as Hart and Belew 100, and most notably Davis 61. The formulation of the so-called No-Free-Lunch Theorem (NFL) by Wolpert and Macready 247 made it de¯nitely clear that a search algorithm strictly performs in accordance with the amount and quality of the problem knowledge they incorporate, thus backing up one of the leiv motivs of MAs. The exploitation of problem-knowledge can be accomplished in MAs in a by incorporating heuristics, approximation algorithms, local search tech- niques, specialized recombination operators, truncated exact methods, etc. Also, animportantfactoristheuseofadequaterepresentationsoftheprob- lem being tackled. These issues are of the foremost interest from an applied viewpoint, and will be dealt in Section 1.3. As important as the basic algorithmic considerations about MAs that will be presented below, a more applied perspective of MAs is also provided in Section 1.4. The reader may be convinced of the wide applicability of thesetechniquesbyinspectingthenumerousresearchpaperspublishedwith regard to the deployment of MAs on the most diverse domains. We will pay specialattentiontotheapplicationofMAsinEngineering-relatedendeavors. This chapter will end with a brief summary of the current research trends in MAs, with special mention to those emerging application ¯elds in which MAs are to play a major role in the near future. 1.2 The MA Search Template Asmentionedintheprevioussection,MAstrytoblendtogetherconcepts from di®erentmetaheuristics, suchas EAs and SA for instance. Let us start by those ideas gleaned from the former. MAs are like EAs population-based metaheuristics. This means that the algorithm maintain a population of solutions for the problem at hand, i.e., a pool comprising several solutions simultaneously. Each of these solu-3 tions is termed individual in the EA jargon, following the nature-inspired metaphor upon which these techniques are based. In the context of MAs, the denomination agent seems more appropriate for reasons that will be ev- ident later in this section. When clear from the context, both terms will be used interchangeably. Eachindividualoragentrepresentsatentativesolutionfortheproblem underconsideration. Thesesolutionsaresubjecttoprocessesofcompetition and mutual cooperation in a way that resembles the behavioral patterns of living beings from a same species. To make clearer this point, it is ¯rstly necessarytoconsiderthehigh-leveltemplateofthebasicpopulationalevent: a generation. This is shown below in Fig. 1.1. Process Do-Generation ("pop:individual ) variables breeders;newpop:Individual ; begin breedersà Select-From-Population(pop); newpopà Generate-New-Population(breeders); popà Update-Population (pop, newpop) end Figure 1.1: The basic generational step As it can be seen, each generation consists of the updating of a popula- tion of individuals, hopefully leading to better and better solutions for the problem being tackled. There are three main components in this genera- tional step: selection, reproduction, and replacement. The ¯rst component (selection) is responsible (jointly with the replacement stage) for the com- petition aspects of individuals in the population. Using the information provided by an ad hoc guiding function (¯tness function in the EA termi- nology), the goodness of individuals in pop is evaluated; subsequently, a sample of individuals is selected for reproduction according to this goodness measure. This selection can be done in a variety of ways. The most popular techniques are ¯tness-proportionate methods (the probability of selecting 1 an individual for breeding is proportional to its ¯tness ), rank-based meth- ods (the probability of selecting an individual depends on its position after rankingthewholepopulation), andtournament-basedmethods(individuals are selected on the basis of a direct competition within small sub-groups of individuals). Replacement is very related to this competition aspect, as mentioned above. This component takes care of maintaining the population at a con- 1 Maximization is assumed here. In case we were dealing with a minimization problem, ¯tness should be transformed so as to obtain an appropriate value for this purpose, e.g., subtracting it from the highest possible value of the guiding function4 stant size. To do so, individuals in the older population are substituted by the newly-created ones (obtained from the reproduction stage) using some speci¯c criterion. Typically, this can be done by taking the best (according totheguidingfunction)individualsbothfrompopandnewpop(theso-called \plus" replacement strategy), or by simply taking the best individuals from newpopandinsertingtheminpopsubstitutingtheworstones(the\comma" strategy). In the former case, if jpopj = jnewpopj then the replacement is termed generational; if jnewpopj is small (say jnewpopj = 1), then we have a steady-state replacement. Maybe the most interesting aspect in this generation process is the in- termediate phase of reproduction. At this stage, we have to create new individuals (or agents) by using the existing ones. This is done by utilizing a number of reproductive operators. Many di®erent such operators can be used in a MA, as illustrated in the general pseudocode shown in Fig. 1.2. Nevertheless,themosttypicalsituationinvolvesutilizingjusttwooperators: recombination and mutation. Process Generate-New-Population (pop:Individual ;op:Operator )Individual variables buffer :Individual ; j :1::jopj; begin buffer0Ãpop; for jà 1:jopj do bufferjà Apply-Operator (opj, bufferj¡1); endfor; return buffern op end Figure 1.2: Generating the new population. Recombination is a process that encapsulates the mutual cooperation among several individuals (typically two of them, but a higher number is possible 72). This is done by constructing new individuals using the infor- mation contained in a number of selected parents. If it is the case that the resulting individuals (the o®spring) are entirely composed of information taken from the parents, then the recombination is said to be transmitting 211. This is the case of classical recombination operators for bitstrings such as single-point crossover, or uniform crossover 233. This property captures the a priori role of recombination as previously enunciated, but it can be di±cult to achieve for certain problem domains (the Traveling Sales- man Problem TSPisatypicalexample). Inthosesituations, itispossible to consider other properties of interest such as respect or assortment. The5 former refers to the fact that the recombination operator generate descen- dants carrying all features (i.e., basic properties of solutions with relevance for the problem attacked) common to all parents; thus, this property can be seen as a part of the exploitative side of the search. On the other hand, assortment represents the exploratory side of recombination. A recombina- tion operator is said to be properly assorting if, and only if, it can generate descendants carrying any combination of compatible features taken from the parents. The assortment is said to be weak if it is necessary to perform several recombinations within the o®spring to achieve this e®ect. Several interesting concepts have been introduced in this description of recombination, namely, relevant features and cooperation. We will return to these points in the next section. Before that, let us consider the other oper- ator mentioned above: mutation. From a classical point of view (at least in thegenetic-algorithmarena84),thisisasecondaryoperatorwhosemission is to keep to pot boiling, continuously injecting new material in the popu- lation, but at a low rate (otherwise the search would degrade to a random walk in the solution space). Evolutionary-programming practitioners 75 would disagree with this characterization, claiming a central role for muta- tion. Actually, it is considered the crucial part of the search engine in this context. In essence, a mutation operator must generate a new solution by partly modifying an existing solution. This modi¯cation can be random as it is typically the case or can be endowed with problem-dependent information so as to bias the search to probably-good regions of the search space. It is preciselyinthelightofthislatterpossibilitythatoneofthemostdistinctive components of MAs is introduced: local-improvers. To understand their philosophy, let us consider the following abstract formulation: ¯rst of all, assume a mutation operator that performs a random minimal modi¯cation inasolution;nowconsiderthegraphwhoseverticesaresolutions,andwhose edges connect pairs of vertices such that the corresponding solutions can be 2 obtained via the application of the mutation operator on one of them . A local-improver is a process that starts at a certain vertex, and moves to an adjacent vertex, provided that the neighboring solution is better that the current solution. This is illustrated in Fig. 1.3. As it can be seen, the local-improver tries to ¯nd an \uphill" (in terms of improving the value provided by the guiding function F ) path in the g graphwhosede¯nitionwassketchedbefore. Theformalnameforthisgraph is ¯tness landscape 115. Notice that the length of the path found by the local-improver is determined by means of a Local-Improver-Termination- Criterion function. A usual example is terminating the path when no more uphill movements are possible (i.e., when the current solution is a local 2 Typically this graph is symmetrical, but in principle there is no problem in assuming it to be asymmetrical.6 Process Local-Improver ("current:Individual,op:Operator) variables new :Individual begin repeat newà Apply-Operator(op, current); if (F (new)Á F (current)) then g F g currentÃnew; endif until Local-Improver-Termination-Criterion(); return current; end Figure 1.3: Pseudocode of a Local-Improver optimum with respect to op). However, this is not necessarily the case always. For instance, the path can be given a maximum allowed length, or it can be terminated as soon as the improvement in the value of the guiding function is considered good enough. For this reason, MAs cannot be characterized as \EAs working in the space of local-optima with respect to a certain ¯tness landscape"; that would be an unnecessarily restricted de¯nition. Thelocal-improveralgorithmcanbeusedindi®erentpartsofthegenera- tionprocess,foritisnothingelsethanjustanotheroperator. Forexample,it can be inserted after the utilization of any other recombination or mutation operator; alternatively, it could be just used at the end of the reproductive stage. See ? for examples of these settings. 3 As said before, the utilization of this local-improver is one of the most characteristicfeaturesofMAs. Itispreciselybecauseoftheuseofthismech- anismforimprovingindividualsonalocal(andevenautonomous)basisthat the term `agent' is deserved. Thus, the MA can be viewed as a collection of agents performing an autonomous exploration of the search space, co- operating some times via recombination, and competing for computational resources due to the use of selection/replacement mechanisms. Afterhavingpresentedtheinnardsofthegenerationprocess,wecannow have access to the larger picture. The functioning of a MA consists of the iteration of this basic generational step, as shown in Fig. 1.4. Several comments must be made with respect to this general template. First of all, the Generate-Initial-Population process is responsible for cre- ating the initial set of jpopj con¯gurations. This can be done by simply 3 We use the term in singular, but notice that several di®erent local-improvers could be used in di®erent points of the algorithm.7 Process MA ()Individual variables pop:Individual ; begin popà Generate-Initial-Population(); repeat popà Do-Generation (pop) if Converged(pop) then popà Restart-Population(pop); endif until MA-Termination-Criterion() end Figure 1.4: The general template of a MA generating jpopj random con¯gurations or by using a more sophisticated seeding mechanism (for instance, some constructive heuristic), by means of which high-quality con¯gurations are injected in the initial population 232 147. Another possibility, the Local-Improver presented before could be used as shown in Fig. 1.5: Process Generate-Initial-Population (¹:N)Individual variables pop:Individual ; ind:Individual; j :1::¹; begin for jà 1:¹ do indà Generate-Random-Solution(); popjà Local-Improver (ind); endfor return pop end Figure 1.5: Injecting high-quality solutions in the initial population. There is another interesting element in the pseudocode shown in Fig. 1.4: the Restart-Population process. This process is very important in or- der to make an appropriate use of the computational resources. Consider that the population may reach a state in which the generation of new im- proved solution be very unlikely. This could be the case when all agents in thepopulationareverysimilartoeachother. Inthissituation,thealgorithm will probably expend most of the time resampling points in a very limited8 region of the search space 48, with the subsequent waste of computational e®orts. This phenomenon is known as convergence, and it can be identi¯ed using measures such as Shannon's entropy 60. If this measure falls below a prede¯ned threshold, the population is considered at a degenerate state. This threshold depends upon the representation of the problem being used (number of values per variable, constraints, etc.) and hence must be deter- mined in an ad-hoc fashion. A di®erent possibility is using a probabilistic approach to determine with a desired con¯dence that the population has converged. For example, in 111 a Bayesian approach is presented for this purpose. Oncethepopulationisconsideredtobeatadegeneratestate,therestart process is invoked. Again, this can be implemented in a number of ways. A very typical strategy is keeping a fraction of the current population (this fraction can be as small as one solution, the current best), and substituting the remaining con¯gurations with newly generated (from scratch) solutions, as shown in Fig. 1.6: Process Restart-Population (pop:Individual )Individual variables newpop:Individual ; j;preserved:1::jpopj; begin preservedÃjpopj¢%PRESERVE; for jà 1:preserved do th newpopjà i Best(pop, j); endfor for jÃ(preserved+1):jpopj do newpopjà Generate-Random-Con¯guration(); newpopjà Local-Improver (newpopj); endfor; return newpop end Figure 1.6: A possible re-starting procedure for the population. The above process completes the functional description of MAs. Obvi- ously, it is possible to conceive some ad-hoc modi¯cations of this template thatstillcouldbecataloguedasMA.Thereadercanneverthelessbeensured that any such algorithm will follow the general philosophy depicted in this section, and could be possibly rewritten so as to match this template.9 1.3 Design of E®ective MAs The general template of MAs we have depicted in the previous section mustbeinstantiatedwithprecisecomponentsinordertobeusedforsolving an speci¯c problem. This instantiation has to be done carefully so as to obtain an e®ective optimization tool. We will address some design issues in this section. A¯rstobviousremarkmustbedone: thereexistnogeneralapproachfor the design of e®ective MAs. This fact admits di®erent proofs depending on theprecisede¯nitionofe®ective inthepreviousstatement. Suchproofsmay involveclassicalcomplexityresultsandconjecturesif`e®ective'isunderstood as `polynomial-time', the NFL Theorem if we consider a more general set of performance measures, and even Computability Theory if we relax the de¯nition to arbitrary decision problems. For these reasons, we can only de¯ne several design heuristics that will likely result in good-performing MAs, but without explicit guarantees for this. Having introduced this point of caution, the ¯rst element that one has to decide is the representation of solutions. At this point it is necessary to introduce a subtle but important distinction here: representation and codi¯cation are di®erent things. The latter refers to the way solutions are internally stored, and it can be chosen according to memory limitations, manipulation complexity, and other resource-based considerations. On the contrary, the representation refers to an abstract formulation of solutions, relevant from the point of view of the functioning of reproductive operators. This duality was present in discussions contemporary to the early debate on MAs (e.g., see 210), and can be very well-exempli¯ed in the context of permutational problems. For instance, consider the TSP; solutions can be internally encoded as permutations, but if a edge-recombination operator is used (e.g., 150) then solutions are de facto represented as edge lists. The above example about the TSP also serves for illustrating one of the properties of representations that must be sought. Consider that a permutationcanbeexpressedusingdi®erentinformationunits;forinstance, itcanbedeterminedonthebasisofthespeci¯cvaluesofeachposition. This istheposition-based representationofpermutations84. Ontheotherhand, it can be determined on the basis of adjacency relationships between the elements of the permutation. Since the TSP is de¯ned by a matrix of inter- city distances, it seems that edges are more relevant for this problem than absolute positions in the permutation. In e®ect, it turns out that operators manipulating this latter representation perform better than operators that manipulatepositionssuchaspartially-mappedcrossover 85(PMX)orcycle crossover 191 (CX). There have been several attempts for quantifying how good a certain set of information units is for representing solutions for a speci¯c problems. We can cite a few of them:10 ² Minimizing epistasis: epistasis can be de¯ned as the non-additive in- °uence on the guiding function of combining several information units (see 59 for example). Clearly, the higher this non-additive in°uence, the lower the absolute relevance of individual information units. Since thealgorithmwillbeprocessingsuchindividualunits(orsmallgroups of them), the guiding function turns out to be low informative, and prone to misguide the search. ² Minimizing ¯tness variance 212: This criterion is strongly related to thepreviousone. The¯tnessvariancefor acertain informationunitis the variance of the values returned by the guiding function, measured across a representative subset of solutions carrying this information unit. Byminimizingthis¯tness variance, theinformationprovidedby the guiding function is less noisy, with the subsequent advantages for the guidance of the algorithm. ² Maximizing ¯tness correlation: In this case a certain reproductive op- erator is assumed, and the correlation in the values of the guiding function for parents and o®spring is measured. If the ¯tness correla- tion is high, good solutions are likely to produce good solutions, and thus the search will gradually shift toward the most promising regions of the search space. Again, there is a clear relationship with the pre- vious approaches; for instance, if epistasis (or ¯tness variance) is low, then solutions carrying speci¯c features will have similar values for the guiding function; since the reproductive operators will create new solutionsbymanipulatingthesefeatures, theo®springislikelytohave a similar guiding value as well. Obviously, the description of these approaches may appear somewhat idealized,buttheunderlyingphilosophyiswellillustrated. Itmustbenoted that selecting a representation is not an isolated process, but it has a strong liaison with the task of choosing appropriate reproductive operators for the MA. Actually, according to the operator-based view of representations de- scribed above, the existence of multiple operators may imply the consider- ation of di®erent representations of the problem at di®erent stages of the reproductive phase. We will come back to this issue later in this section. In order to tackle the operator-selection problem, we can resort to ex- isting operators, or design new ad hoc operators. In the former case, a suggested line of action could be the following 49: 1. We start from a set of existing operators ­ = f ; ;¢¢¢ ; g. The 1 2 k ¯rst step is identifying the representation of the problem manipulated by each of these operators. 2. Use any of the criterions presented for measuring the goodness of the representation.11 3. Select from ­, such that the representation manipulated by is i i the more trustable. This is called inverse analysis of operators since some kind of inverse engineering is done in order to evaluate the potential usefulness of each operator. Thealternativewouldbea direct analysis inwhichnewoperators would be designed. This could be do as follows: 1. Identify di®erent potential representation for the problem at hand (e.g., recall the previous example on the TSP). 2. Useanyofthecriterionspresentedformeasuringthegoodnessofthese representation. 0 0 0 0 3. Create new operators ­ =f ; ;¢¢¢ ; g via the manipulation of 1 2 m the most trustable information units. In order to accomplish the last step of the direct analysis, there exists a number of templates for the manipulation of abstract information units. 3 Forexample, thetemplatesknownasrandom respectful recombination (R ), Random Assorting Recombination (RAR), and Random Transmitting Re- combination (RTR) have been de¯ned in 211. An example of the success- ful instantiation of some of these templates using the direct analysis in the context of °owshop scheduling can be found in 52. Thegenerictemplatesmentionedaboveareessentiallyblind. Thismeans that they do not use problem-dependent information at any stage of their functioning. Thisuseofblindrecombinationoperatorsistraditionallyjusti- ¯edonthegroundsofnotintroducingexcessivebiasinthesearchalgorithm, thus preventing extremely fast convergence to suboptimal solutions. How- ever, this is a highly arguable point since the behavior of the algorithm is in fact biased by the choice of representation. Even if we neglect this fact, it can be reasonable to pose the possibility of quickly obtaining a subopti- malsolutionandrestartingthealgorithm, ratherthanusingblindoperators for a long time in pursuit of an asymptotically optimal behavior (not even guaranteed in most cases). Reproductiveoperatorsthatuseproblemknowledgearecommonlytermed heuristic or hybrid. In these operators, problem information is utilized to guide the process of producing the o®spring. There are numerous ways to achieve this inclusion of problem knowledge; in essence, we can identify two major aspects into which problem knowledge can be injected: the selection of the parental features that will be transmitted to the descendant, and the 4 selection of non-parental features that will be added to it . 4 Notice that the use of the term `parental information' does not imply the existence of more than one parent. In other words, the discussion is not restricted to recombination operators, but may also include mutation operators.12 Withrespecttotheselectionofparentalfeaturestobeinjectedintheo®- spring, there exists evidence that respect (transmission of common features, as mentioned in the previous section) is bene¯cial for some problems (e.g., see51150). Afterthisinitialtransmission, theo®springcanbecompleted in several ways. For example, Radcli®e and Surry 212 have proposed the 5 use of local-improvers or implicit enumeration schemas . This is done by ¯rstly generating a partial solution by means of a non-heuristic procedure; subsequently, two approaches can be used: ² locally-optimal completion: the child is completed at random, and a local-improver is used restricted to those information units added for completion. ² globally-optimal completion: an implicit enumeration schema is used inorderto¯ndthegloballybestcombinationofinformationunitsthat can be used to complete the child. Related to the latter approach, the implicit enumeration schema can be used to ¯nd the best combination of the information units present in the parents. The resulting recombination would thus be transmitting, but not necessarily respectful for these two properties are incompatible in general. However, respect can be enforced by restricting the search to non-common features. Noticethatthiswouldnotbeglobally-optimalcompletionsincethe wholesearchisrestrictedtoinformationcomprisedintheparents. Thesetof solutions that can be constructed using this parental information is termed dynastic potential, and for this reason this approach is termed dynastically optimal recombination 56 (DOR). This operator is monotonic in the sense that any child generated is at least as good as the best parent. Problem-knowledge need not be necessarily included via iterative algo- rithms. On the contrary, the use of constructive heuristics is a popular choice. A distinguished example is the Edge Assembly Crossover (EAX) 186. EAX is a specialized operator for the TSP (both for symmetric and asymmetric instances) in which the construction of the child comprises two- phases: the ¯rst one involves the generation of an incomplete child via the so-called E-sets (subtours composed of alternating edges from each parent); subsequently,thesesubtoursaremergedintoasinglefeasiblesubtoursusing a greedy repair algorithm. The authors of this operator reported impressive results in terms of accuracy and speed. It has some similarities with the recombination operator proposed in 179. To some extent, the above discussion is also applicable to mutation op- erators, although these exhibit a clearly di®erent role: they must introduce new information. This means that purely transmitting mechanisms would 5 Actually, these approaches can be used even when no initial transmission of common features is performed.13 notbeacceptableforthispurpose. Nevertheless,itisstillpossibletousethe ideas described in the previous paragraphs by noting that the `partial so- lution' mentioned in several situations can be obtained by simply removing some information units from a single solution. A completion procedure as described before can then be used in order to obtain the mutated solution. Once we have one or more knowledge-augmented reproductive opera- tors, it is necessary to make them work in a synergistic fashion. This is a feature of MAs that is also exhibited by other metaheuristics such as vari- able neighborhood search (VNS) 98, although it must me emphasized that it was already included in the early discussions of MAs, before the VNS metaheuristic was formulated. We can quote from 177: \Anotheradvantagethatcanbeexploitedisthatthemostpower- fulcomputersinthenetworkcanbedoingthemosttime-consuming heuristics, while others are using a di®erent heuristics. The pro- gram to do local search in each individual can be di®erent. This enriches the whole, since what is a local minima for one of the computers is not a local minima for another in the network. Dif- ferent heuristics may be working ¯ne due to di®erent reasons. The collective use of them would improve the ¯nal output. In a distributed implementation we can think in a division of jobs, dividing the kind of moves performed in each computing individ- ual. It leads to an interesting concept, where instead of dividing the physical problem (assignment of cities/cells to processors) we divide the set of possible moves. This set is selected among the most e±cient moves for the problem." This idea of synergistically combining di®erent operators (and indeed di®erentsearchtechniques)wasexempli¯edatitsbestbyApplegate, Bixby, Cook, and Chvatal in 1998. They established new breakthrough results for theMin TSP which supports our view that MAs will have a central role as a problem solving methodology. This team solved to optimality an instance of the TSP of 13,509 cities corresponding to all U.S. cities with populations 6 of more than 500 people . The approach, according to Bixby: \...involves ideas from polyhedral combinatorics and combinatorial optimization, integer and linear programming, computer science data structures and algorithms, parallel computing, software engineering, numerical analysis, graph theory, and more". Their approach can possibly be classi¯ed as the most complex MA ever built for a given combinatorial optimization problem. Theseideashavebeenfurtherdevelopedinarecentunpublishedmanuscript, \Finding Tours in the TSP" by the same authors (Bixby et al.), available from their web site. They present results on running an optimal algorithm for solving the Min Weighted Hamiltonian Cycle Problem in a sub- 6 See: graph formed by the union of 25 Chained Lin-Kernighan tours. The ap- proach consistently ¯nds the optimal solution to the original Min TSP instances with up to 4461 cities. They also attempted to apply this idea to an instance with 85,900 cities (the largest instance in TSPLIB) and from that experience they convinced themselves that it also works well for such large instances. Theapproachofrunningalocalsearchalgorithm(ChainedLinKernighan) to produce a collection of tours, following by the dynastical-optimal recom- bination method the authors named tour merging gave a non-optimal tour of only 0.0002 % excess above the proved optimal tour for the 13,509 cities instance. We take this as a clear proof of the bene¯ts of the MA approach and that more work is needed in developing good strategies for complete memetic algorithms, i.e., those that systematically and synergistically use randomized and deterministic methods and can prove optimality. Wewouldliketoclosethissectionbyemphasizingonceagaintheheuris- tic nature of the design principles described in this section. The most in- teresting thing to note here is not the fact that they are just probably-good principles, but the fact that there is still much room for research in method- ological aspects of MAs (e.g., see 125). The open-philosophy of MAs make them suitable for incorporating mechanisms from other optimization tech- niques. In this sense, the reader may ¯nd a plethora of new possibilities for MA design by studying other metaheuristics such as TS, for example. 1.4 Applications of MAs This section will provide an overview of the numerous applications of MAs. This overview is far from exhaustive since new applications are being developed continuously. However, it is intended to be illustrative of the practical impact of these optimization techniques. 1.4.1 NP-hard Combinatorial Optimization problems Traditional NP Optimization problems constitute one of the most typi- calbattle¯eldsofMAs. Aremarkablehistoryofsuccesseshasbeenreported withrespecttotheapplicationofMAstoNP¡hardproblemssuchasthefol- lowing: GraphPartitioning2122159162163,MinNumberPar- titioning 16 17, Max Independent Set 3 102 225, Bin-Packing 219, Min Graph Coloring 44 47 70 74, Set Covering 12, Min Generalised Assignment 41, Multidimensional Knapsack 13 53 91,NonlinearIntegerProgramming234,QuadraticAssignment 20 35 157 161 162,Quadratic Programming 164166,Set Par- titioning 138, and particularly on the Min Travelling Salesman Problem and its variants 79 78 88 89 90 109 119 128 156 158 162 165 181 213 222 .15 Regarding the theory of NP-Completeness, most of them can be cited as \classical" as they appeared in Karp's notorious paper 117 on the re- ducibility of combinatorial problems. Remarkably, in most of them the authors claim that they have developed the best heuristic for the problem at hand. This is important since these problems have been addressed with several with di®erent approaches from the combinatorial optimization tool- box and almost all general-purpose algorithmic techniques have been tested on them. The MA paradigm is not limited to the above mentioned classical prob- lems. There exist additional \non-classical" combinatorial optimization problems of similar or higher complexity in whose resolution MAs have re- vealed themselves as outstanding techniques. As an example of these prob- lems, one can cite partial shape matching 196, Kau®man NK Landscapes 160, spacecraft trajectory design 57, minimum weighted k-cardinality tree subgraph problem 18, minimum k-cut problem 251, uncapacitated hub lo- cation 2, placement problems 110 134 226, vehicle routing 15 113, transportation problems 82 190, and task allocation 97. Another important class of combinatorial optimization problems are those that directly or indirectly correspond to telecommunication network problems. For example, we can cite: frequency allocation 55 118, network design 81 224, degree-constrained minimum spanning tree problem 214, vertex-biconnectivity augmentation 120, assignment of cells to switches in cellular mobile networks 209, and OSPF routing 23, Obviously, this list is by no means complete since its purpose is sim- ply to document the wide applicability of the approach for combinatorial optimization. 1.4.2 Scheduling Problems Undoubtedly, scheduling problems are one of the most important opti- mization domains due to its practical implications. They thus deserve sepa- rate mention, despite they could be included in the NP-hard class surveyed in the previous subsection. MAs have been used to tackle a large variety of scheduling problems. We can cite the following: maintenance scheduling 28 29 30, open shop scheduling 4073142,°owshopscheduling 1036183184,totaltardi- ness single machine scheduling 153, single machine scheduling with setup- times and due-dates 76 137 170, parallel machine scheduling 38 39 154 172, project scheduling 188 197 215, warehouse scheduling 240, production planning 67 173, timetabling 24 25 26 27 31 87 145 175 176 200 201 216, rostering 63 174, and sport games scheduling 46.16 1.4.3 Machine Learning and Robotics Machine learning and robotics are two closely related ¯elds since the di®erent tasks involved in the control of robots are commonly approached using arti¯cial neural networks and/or classi¯er systems. MAs, generally cited as \genetic hybrids" have been used in both ¯elds, i.e., in general optimizationproblemsrelatedtomachinelearning(forexample,thetraining of arti¯cial neural networks), and in robotic applications. With respect to theformer, MAshavebeenappliedto neural network training 1112179 236 249, pattern recognition 4, pattern classi¯cation 132 169, and analysis of time series 71 193. As to the application of MAs to robotics, work has been done in reac- tive rulebase learning in mobile agents 54, path planning 192 205 248, manipulator motion planning 221, time optimal control 37, etc. 1.4.4 Engineering, Electronics and Electromagnetics Electronics and engineering are also two ¯elds in which these methods have been actively used. For example, with regard to engineering problems, workhasbeendoneinthefollowingareas: structure optimization 250, sys- tem modeling 239, fracture mechanics 198, aeronautic design 19 208, trim loss minimization 194, tra±c control 231, power planning 237, cal- ibration of combustion engines 123 204, and process control 45 254. Astopracticalapplicationsinthe¯eldofelectronicsandelectromagnet- ics 42, the following list can illustrate the numerous areas in which these techniques have been utilized: semiconductor manufacturing 121, circuit design 6 7 94 99 244, circuit partitioning 5 computer aided design 14, multilayered periodic strip grating 9, analogue network synthesis 92, service restoration 8, optical coating design 107, and microwave imaging 33 203. 1.4.5 Molecular Optimization Problems Wehaveselectedthisparticularclassofcomputationalproblems,involv- ing nonlinear optimization issues, to help the reader to identify a common trendintheliterature. Unfortunately,theauthorscontinuereferringtotheir technique as `genetic', although they are closer in spirit to MAs 106. TheCaltechreportthatgaveitsnametothe,atthattimeincipient,¯eld of MAs 177 discussed a metaheuristic which can be viewed as a hybrid of GAs and SA developed with M.G. Norman in 1988. In recent years, severalpapersappliedhybridsofGAswithSAorothermethodstoavariety of molecular optimization problems 11 58 64 69 80 93 114 116 135 140 146 148 171 155 199 228 229 235 245 253 255. Hybrid population approaches like this can hardly be catalogued as being `genetic', but this denomination has appeared in previous work by Deaven17 and Ho 65 and then cited by J. Maddox in Nature 149. Other ¯elds of application include cluster physics 187. Additional work has been done in 66 104 105 206 207 245. Other evolutionary approaches to a variety of molecular problems can be found in: 69 101 103 152 168 217238. Theirusefordesignproblemsisparticularlyappealing43116 246. They have also been applied in protein design 68 136, structure prediction 126 127 131, and alignment 34 (see also the discussion in 179 and the literature review in 106). This¯eldisenormouslyactive,andnewapplicationdomainsforMAsare continuously emerging. Among these, we must mention applications related to genomic analysis, such as clustering gene-expression pro¯les 167, or inferring phylogenetic trees 50. 1.4.6 Other Applications In addition to the application areas described above, MAs have been also utilized in other ¯elds such as, for example, medicine 95 96 241, economics 139 195, oceanography 185, mathematics 220 242 243, imaging science and speech processing 32 133 141 151 223 252, etc. For further information about MA applications we suggest querying bibliographical databases or web browsers for the keywords `memetic al- gorithms' and `hybrid genetic algorithm'. We have tried to be illustrative ratherthanexhaustive,pointingoutsomeselectedreferencesforwell-known application areas. This means that, with high probability, many important contributions may have been inadvertently left out. 1.5 Conclusions and Future Directions We believe that MAs have very favorable perspectives for their devel- opment and widespread application. Such a belief is grounded on several reasons. First of all, MAs are showing a great record of e±cient implemen- tations, providing very good results in practical problems as the reader may have checked by inspecting the previous section. We also have reasons to believe that we are near some major leaps forward in our theoretical un- derstanding of these techniques, including for example the worst-case and average-case computational complexity of recombination procedures. On the other hand, the ubiquitous nature of distributed systems, like networks of workstations for example, plus the inherent asynchronous parallelism of MAs and the existence of web-conscious languages like Java are all together an excellent combination to develop highly portable and extendable object- oriented frameworks allowing algorithmic reuse. Wealsoseeasahealthysignthesystematicdevelopmentofotherpartic- ular optimization strategies. If any of the simpler metaheuristics (SA, TS, VNS, GRASP, etc.) performs the same as a more complex method (GAs,18 MAs, Ant Colonies, etc.), an \elegance design" principle should prevail and we must either resort to the simpler method, or to the one that has less free parameters,ortotheonethatiseasiertoimplement. Suchafactshoulddefy ustoadaptthecomplexmethodologytobeatasimplerheuristic,ortocheck if that is possible at all. Anunhealthysign of currentresearch, however, are the attempts to encapsulate metaheuristics on stretched con¯nements. We think that there are several \learned lessons" from work in other metaheuristics. For instance, a Basic Tabu Search scheme (83) decides to accept another new con¯guration (whether a feasible solution or not) with- out restriction to the relative objective function value of the two solutions. This has lead to good performance in some con¯guration spaces where evo- lutionary methods and Simulated Annealing perform poorly. A classical example of this situation is the Min Number Partitioning problem 17. There are many open lines of research in areas such as co-evolution. In 180 we can ¯nd the following quotation: \It may be possible that a future generation of MAs will work in at least two levels and two time scales. In the short-time scale, a set of agents would be searching in the search space associated to the problem while the long-time scale adapts the heuristics associated with the agents. Our work with D. Holstein which will be presented in this book might be classi¯ed as a ¯rst step in this promising direction. However, it is reasonable to think that more complex schemes evolving solutions, agents, as well as representations, will soon be implemented." At that time, we were referring to the use of a metaheuristic called Guided Local Search used in 109 as well as the possibility of co-evolving the neigh- borhood techniques by other means. Unfortunately, this was not studied in depth in Holstein's thesis ?. However, a number of more recent articles are paving the way to more robust MAs 34, 124, 129, 130. Krasnogor has recently introduced the term multimeme algorithms to identify those MAs that also adaptively change the neighborhood de¯nition 131, and with col- leagues is applying the method for the di±cult problem of protein structure prediction 127. Smith also presents a recent study on these issues in 230. More work is necessary, and indeed the protein folding models they are using are a good test-bed for the approach. However, we also hope that the researchers should again concentrate MAs for large-scale challenging instances of the TSP, possibly following the approaches of using population structures 182, 77, self-adapting local search, 128 as well as the powerful recombinationoperatorsthathavebeendevisedforTSPinstances109,156, 165, 179. We have also identi¯ed some problems with evolutionary search methods in instances of the TSP in which the entries of the distance matrix have a large number of decimal digits. This means that there is an inherent problemtobesolved,forevolutionarymethodstodealwith¯tnessfunctionsSection 1.5. Bibliography 19 that have so many decimal digits. Traditional rank-based or ¯tness-based selection schemes to keep new solutions in the current population fail. It would be then reasonable to investigate whether some ideas from basic TS mechanisms could be adapted to allow less stringent selection approaches. Multiparent recombination is also an exciting area to which research e®orts can be directe4d too. From 202 we can read: \The strategy developed by Lin 143 for the TSP is to obtain several local optima and then identify edges that are common to all of them. These are then ¯xed, thus reducing the time to ¯nd more local optima. This idea is developed further in 144 and 86." It is intriguing that such an strategy, which has been aroundformorethanthreedecades,isstillnotacceptedbysome researchers. We think that the use of multiparent recombination with proven good properties is one of the most challenging issues for future developement in MAs, as well as for the whole EC paradigm.20 Chapter 1. Memetic Algorithms