Genetic Algorithms: Theory and Applications

Lecture Notes, Third Edition, Winter 2003/2004
by Ulrich Bodenhofer
Fuzzy Logic Laboratorium Linz-Hagenberg
Tel.: +43 732 2468 9194, Fax: +43 732 2468 1351

Preface

This is a printed collection of the contents of the lecture "Genetic Algorithms: Theory and Applications" which I gave first in the winter semester 1999/2000 at the Johannes Kepler University in Linz.

The sources were manifold: Chapters 1 and 2 were written originally for these lecture notes. All examples were implemented from scratch. The third chapter is a distillation of the books of Goldberg [22] and Hoffmann [26] and a handwritten manuscript of the preceding lecture on genetic algorithms which was given by Andreas Stöckl in 1993 at the Johannes Kepler University. Chapters 4, 6, and 8 contain adaptations of previously published material from my own master thesis and a series of lectures which was given by Francisco Herrera and myself at the Second Summer School on Advanced Control at the Slovak Technical University, Bratislava, in summer 1997 [5], with the only exception of Subsection 8.3.3, which is a summary of Chapter 7 of Peter Haslinger's master thesis [24]. Chapter 5 was extracted from a recent book by my dear colleagues O. Cordón, F. Herrera, F. Hoffman, and L. Magdalena [13]. Chapter 7 was written originally, however, strongly influenced by A. Geyer-Schulz's works and H. Hörner's paper on his C++ GP kernel [29].

I would like to thank all the students that have attended my lectures on genetic algorithms so far, for contributing much to these lecture notes with their vivid, interesting, and stimulating questions, objections, and discussions. Last but not least, I want to express my sincere gratitude to Sabine Lumpi and Susanne Saminger for support in organizational matters, and Peter Bauer for proof-reading.

Ulrich Bodenhofer, October 2003.

Contents

1 Basic Ideas and Concepts
  1.1 Introduction
  1.2 Definitions and Terminology
2 A Simple Class of GAs
  2.1 Genetic Operations on Binary Strings
    2.1.1 Selection
    2.1.2 Crossover
    2.1.3 Mutation
    2.1.4 Summary
  2.2 Examples
    2.2.1 A Very Simple One
    2.2.2 An Oscillating One-Dimensional Function
    2.2.3 A Two-Dimensional Function
    2.2.4 Global Smoothness versus Local Perturbations
    2.2.5 Discussion

3 Analysis
  3.1 The Schema Theorem
    3.1.1 The Optimal Allocation of Trials
    3.1.2 Implicit Parallelism
  3.2 Building Blocks and the Coding Problem
    3.2.1 Example: The Traveling Salesman Problem
  3.3 Concluding Remarks

4 Variants
  4.1 Messy Genetic Algorithms
  4.2 Alternative Selection Schemes
  4.3 Adaptive Genetic Algorithms
  4.4 Hybrid Genetic Algorithms
  4.5 Self-Organizing Genetic Algorithms

5 GA Variants for Real-Valued Optimization Problems
  5.1 Real-Coded GAs
    5.1.1 Crossover Operators for Real-Coded GAs
    5.1.2 Mutation Operators for Real-Coded GAs
  5.2 Evolutionary Strategies
    5.2.1 Recombination in ESs
    5.2.2 Mutation in ESs
    5.2.3 Selection and Sampling in ESs
  5.3 Evolutionary Programming
    5.3.1 Original EP
    5.3.2 D. B. Fogel's Modified EP
    5.3.3 Selection and Sampling in EP

6 Tuning of Fuzzy Systems Using Genetic Algorithms
  6.1 Tuning of Fuzzy Sets
    6.1.1 Coding Fuzzy Subsets of an Interval
    6.1.2 Coding Whole Fuzzy Partitions
    6.1.3 Standard Fitness Functions
    6.1.4 Genetic Operators
  6.2 A Practical Example
    6.2.1 The Fuzzy System
    6.2.2 The Optimization of the Classification System
    6.2.3 Concluding Remarks
  6.3 Finding Rule Bases with GAs

7 Genetic Programming
  7.1 Data Representation
    7.1.1 The Choice of the Programming Language
  7.2 Manipulating Programs
    7.2.1 Random Initialization
    7.2.2 Crossing Programs
    7.2.3 Mutating Programs
    7.2.4 The Fitness Function
  7.3 Fuzzy Genetic Programming (FGP)
  7.4 A Checklist for Applying Genetic Programming

8 Classifier Systems
  8.1 Introduction
  8.2 Holland Classifier Systems
    8.2.1 The Production System
    8.2.2 The Bucket Brigade Algorithm
    8.2.3 Rule Generation
  8.3 Fuzzy Classifier Systems of the Michigan Type
    8.3.1 Directly Fuzzifying Holland Classifier Systems
    8.3.2 Bonarini's ELF Method
    8.3.3 An Improved FCS
    8.3.4 Online Modification of the Whole Knowledge Base

Bibliography

List of Figures

2.1 A graphical representation of roulette wheel selection
2.2 One-point crossover of binary strings
2.3 The function f_2
2.4 A surface plot of the function f_3
2.5 The function f_4 and its derivative
3.1 Hypercubes of dimensions 1-4
3.2 A hyperplane interpretation of schemata for n = 3
3.3 Minimal deceptive problems
4.1 A messy coding
4.2 Positional preference
4.3 The cut and splice operation
6.1 Piecewise linear membership function with fixed grid points
6.2 Simple fuzzy sets with piecewise linear membership functions
6.3 Simple fuzzy sets with smooth membership functions
6.4 A fuzzy partition with N = 4 trapezoidal parts
6.5 Example for one-point crossover of fuzzy partitions
6.6 Mutating a fuzzy partition
6.7 Magnifications of typical representatives of the four types of pixels
6.8 Clockwise enumeration of neighbor pixels
6.9 Typical gray value curves corresponding to the four types
6.10 The linguistic variables v and e
6.11 Cross sections of a function of type (5.2)
6.12 A comparison of results obtained by several different optimization methods
6.13 A graphical representation of the results
7.1 The tree representation of (+ (* 3 X) (SIN (+ X 1)))
7.2 The derivation tree of (NOT (x OR y))
7.3 An example for crossing two binary logical expressions
7.4 An example for mutating a derivation tree
8.1 Basic architecture of a classifier system of the Michigan type
8.2 The bucket brigade principle
8.3 An example for repeated propagation of payoffs
8.4 A graphical representation of the table shown in Figure 7.3
8.5 Matching a fuzzy condition
8.6 The creation of fuzzy messages in the improved FCS
8.7 The matching procedure of the modified fuzzy classifier system

Chapter 1
Basic Ideas and Concepts

    Growing specialization and diversification have brought a host of monographs and textbooks on increasingly specialized topics. However, the "tree" of knowledge of mathematics and related fields does not grow only by putting forth new branches. It also happens, quite often in fact, that branches which were thought to be completely disparate are suddenly seen to be related.

        Michiel Hazewinkel

1.1 Introduction

Applying mathematics to a problem of the real world mostly means, at first, modeling the problem mathematically, maybe with hard restrictions, idealizations, or simplifications, then solving the mathematical problem, and finally drawing conclusions about the real problem based on the solutions of the mathematical problem.

For about 60 years, a shift of paradigms has taken place; in some sense, the opposite way has come into fashion. The point is that the world has done well even in times when nothing about mathematical modeling was known.
More specifically, there is an enormous number of highly sophisticated processes and mechanisms in our world which have always attracted the interest of researchers due to their admirable perfection. To imitate such principles mathematically and to use them for solving a broader class of problems has turned out to be extremely helpful in various disciplines. Just briefly, let us mention the following three examples:

Artificial Neural Networks (ANNs): Simple models of nerve cells (neurons) and the way they interact; can be used for function approximation, machine learning, pattern recognition, etc. (e.g. [38, 48]).

Fuzzy Control: Humans are often able to control processes for which no analytic model is available. Such knowledge can be modeled mathematically by means of linguistic control rules and fuzzy sets (e.g. [32, 47]).

Simulated Annealing: A robust probabilistic optimization method mimicking the solidification of a crystal under slowly decreasing temperature; applicable to a wide class of problems (e.g. [35, 45]).

The fourth class of such methods will be the main object of study in these lectures: Genetic Algorithms (GAs).

The world as we see it today, with its variety of different creatures, its individuals highly adapted to their environment, with its ecological balance (under the optimistic assumption that there is still one), is the product of a three-billion-year experiment we call evolution, a process based on sexual and asexual reproduction, natural selection, mutation, and so on [14]. If we look inside, the complexity and adaptability of today's creatures has been achieved by refining and combining the genetic material over a long period of time.

Generally speaking, genetic algorithms are simulations of evolution, of whatever kind. In most cases, however, genetic algorithms are nothing else than probabilistic optimization methods which are based on the principles of evolution.

This idea first appeared in 1967 in J. D. Bagley's thesis "The Behavior of Adaptive Systems Which Employ Genetic and Correlative Algorithms" [1].
The theory and applicability were then strongly influenced by J. H. Holland, who can be considered the pioneer of genetic algorithms [27, 28]. Since then, this field has witnessed a tremendous development. The purpose of this lecture is to give a comprehensive overview of this class of methods and their applications in optimization, program induction, and machine learning.

1.2 Definitions and Terminology

As a first approach, let us restrict ourselves to the view that genetic algorithms are optimization methods. In general, optimization problems are given in the following form:

    Find an x_0 ∈ X such that f is maximal in x_0, where f : X → R is an
    arbitrary real-valued function, i.e.

        f(x_0) = max_{x ∈ X} f(x).                                    (1.1)

In practice, it is sometimes almost impossible to obtain global solutions in the strict sense of (1.1). Depending on the actual problem, it can be sufficient to have a local maximum or to be at least close to a local or global maximum. So, let us assume in the following that we are interested in values x where the objective function f is "as high as possible".

The search space X can be seen in direct analogy to the set of competing individuals in the real world, where f is the function which assigns a value of "fitness" to each individual (this is, of course, a serious simplification).

In the real world, reproduction and adaptation are carried out on the level of genetic information. Consequently, GAs do not operate on the values in the search space X, but on some coded versions of them (strings for simplicity).

1.1 Definition. Assume S to be a set of strings (in non-trivial cases with some underlying grammar). Let X be the search space of an optimization problem as above; then a function

    c : X → S,  x ↦ c(x)

is called a coding function. Conversely, a function

    c̃ : S → X,  s ↦ c̃(s)

is called a decoding function.

In practice, coding and decoding functions, which have to be specified depending on the needs of the actual problem, are not necessarily bijective. However, it is in most cases useful to work with injective decoding functions (we will see examples soon).
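To make Definition 1.1 concrete, a coding/decoding pair (c, c̃) can be sketched for the common special case X = [a, b] ⊆ R and S = {0, 1}^n. The following is a minimal Python illustration, not part of the original lecture notes; the helper name make_coding and the uniform-grid scheme are our own assumptions.

```python
def make_coding(a, b, n):
    """Build a coding/decoding pair for the interval [a, b] using
    binary strings of length n (a hypothetical pair c, c~ in the
    sense of Definition 1.1)."""
    levels = 2 ** n - 1  # number of grid steps between a and b

    def code(x):
        """c : X -> S, maps x to the string of the nearest grid point."""
        k = round((x - a) / (b - a) * levels)
        return format(k, "0{}b".format(n))

    def decode(s):
        """c~ : S -> X, maps a string back to its grid point in [a, b]."""
        return a + int(s, 2) / levels * (b - a)

    return code, decode

code, decode = make_coding(0.0, 1.0, 8)
# Coding a decoded string returns the string unchanged:
assert code(decode("10101010")) == "10101010"
```

With this scheme, c is surjective but not injective (a whole subinterval of X is mapped to each string), whereas c̃ is injective, and the grid spacing (b - a)/(2^n - 1) bounds the error introduced by coding.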
Moreover, the following equality is often supposed to be satisfied:

    (c ∘ c̃) ≡ id_S                                                    (1.2)

Finally, we can write down the general formulation of the encoded maximization problem:

    Find an s_0 ∈ S such that f̃ = f ∘ c̃ is as large as possible.

The following table gives a list of different expressions, which are common in genetics, along with their equivalent in the framework of GAs:

    Natural Evolution      Genetic Algorithm
    genotype               coded string
    phenotype              uncoded point
    chromosome             string
    gene                   string position
    allele                 value at a certain position
    fitness                objective function value

After this preparatory work, we can write down the basic structure of a genetic algorithm.

1.2 Algorithm.

    t := 0;
    Compute initial population B_0;
    WHILE stopping condition not fulfilled DO
    BEGIN
        select individuals for reproduction;
        create offspring by crossing individuals;
        possibly mutate some individuals;
        compute new generation
    END

As is obvious from the above algorithm, the transition from one generation to the next consists of four basic components:

Selection: Mechanism for selecting individuals (strings) for reproduction according to their fitness (objective function value).

Crossover: Method of merging the genetic information of two individuals; if the coding is chosen properly, two good parents produce good children.

Mutation: In real evolution, the genetic material can be changed randomly by erroneous reproduction or other deformations of genes, e.g. by gamma radiation. In genetic algorithms, mutation can be realized as a random deformation of the strings with a certain probability. The positive effect is preservation of genetic diversity and, as a consequence, that local maxima can be avoided.

Sampling: Procedure which computes a new generation from the previous one and its offspring.

Compared with traditional continuous optimization methods, such as Newton or gradient descent methods, we can state the following significant differences:

1. GAs manipulate coded versions of the problem parameters instead of the parameters themselves, i.e.
the search space is S instead of X itself.

2. While almost all conventional methods search from a single point, GAs always operate on a whole population of points (strings). This contributes much to the robustness of genetic algorithms. It improves the chance of reaching the global optimum and, vice versa, reduces the risk of becoming trapped in a local stationary point.

3. Normal genetic algorithms do not use any auxiliary information about the objective function, such as derivatives. Therefore, they can be applied to any kind of continuous or discrete optimization problem. The only thing to be done is to specify a meaningful decoding function.

4. GAs use probabilistic transition operators, while conventional methods for continuous optimization apply deterministic transition operators. More specifically, the way a new generation is computed from the actual one has some random components (we will see later, with the help of some examples, what these random components are like).

Chapter 2
A Simple Class of GAs

    Once upon a time a fire broke out in a hotel, where just then a scientific conference was held. It was night and all guests were sound asleep. As it happened, the conference was attended by researchers from a variety of disciplines. The first to be awakened by the smoke was a mathematician. His first reaction was to run immediately to the bathroom, where, seeing that there was still water running from the tap, he exclaimed: "There is a solution". At the same time, however, the physicist went to see the fire, took a good look, and went back to his room to get an amount of water which would be just sufficient to extinguish the fire. The electronic engineer was not so choosy and started to throw buckets and buckets of water on the fire. Finally, when the biologist awoke, he said to himself: "The fittest will survive" and went back to sleep.
        Anecdote originally told by C. L. Liu

In this chapter, we will present a very simple but extremely important subclass: genetic algorithms working with a fixed number of binary strings of fixed length. For this purpose, let us assume that the strings we consider are all from the set

    S = {0, 1}^n,

where n is obviously the length of the strings. The population size will be denoted with m in the following. Therefore, the generation at time t is a list of m strings which we will denote with

    B_t = (b_{1,t}, b_{2,t}, ..., b_{m,t}).

All GAs in this chapter will obey the following structure:

2.1 Algorithm.

    t := 0;
    Compute initial population B_0 = (b_{1,0}, ..., b_{m,0});
    WHILE stopping condition not fulfilled DO
    BEGIN
        FOR i := 1 TO m DO
            select an individual b_{i,t+1} from B_t;
        FOR i := 1 TO m-1 STEP 2 DO
            IF Random[0, 1] ≤ p_C THEN
                cross b_{i,t+1} with b_{i+1,t+1};
        FOR i := 1 TO m DO
            possibly mutate b_{i,t+1};
        t := t + 1
    END

Obviously, selection, crossover (done only with a probability of p_C here), and mutation are still degrees of freedom, while the sampling operation is already specified. As is easy to see, every selected individual is replaced by one of its children after crossover and mutation; unselected individuals die immediately. This is a rather common sampling operation, although other variants are known and reasonable.

In the following, we will study the three remaining operations: selection, crossover, and mutation.

2.1 Genetic Operations on Binary Strings

2.1.1 Selection

Selection is the component which guides the algorithm to the solution by preferring individuals with high fitness over low-fitted ones. It can be a deterministic operation, but in most implementations it has random components.

One variant, which is very popular nowadays (we will give a theoretical explanation of its good properties later), is the following scheme, where the probability to choose a certain individual is proportional to its fitness. It can be regarded as a random experiment with

    P[b_{j,t} is selected] = f(b_{j,t}) / Σ_{k=1}^{m} f(b_{k,t}).      (2.1)

Of course, this formula only makes sense if all the fitness values are positive. If this is not the case, a non-decreasing transformation ϕ : R → R^+ must be applied (a shift in the simplest case). Then the probabilities can be expressed as

    P[b_{j,t} is selected] = ϕ(f(b_{j,t})) / Σ_{k=1}^{m} ϕ(f(b_{k,t})).  (2.2)

We can force the property (2.1) to be satisfied by applying a random experiment which is, in some sense, a generalized roulette game. In this roulette game, the slots are not equally wide, i.e. the different outcomes can occur with different probabilities. Figure 2.1 gives a graphical hint how this roulette wheel game works.

Figure 2.1: A graphical representation of roulette wheel selection, where the number of alternatives m is 6. The numbers inside the arcs (0.208, 0.083, 0.167, 0.251, 0.208, 0.083) correspond to the probabilities with which the alternatives are selected.

The algorithmic formulation of the selection scheme (2.1) can be written down as follows, analogously for the case of (2.2):

2.2 Algorithm.

    x := Random[0, 1];
    i := 1;
    WHILE i < m & x > Σ_{j=1}^{i} f(b_{j,t}) / Σ_{j=1}^{m} f(b_{j,t}) DO
        i := i + 1;
    select b_{i,t};

For obvious reasons, this method is often called proportional selection.

2.1.2 Crossover

In sexual reproduction, as it appears in the real world, the genetic material of the two parents is mixed when the gametes of the parents merge. Usually, chromosomes are randomly split and merged, with the consequence that some genes of a child come from one parent while others come from the other parent.

This mechanism is called crossover. It is a very powerful tool for introducing new genetic material and maintaining genetic diversity, but with the outstanding property that good parents also produce well-performing children or even better ones. Several investigations have come to the conclusion that crossover is the reason why sexually reproducing species have adapted faster than asexually reproducing ones.

Basically, crossover is the exchange of genes between the chromosomes of the two parents.
In the simplest case, we can realize this process by cutting two strings at a randomly chosen position and swapping the two tails. This process, which we will call one-point crossover in the following, is visualized in Figure 2.2.
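The cut-and-swap procedure just described can be sketched in a few lines of Python. This is a minimal illustration under our own naming, not code from the lecture notes; strings are assumed to be plain Python strings over {0, 1}, and the cut position is drawn from 1, ..., n-1 so that both tails are non-empty.

```python
import random

def one_point_crossover(parent1, parent2, rng=random):
    """Cut both strings at one random position and swap the tails."""
    n = len(parent1)
    assert len(parent2) == n and n >= 2
    pos = rng.randrange(1, n)  # cut between positions pos and pos+1
    child1 = parent1[:pos] + parent2[pos:]
    child2 = parent2[:pos] + parent1[pos:]
    return child1, child2

children = one_point_crossover("11111111", "00000000")
# Whatever the cut, one child is all ones up to the cut and all zeros
# after it, the other the reverse; together the children carry exactly
# the parents' genes, position by position.
```

Restricting the cut position to 1, ..., n-1 excludes the trivial cuts at the very ends, for which the children would simply be copies of the parents.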