IN THE BEGINNING, God created the Heavens and the Earth. Things got a bit tricky after that, what with non-linear dynamics and all, so he bought himself a PC and invented Genetic
Algorithms. The rest, as they say, is History. Genetic Algorithms are used to build programs which evolve in the manner of real organisms. In the natural world, successive generations of plants and animals
accumulate change towards forms which are better adapted to their environment. Only a small part of any population survives to reproduce, and the intangible pressure of raw statistics favours better-adapted organisms.
This is what Darwin and Wallace called 'natural selection'. The principles governing cumulative change are universal and we can apply them to the design of problem-solving computer programs. Genetic
programs are data-driven; it is the data that evolves, and the driver program is designed so that any data set will make it do something, however inappropriate. Data 'organisms' are collections of bytes or bits,
and selection pressure is based on measuring how close each organism comes to solving our problem when the driver program 'runs' it. Two things are essential for evolution: heritability and variation. The
first requires that our organisms can somehow pass on traits to their offspring. We build organisms out of data items (genes) that encode discrete units of functionality, which combine to produce a meaningful net
effect. Heritability also implies reproduction, either by rearrangement of existing genes (asexual) or by trading genes with other organisms (sexual). Sooner or later, just swapping genes
around will exhaust all combinations in our gene pool. Also, killing off unfit organisms can eradicate genes from the pool, so we need a way to reintroduce them. We maintain vitality by introducing a source of
variation, such as allowing genes to mutate occasionally during reproduction. In nature, random mutations are mostly harmful, but the vast timescales involved produce enough benign
mutations to inject a steady supply of new material into the gene pool. Damaging mutations serve no purpose in themselves, so we design our genetic material such that mutation always turns a gene into another valid
gene. Mutation can be as simple as randomly flipping bits in a gene. Popular myth paints evolution as 'survival of the fittest'. In truth, fitness is defined by which organisms survive, but genetic
programming echoes the popular notion by predefining a spectrum of fitness. We choose organisms for reproduction by watching how close they come to solving our problem and assigning fitness values. Defining and
measuring fitness is one of the more difficult parts of designing a genetic program. Reproduction is where Genetic Algorithms come in. When two organisms have been chosen to reproduce, a GA combines their
genes to make a new organism that inherits traits from both parents. Genetic material is usually combined using chromosome-like operations copied from nature. Chromosomes are linear sequences of genes; new combinations
are made by splitting the chromosomes and interchanging corresponding segments from each parent. The new chromosomes become the offspring's genetic material. This kind of mixing is called crossover. Chrosomes bring a
new level of complexity to evolution. Whether or not the order of genes are decoded, crossover has important implications for groups of genes. Genes which are further apart are more likely to be
separated when the cromosome is split, and conversely, adjacent genes are more likely to be swapped around as a single unit. This is good news for genes that go well together, but it can be stifling if unproductive
groups flourish. We can avoid such problems by introducing an occasional inversion, which reorganises the chromosome to change the separations between genes. Evolution happens over many generations.
Although our seed organisms may be hopelessly inept at solving our problem, each one will have a measurable fitness. By selecting organisms which are closer to a solution, the best partial solutions will persist and
these will combine to produce ever fitter organisms. That's not the whole story, of course. At the heart of any genetic program is the design of suitable genes, and a set of measurable fitness criteria.
Entire books have been written on genetic algorithms themselves. Goldberg's book Genetic Algorithms (ISBN 0-201-15767-5) is a good starting point, whilst the acknowledged implementor's bible is Koza's epic
Genetic Programming (ISBN 0-26241170-5). If you want a non-computing flavour of the power of evolution, Blueprints by Edey and Johanson (ISBN 0-19-286117-4)is an excellent introduction. |