Home
Numbers
Evolution
SourceSafe
Storyboard
PVCS
Objects
Rapid
Picture

Industrial Evolution

IN THE BEGINNING, God created the Heavens and the Earth. Things got a bit tricky after that, what with non-linear dynamics and all, so he bought himself a PC and invented Genetic Algorithms. The rest, as they say, is History.

Genetic Algorithms are used to build programs which evolve in the manner of real organisms. In the natural world, successive generations of plants and animals accumulate change towards forms which are better adapted to their environment. Only a small part of any population survives to reproduce, and the intangible pressure of raw statistics favours better-adapted organisms. This is what Darwin and Wallace called 'natural selection'.

The principles governing cumulative change are universal and we can apply them to the design of problem-solving computer programs. Genetic programs are data-driven; it is the data that evolves, and the driver program is designed so that any data set will make it do something, however inappropriate. Data 'organisms' are collections of bytes or bits, and selection pressure is based on measuring how close each organism comes to solving our problem when the driver program 'runs' it.

Two things are essential for evolution: heritability and variation. The first requires that our organisms can somehow pass on traits to their offspring. We build organisms out of data items (genes) that encode discrete units of functionality, which combine to produce a meaningful net effect.  Heritability  also  implies reproduction, either by rearrangement of existing genes (asexual) or by trading genes with other organisms (sexual).

Sooner or later, just swapping genes around will exhaust all combinations in our gene pool. Also, killing off unfit organisms can eradicate genes from the pool, so we need a way to reintroduce them. We maintain vitality by introducing a source of variation, such as allowing genes to mutate occasionally during reproduction.

In nature, random mutations are mostly  harmful,  but  the  vast timescales involved produce enough benign mutations to inject a steady supply of new material into the gene pool. Damaging mutations serve no purpose in themselves, so we design our genetic material such that mutation always turns a gene into another valid gene. Mutation can be as simple as randomly flipping bits in a gene.

Popular myth paints evolution as 'survival of the fittest'. In truth, fitness is defined by which organisms survive, but genetic programming echoes the popular notion by predefining a spectrum of fitness. We choose organisms for reproduction by watching how close they come to solving our problem and assigning fitness values. Defining and measuring fitness is one of the more difficult parts of designing a genetic program.

Reproduction is where Genetic Algorithms come in. When two organisms have been chosen to reproduce, a GA combines their genes to make a new organism that inherits traits from both parents. Genetic material is usually combined using chromosome-like operations copied from nature. Chromosomes are linear sequences of genes; new combinations are made by splitting the chromosomes and interchanging corresponding segments from each parent. The new chromosomes become the offspring's genetic material. This kind of mixing is called crossover. Chrosomes bring a new level of complexity to evolution. Whether or not the order of genes are decoded, crossover has important implications for groups of genes.

Genes which are further apart are more likely to be separated when the cromosome is split, and conversely, adjacent genes are more likely to be swapped around as a single unit. This is good news for genes that go well together, but it can be stifling if unproductive groups flourish. We can avoid such problems by introducing an occasional inversion, which reorganises the chromosome to change the separations between genes.

Evolution happens over many generations. Although our seed organisms may be hopelessly inept at solving our problem, each one will have a measurable fitness. By selecting organisms which are closer to a solution, the best partial solutions will persist and these will combine to produce ever fitter organisms.

That's not the whole story, of course. At the heart of any genetic program is the design of suitable genes, and a set of measurable fitness criteria. Entire books have been written on genetic algorithms themselves. Goldberg's book Genetic Algorithms (ISBN 0-201-15767-5) is a good starting point, whilst the acknowledged implementor's bible is Koza's epic Genetic Programming (ISBN 0-26241170-5). If you want a non-computing flavour of the power of evolution, Blueprints by Edey and Johanson (ISBN 0-19-286117-4)is an excellent introduction.

Key Spinner

© 1998 - 2005 Mark Hurst. All rights reserved.   Updated 25 June 2005                             sign the guest book