Attribute Relevance Papers

A Case Study in Neural Network Training with the Breeder Genetic Algorithm.

Belanche, Ll.

Abstract

Supervised training of a feed-forward neural network from examples is a classical problem, traditionally tackled by derivative-based methods (DBM) that compute the gradient of the error, such as backpropagation. Conventional methods for non-linear optimization, such as Levenberg-Marquardt, quasi-Newton and conjugate gradient, are generally faster and more reliable, provided the objective function has continuous second derivatives. Their main drawbacks are well known, one of the more serious being the possibility of getting trapped in local minima of the error surface. As an alternative, Evolutionary Algorithms (EA) have demonstrated their ability to solve optimization tasks in a wide range of applications. However, their use in the neural network context has concentrated on binary-coded Genetic Algorithms, often without a systematic approach, and they have thus more often than not been outperformed by even simple DBMs. As a result, EAs are generally considered inadequate for this problem. In this paper the potential of the Breeder Genetic Algorithm (BGA) to solve this numerical optimization problem is thoroughly explored. A case study is developed and used to tune the BGA for this kind of task, by searching in the space of genetic operators and their parameters, on the one hand, and as a function of selection pressure and population size, on the other. It is found that specific configurations stand out over the rest, in a way that is consistent with previous findings. The importance of finding the relationship between selection pressure and population size is also highlighted. To assess to some extent the validity of the algorithm, a further batch of experiments compares the BGA to a powerful DBM, a global method consisting of conjugate gradient embedded in a simulated annealing schedule. The results show comparable performance, pointing to this evolutionary algorithm as a feasible alternative to derivative-based methods.
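
To make the setting concrete, the following is a minimal sketch of a BGA-style search over the weights of a small feed-forward network. It is not the configuration studied in the paper: the abstract does not name the operators or parameter values, so everything here is an assumption chosen for illustration. The sketch uses truncation selection with elitism, extended intermediate recombination and a BGA-style mutation (operators commonly associated with the BGA in the literature), a toy XOR task, and hypothetical names such as `bga`, `trunc` and `W_RANGE`. The `trunc` and `pop_size` arguments stand in for the selection pressure and population size mentioned above.

```python
import math
import random

# Toy XOR task: an assumed stand-in, not the paper's case study.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
N_HIDDEN = 2
N_WEIGHTS = 3 * N_HIDDEN + N_HIDDEN + 1  # (2 inputs + bias) per hidden unit, plus output layer
W_RANGE = 10.0  # assumed bound: offspring weights are clamped to [-W_RANGE, W_RANGE]


def forward(w, x):
    """2-N_HIDDEN-1 network with tanh hidden units and a logistic output."""
    hidden = [math.tanh(w[3 * h] * x[0] + w[3 * h + 1] * x[1] + w[3 * h + 2])
              for h in range(N_HIDDEN)]
    base = 3 * N_HIDDEN
    out = sum(w[base + h] * hidden[h] for h in range(N_HIDDEN)) + w[base + N_HIDDEN]
    return 1.0 / (1.0 + math.exp(-out))


def mse(w):
    """Mean squared error over the training set: the objective to minimise."""
    return sum((forward(w, x) - t) ** 2 for x, t in DATA) / len(DATA)


def recombine(p, q, rng, d=0.25):
    """Extended intermediate recombination: each gene is drawn on, or slightly
    beyond, the segment joining the two parents' genes."""
    return [a + rng.uniform(-d, 1.0 + d) * (b - a) for a, b in zip(p, q)]


def mutate(w, rng, rate, step=0.1 * 2 * W_RANGE, k=16):
    """BGA-style mutation: small steps are far more likely than large ones."""
    out = []
    for v in w:
        if rng.random() < rate:
            delta = sum(2.0 ** -i for i in range(k) if rng.random() < 1.0 / k)
            v += rng.choice((-1.0, 1.0)) * step * delta
        out.append(max(-W_RANGE, min(W_RANGE, v)))
    return out


def bga(pop_size=50, trunc=0.2, generations=200, seed=0):
    """Return the best weight vector and the best error seen at each generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(N_WEIGHTS)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=mse)
        history.append(mse(pop[0]))
        # Truncation selection: only the best fraction `trunc` may breed.
        parents = pop[: max(2, int(trunc * pop_size))]
        children = [pop[0]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            children.append(mutate(recombine(p, q, rng), rng, rate=1.0 / N_WEIGHTS))
        pop = children
    pop.sort(key=mse)
    history.append(mse(pop[0]))
    return pop[0], history


if __name__ == "__main__":
    best, history = bga()
    print("initial best error:", history[0])
    print("final best error:  ", history[-1])
```

Because the best individual is copied unchanged into the next generation, the recorded best error never increases. Varying `trunc` and `pop_size` together gives a rough feel for the interplay between selection pressure and population size that the abstract highlights, although no particular outcome is implied here.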