CQFD - 2016


Section: New Results

Prediction of Expected Performance for a Genetic Programming Classifier

The following result has been obtained by Pierrick Legrand (Inria CQFD) in collaboration with Y. Martínez, L. Trujillo and E. Galván-López.

Estimating problem difficulty is an open issue in genetic programming (GP). The goal of this work is to generate models that predict the expected performance of a GP-based classifier when it is applied to an unseen task. Classification problems are described using domain-specific features, some of which are proposed in this work, and these features are given as input to the predictive models. These models are referred to as predictors of expected performance. We extend this approach with an ensemble of specialized predictors (SPEP): classification problems are divided into groups, and the SPEP corresponding to a problem's group is used to make the prediction. The proposed predictors are trained using 2D synthetic classification problems with balanced datasets. The models are then used to predict the performance of the GP classifier on unseen real-world datasets that are multidimensional and imbalanced. This work is the first to provide a performance prediction of a GP system on test data, whereas previous works focused on predicting training performance. Accurate predictive models are generated by posing a symbolic regression task and solving it with GP. These results are achieved by using highly descriptive features and including a dimensionality reduction stage that simplifies the learning and testing process. The proposed approach could be extended to other classification algorithms and used as the basis of an expert system for algorithm selection.
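The idea of routing a problem to a specialized predictor can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: the single feature (a hypothetical "class overlap" score), the threshold rule that splits problems into an "easy" and a "hard" group, the class name `SpecializedEnsemble`, and the toy numbers. In particular, a least-squares line stands in for the symbolic regression models that the work evolves with GP.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b.

    Stand-in for a GP-evolved symbolic regression model (assumption).
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx


class SpecializedEnsemble:
    """One predictor per problem group; the grouping rule is hypothetical."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical split on the feature value
        self.models = {}

    def group_of(self, feature):
        # Assign a problem to a group from its feature value.
        return "easy" if feature < self.threshold else "hard"

    def fit(self, features, errors):
        # Train a separate predictor on the problems of each group.
        for group in ("easy", "hard"):
            pairs = [(f, e) for f, e in zip(features, errors)
                     if self.group_of(f) == group]
            if pairs:
                xs, ys = zip(*pairs)
                self.models[group] = fit_line(xs, ys)
        return self

    def predict(self, feature):
        # Route the unseen problem to its group's predictor.
        a, b = self.models[self.group_of(feature)]
        return a * feature + b


# Made-up training data: feature value and observed classifier error
# for six synthetic problems (illustrative numbers, not results).
train_features = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
train_errors = [0.00, 0.02, 0.04, 0.10, 0.20, 0.30]

ensemble = SpecializedEnsemble(threshold=0.5).fit(train_features, train_errors)
print(ensemble.predict(0.3), ensemble.predict(0.9))
```

In this toy setting each group follows a different trend, so a specialized predictor per group fits the data more closely than a single global line would.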