Automatic learning is a highly multidisciplinary research field that provides methods for extracting high-level synthetic information (i.e., models) from low-level data. Traditionally, science and engineering have relied on models derived from first principles. For example, most techniques used in electrical engineering rely on models derived from Maxwell's equations. These models are verified using both appropriate measurements (data) and classical system identification techniques. However, in many real-world applications the underlying first principles are unknown, or the system to be modeled is so complex that this approach is intractable. At the same time, more and more data are becoming available, either collected directly from such systems or generated through computer-based simulation. In such circumstances, automatic learning can be used to derive effective models.
The work described in this thesis concerns the application of automatic learning to data analysis, in other words, to the extraction of previously unknown, potentially useful, and ultimately comprehensible information from data sets. We first provide a general framework for reformulating a broad and apparently diverse collection of automatic learning models, most of which have been proposed in recent years. Thanks to this framework, we have been able to present in a consistent way a representative subset of existing models proposed in different fields such as artificial intelligence, computer science, and statistics. This allowed us to highlight the significant differences between them while masking the irrelevant ones, so as to gain a deeper understanding of these models. The generic problems encountered in automatic learning, such as the bias-variance tradeoff, could also be discussed from this perspective.
In a second step, we developed a novel one-dimensional (one input, one output) model, called the Hinges model.
We then developed two multidimensional (multiple inputs, multiple outputs) extensions of it, called the ORTHO and OBLIQUE models. The one-dimensional Hinges model combines a nonparametric model (a so-called scatterplot smoother, used to provide a first approximation of the noise in the data) with piecewise polynomial models (i.e., hinges, used to provide a closed-form approximation of the underlying data). This combination yields a computationally very efficient learning algorithm able to produce useful models. Both piecewise linear and piecewise cubic hinges have been considered in this context, giving rise to the Linear Hinges and the Smoothed Hinges models, respectively.
The multidimensional ORTHO model is a simple additive model whose main strengths are its interpretability and its computational efficiency. ORTHO models can therefore be used in an interactive trial-and-error fashion to discover interesting information contained in a database and to gain physical insight into a problem. The OBLIQUE model is a full projection pursuit model able to provide even more accurate results. The main price to pay for this improvement is an increase in CPU time at the learning stage, together with a certain reduction in interpretability. The complementary nature of the two models allows one to use them together in a toolbox fashion for data mining in the context of real problems, offering interpretability, the capability to identify the input variables that most strongly influence the output, and modeling flexibility.
We have applied these methods to both synthetic and real-life problems. We used the synthetic problems as a workbench to understand, improve, and assess performance. The real-life problems allowed us to evaluate how useful the proposed models are in practice, in particular in power system security assessment, where we could use our own physical understanding to verify that the obtained models are indeed physically sound and interpretable.
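
As a rough illustration of the one-dimensional idea described above (a smoother giving a first estimate of the noise, followed by a closed-form piecewise linear fit), the following Python sketch may help. It is not the thesis's Hinges algorithm: the moving-average smoother, the fixed knot location, the least-squares fit, and all function names (`moving_average`, `hinge_basis`, `fit_linear_hinges`, `predict`) are assumptions chosen for illustration only.

```python
import numpy as np


def moving_average(y, window):
    """Crude scatterplot smoother: centered moving average (odd window),
    with the ends padded by repeating the edge values."""
    pad = window // 2
    padded = np.pad(y, pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")


def hinge_basis(x, knots):
    """Design matrix with columns 1, x, and max(0, x - k) for each knot k."""
    cols = [np.ones_like(x), x] + [np.maximum(0.0, x - k) for k in knots]
    return np.column_stack(cols)


def fit_linear_hinges(x, y, knots):
    """Least-squares fit of a continuous piecewise linear model.
    The knots are given here; choosing them is outside this sketch."""
    coef, *_ = np.linalg.lstsq(hinge_basis(x, knots), y, rcond=None)
    return coef


def predict(x, knots, coef):
    """Evaluate the fitted piecewise linear model at x."""
    return hinge_basis(x, knots) @ coef


# Synthetic one-input, one-output data (hypothetical example).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 201)
truth = np.abs(x - 0.5)
y = truth + rng.normal(scale=0.05, size=x.size)

# Step 1: the smoother residuals give a first estimate of the noise level.
smooth = moving_average(y, window=11)
noise_sd = np.std(y - smooth)

# Step 2: closed-form piecewise linear approximation with one assumed knot.
knots = [0.5]
coef = fit_linear_hinges(x, y, knots)
fitted = predict(x, knots, coef)

# Noise-free check: |x - 0.5| = 0.5 - x + 2 * max(0, x - 0.5).
coef_exact = fit_linear_hinges(x, truth, knots)
```

The fitted model is a short list of coefficients, one intercept, one slope, and one slope change per knot, which is what makes a model of this kind easy to inspect. A piecewise cubic variant would need a different basis and is not covered by this sketch.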
Universidad Pontificia Comillas, Madrid (Spain)
29 October 1999