A piecewise linear regression model ensemble for large-scale curve fitting

S. Moreno, E.F. Sánchez-Úbeda

Algorithms Vol. 17, no. 4, pp. 147-1 - 147-27


The Linear Hinges Model (LHM) is an efficient approach to flexible and robust one-dimensional curve fitting under stringent high-noise conditions. However, it was originally designed to run on a single-core processor with access to the whole input dataset. The surge in data volumes, coupled with the spread of parallel hardware architectures and specialised frameworks, has increased both the interest in and the need for new algorithms able to handle large-scale datasets, as well as techniques for adapting traditional machine learning algorithms to this new paradigm. This paper presents several ensemble alternatives, based on model selection and model combination, that obtain a continuous piecewise linear regression model from large-scale datasets using the learning algorithm of the LHM. Our empirical tests show that model combination outperforms model selection and that these methods can provide better results in terms of bias, variance, and execution time than the original algorithm executed over the entire dataset.
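The difference between model selection and model combination can be illustrated with a minimal sketch. It does not implement the LHM learning algorithm: as a stand-in, each partition is fitted by ordinary least squares on a hinge basis with fixed, user-supplied knots. The function names (`hinge_design`, `fit_pwl`, `predict_pwl`, `ensemble_fit`), the random partitioning, and the validation-based selection rule are all assumptions made for illustration, not the paper's procedure.

```python
import numpy as np

def hinge_design(x, knots):
    # Columns: intercept, x, and one hinge max(0, x - k) per knot.
    # Any linear combination of these columns is continuous and piecewise linear.
    x = np.asarray(x, dtype=float)
    hinges = np.maximum(0.0, x[:, None] - knots[None, :])
    return np.column_stack([np.ones_like(x), x, hinges])

def fit_pwl(x, y, knots):
    # Stand-in base learner (NOT the LHM algorithm): least squares with fixed knots.
    coef, *_ = np.linalg.lstsq(hinge_design(x, knots), y, rcond=None)
    return coef

def predict_pwl(coef, x, knots):
    return hinge_design(x, knots) @ coef

def ensemble_fit(x, y, knots, n_parts, x_val, y_val, seed=0):
    # Split the data into random disjoint partitions; each fit is independent,
    # so in a real setting the fits could run in parallel.
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(x)), n_parts)
    coefs = np.array([fit_pwl(x[idx], y[idx], knots) for idx in parts])

    # Model selection (assumed rule): keep the partition model with the
    # lowest mean squared error on a held-out validation set.
    val_mse = np.array(
        [np.mean((predict_pwl(c, x_val, knots) - y_val) ** 2) for c in coefs]
    )
    selected = coefs[np.argmin(val_mse)]

    # Model combination (assumed rule): average the partition models. With shared
    # knots, averaging coefficients equals averaging predictions, and the result
    # is still a single continuous piecewise linear model.
    combined = coefs.mean(axis=0)
    return selected, combined
```

Sharing the knots across partitions keeps the combined model in the same family as its members. A learner that places knots adaptively would need a different combination step.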

English layman's summary:

This article proposes several ensemble methods to obtain a piecewise linear regression model in a big data context. Tests show that model combination outperforms model selection, and that these methods can give better bias, variance, and execution time than running the original algorithm on the entire dataset.


Keywords: one-dimensional piecewise regression; non-linear regression; curve fitting; ensemble model; model selection; model combination; model parallelism

JCR Impact Factor and WoS quartile: 2.300 - Q3 (2022)

DOI reference: https://doi.org/10.3390/a17040147

Published on paper: April 2024.

Published on-line: March 2024.

