Paper information

Data augmentation through multivariate scenario forecasting in data centers using generative adversarial networks

J. Pérez, P. Arroba, J.M. Moya

Applied Intelligence, Vol. 53, no. 2, pp. 1469-1486

Summary:

The Cloud paradigm is at a critical point in which existing energy-efficiency techniques are reaching a plateau, while the demand for computing resources at Data Center facilities continues to increase exponentially. The main challenge in achieving a global energy efficiency strategy based on Artificial Intelligence is that massive amounts of data are needed to feed the algorithms. This paper proposes a time-series data augmentation methodology based on synthetic scenario forecasting within the Data Center. For this purpose, we use a powerful class of generative algorithms: Generative Adversarial Networks (GANs). Specifically, our work combines the disciplines of GAN-based data augmentation and scenario forecasting, filling the gap in the generation of synthetic data in Data Centers. Furthermore, we propose a methodology to increase the variability and heterogeneity of the generated data by introducing on-demand anomalies without additional effort or expert knowledge. We also suggest the use of Kullback-Leibler Divergence and Mean Squared Error as new metrics in the validation of synthetic time series generation, as they provide a better overall comparison of multivariate data distributions. We validate our approach using real data collected in an operating Data Center, successfully generating synthetic data helpful for prediction and optimization models. Our research will help optimize the energy consumed in Data Centers, although the proposed methodology can be applied to any similar time-series problem.
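
The two validation metrics named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kl_divergence`, `mse`, `compare`), the histogram-based density estimate, the bin count, and the smoothing constant `eps` are all assumptions made for illustration, since the summary does not say how the metrics are computed.

```python
import numpy as np


def kl_divergence(real, synthetic, bins=20, eps=1e-10):
    """Histogram-based estimate of KL(real || synthetic) for one variable.

    Assumption: densities are approximated with shared-range histograms;
    eps avoids log(0) and division by zero in empty bins.
    """
    lo = min(real.min(), synthetic.min())
    hi = max(real.max(), synthetic.max())
    p, _ = np.histogram(real, bins=bins, range=(lo, hi))
    q, _ = np.histogram(synthetic, bins=bins, range=(lo, hi))
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))


def mse(real, synthetic):
    """Mean squared error between two arrays of identical shape."""
    return float(np.mean((real - synthetic) ** 2))


def compare(real, synthetic):
    """Compare two multivariate series of shape (timesteps, variables).

    Returns one KL estimate per variable plus the overall MSE.
    """
    kl = [
        kl_divergence(real[:, j], synthetic[:, j])
        for j in range(real.shape[1])
    ]
    return {"kl_per_variable": kl, "mse": mse(real, synthetic)}
```

Under these assumptions, `compare(real, synthetic)` yields values near zero when the synthetic series closely matches the real one, and larger values as the two distributions drift apart.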


English layman's summary:

This paper proposes a time-series data augmentation methodology based on synthetic scenario forecasting using Generative Adversarial Networks (GANs), which will help improve Data Center optimization models, enabling a more sustainable and environmentally friendly future.


Keywords: Data Augmentation, Sensor Data, Data Center, Generative Adversarial Networks, Synthetic Data, Scenario Forecasting


JCR Impact Factor and WoS quartile: 5.300 - Q2 (2022)

DOI reference: https://doi.org/10.1007/s10489-022-03557-6

Published in print: January 2023.

Published online: April 2022.



Citation:
J. Pérez, P. Arroba, J.M. Moya, Data augmentation through multivariate scenario forecasting in data centers using generative adversarial networks. Applied Intelligence. Vol. 53, no. 2, pp. 1469-1486, January 2023. [Online: April 2022]


Research topics:
  • Data analytics
