Doctoral Thesis Information

Efficiently transferring deep reinforcement learning experience to industrial assets

Lucía Güitta López

Supervised by A.J. López López, J. Boal

Universidad Pontificia Comillas. Madrid (Spain)

December 18, 2024

Abstract:

The Fourth Industrial Revolution emphasizes integrating Artificial Intelligence to enhance industrial efficiency, with Deep Reinforcement Learning (DRL) offering solutions for complex sequential decision-making. A critical challenge in DRL is sample efficiency, which can be improved by using virtual environments for agent training. However, transferring virtual learning to real-world applications (sim-to-real) remains a key obstacle. This thesis presents a methodology for efficiently transferring DRL agent experience from virtual environments to real setups, validated using two industrial assets for a pick-and-place robotic manipulator task, focusing on the approach to the targets. The approach avoids proprietary integration by relying on monocular RGB camera inputs, balancing adaptability and computational demands.
The research evaluates four prominent techniques for sim-to-real transfer:
1. Domain Randomization (DR): Training agents in highly variable virtual scenarios improves generalization. Randomizing scene features with this high-level approach enhances performance, but excessive variability reduces success rates. On the other hand, a low-level DR that consists of adding Gaussian noise to images partially bridges the gap between virtual and real environments, increasing zero-shot transfer success rates from 15.8% to 34.1%.
2. Progressive Neural Networks (PNNs): Leveraging lateral connections between "teacher" and "student" networks facilitates knowledge transfer. While the sim-to-sim experiments show an effective transfer of the learned representations, PNNs show partial forgetting in simpler tasks. For the sim-to-real problem, PNNs achieved success rates of 80%-100% in most workspaces with only 60,000 samples, demonstrating few-shot transfer capability.
3. Domain Adaptation (DA): Using an original StyleID-CycleGAN (SICGAN), virtual observations are converted into realistic images, enabling agents to generalize better. DA achieves near-perfect post-training accuracy in a zero-shot transfer and real-world success rates above 85% for most of the workspace, surpassing PNNs in both efficiency and performance without real-world fine-tuning.
4. Semantic Knowledge: Incorporating knowledge graph embeddings into the DRL pipeline with semantic information about the environment reduces training time by up to 60% and improves performance by 15%, offering structured contextual understanding.
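The low-level DR described in item 1 (adding Gaussian noise to images) can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `add_gaussian_noise`, the nested-list image layout, and the value `sigma=10.0` are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import random

def add_gaussian_noise(image, sigma=10.0, rng=None):
    """Return a noisy copy of an 8-bit RGB image.

    image: nested list indexed as [row][col][channel], ints in 0..255.
    sigma: noise standard deviation in pixel units (hypothetical value,
           not taken from the thesis).
    rng:   optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    noisy = []
    for row in image:
        noisy_row = []
        for pixel in row:
            # Add zero-mean Gaussian noise per channel, then clamp to 0..255.
            noisy_row.append([
                min(255, max(0, round(c + rng.gauss(0.0, sigma))))
                for c in pixel
            ])
        noisy.append(noisy_row)
    return noisy

# A 2x2 mid-gray frame standing in for a simulator observation.
frame = [[[128, 128, 128] for _ in range(2)] for _ in range(2)]
noisy_frame = add_gaussian_noise(frame, sigma=10.0, rng=random.Random(0))
```

In a training loop, a perturbation of this kind would typically be applied to each virtual observation before it is passed to the agent, so that the policy does not rely on the noise-free appearance of rendered images.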
The proposed methodology is the result of experimenting with these techniques to optimize sim-to-real transfer for industrial operations. Results highlight that using the SICGAN to translate images from the virtual environment into real-synthetic observations, and then performing a zero-shot transfer with the virtually trained agent, is the most efficient solution, reducing reliance on real-world interactions while maintaining high success rates across the workspace.

Lay summary:

The thesis addresses the transfer from the simulated environment to the real one in Deep Reinforcement Learning. It compares Domain Randomization, Progressive Neural Networks, and Domain Adaptation; the results highlight the StyleID-CycleGAN as the most effective for direct transfer.

Descriptors: Industrial Technology, Computer Vision, Robotics, Artificial Intelligence

Keywords: Deep Reinforcement Learning, Sim-To-Real, Transfer Learning, Domain Randomization, Domain Adaptation, Progressive Neural Networks, Semantic Knowledge, Robotics, Industry 4.0.

Citation:
L. Güitta-López (2024), Efficiently transferring deep reinforcement learning experience to industrial assets. Madrid (Spain).
