
CROP: Towards Distributional-Shift Robust Reinforcement Learning Using Compact Reshaped Observation Processing

Title details

Altmann, Philipp; Ritz, Fabian; Feuchtinger, Leonard; Nüßlein, Jonas; Linnhoff-Popien, Claudia; Phan, Thomy:
CROP: Towards Distributional-Shift Robust Reinforcement Learning Using Compact Reshaped Observation Processing.
In: Elkind, Edith (ed.): Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23). Vienna, Austria: International Joint Conferences on Artificial Intelligence Organization, 2023, pp. 3414-3422.
ISBN 978-1-956792-03-4
DOI: https://doi.org/10.24963/ijcai.2023/380

Full text

Link to full text (external URL)

Project information

Official project title: Innovationszentrum Mobiles Internet (InnoMI)
Project ID: not specified

Project funding: Bayerisches Staatsministerium für Wirtschaft, Infrastruktur, Verkehr und Technologie

Abstract

The safe application of reinforcement learning (RL) requires generalization from limited training data to unseen scenarios. Yet, fulfilling tasks under changing circumstances is a key challenge in RL. Current state-of-the-art approaches for generalization apply data augmentation techniques to increase the diversity of training data. Even though this prevents overfitting to the training environment(s), it hinders policy optimization. Crafting a suitable observation that contains only crucial information has been shown to be a challenging task in itself. To improve data efficiency and generalization capabilities, we propose Compact Reshaped Observation Processing (CROP) to reduce the state information used for policy optimization. By providing only relevant information, overfitting to a specific training layout is precluded and generalization to unseen environments is improved. We formulate three CROPs that can be applied to fully observable observation and action spaces and provide a methodical foundation. We empirically show the improvements of CROP in a distributionally shifted safety gridworld. We furthermore provide benchmark comparisons to full observability and data augmentation in two differently sized procedurally generated mazes.
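The abstract's core idea, feeding the policy a compact, layout-independent view instead of the full grid, can be illustrated with a minimal sketch. This is an assumption-laden toy example (the function name, the fixed-radius local window, and the padding scheme are illustrative choices, not the paper's three CROP formulations):

```python
import numpy as np

def crop_observation(grid, agent_pos, radius=2, pad_value=0):
    """Illustrative sketch: extract a (2*radius+1)^2 window centred on the agent.

    Cells outside the grid are filled with pad_value, so the policy always
    receives a fixed-size view that is independent of the overall layout.
    """
    padded = np.pad(grid, radius, constant_values=pad_value)
    r, c = agent_pos  # indices into the original, unpadded grid
    return padded[r:r + 2 * radius + 1, c:c + 2 * radius + 1]

# Example: a 6x6 gridworld with the agent near the top-left corner.
world = np.arange(36).reshape(6, 6)
view = crop_observation(world, agent_pos=(0, 1), radius=1)
# view is a 3x3 window; out-of-bounds cells above the agent are padded.
```

A wrapper like this could, for instance, be applied before the observation reaches the policy network, so that training on one maze layout does not bake the global layout into the learned features.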

Further details

Publication type: Article in a book
Peer-reviewed contribution: Yes
Additional information: Main Track
Keywords: Deep reinforcement learning; Safety; Robustness
University institutions: Faculties > Faculty of Mathematics, Physics and Computer Science > Institute of Computer Science
Work originated at UBT: No
DDC subject areas: 000 Computer science, information, general works > 004 Computer science
Deposited on: 17 Nov 2025 11:37
Last modified: 17 Nov 2025 11:37
URI: https://eref.uni-bayreuth.de/id/eprint/95258