Attention-Based Recurrence for Multi-Agent Reinforcement Learning under Stochastic Partial Observability

Title information

Phan, Thomy ; Ritz, Fabian ; Altmann, Philipp ; Zorn, Maximilian ; Nüßlein, Jonas ; Kölle, Michael ; Gabor, Thomas ; Linnhoff-Popien, Claudia:
Attention-Based Recurrence for Multi-Agent Reinforcement Learning under Stochastic Partial Observability.
In: Krause, Andreas ; Brunskill, Emma ; Cho, Kyunghyun ; Engelhardt, Barbara ; Sabato, Sivan ; Scarlett, Jonathan (Eds.): Proceedings of the 40th International Conference on Machine Learning. Red Hook, NY: Curran Associates, Inc., 2023, pp. 27840-27853. (Proceedings of Machine Learning Research; 202)

Full text

Link to full text (external URL): Full text

Project information

Official project title: Innovationszentrum Mobiles Internet (InnoMI)
Project ID: Not specified

Project funding: Bayerisches Staatsministerium für Wirtschaft, Infrastruktur, Verkehr und Technologie

Abstract

Stochastic partial observability poses a major challenge for decentralized coordination in multi-agent reinforcement learning but is largely neglected in state-of-the-art research due to a strong focus on state-based centralized training for decentralized execution (CTDE) and benchmarks that lack sufficient stochasticity, such as the StarCraft Multi-Agent Challenge (SMAC). In this paper, we propose Attention-based Embeddings of Recurrence In multi-Agent Learning (AERIAL) to approximate value functions under stochastic partial observability. AERIAL replaces the true state with a learned representation of multi-agent recurrence, considering more accurate information about decentralized agent decisions than state-based CTDE. We then introduce MessySMAC, a modified version of SMAC with stochastic observations and higher variance in initial states, to provide a more general and configurable benchmark regarding stochastic partial observability. We evaluate AERIAL in Dec-Tiger as well as in a variety of SMAC and MessySMAC maps, and compare the results with state-based CTDE. Furthermore, we evaluate the robustness of AERIAL and state-based CTDE against various stochasticity configurations in MessySMAC.
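
Illustrative sketch (not part of this record and not the authors' implementation): the abstract describes replacing the true state with a learned representation built from attention over the agents' recurrent information. A minimal PyTorch rendering of that general idea, under the assumption of per-agent GRU memories and self-attention across agents, might look as follows; all names (AttentionOverRecurrence, n_agents, hidden_dim, etc.) are hypothetical and chosen only for illustration.

import torch
import torch.nn as nn

class AttentionOverRecurrence(nn.Module):
    # Hypothetical sketch: attend over per-agent recurrent states instead of
    # conditioning a centralized value function on the true state.
    def __init__(self, n_agents: int, obs_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.n_agents = n_agents
        self.hidden_dim = hidden_dim
        # Each agent maintains recurrent memory over its local observation stream.
        self.gru = nn.GRUCell(obs_dim, hidden_dim)
        # Self-attention across the agents' hidden states replaces the true state input.
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        # Centralized value head on the pooled, attended joint embedding.
        self.value = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, obs, hidden):
        # obs:    (batch * n_agents, obs_dim)    local observations at the current step
        # hidden: (batch * n_agents, hidden_dim) previous per-agent recurrent states
        new_hidden = self.gru(obs, hidden)
        h = new_hidden.view(-1, self.n_agents, self.hidden_dim)
        attended, _ = self.attn(h, h, h)      # attention across agents, not across time
        joint = attended.mean(dim=1)          # pool agents into one joint embedding
        return self.value(joint), new_hidden  # centralized value estimate, updated memories

# Example usage with assumed shapes: 4 environments, 3 agents, 10-dim observations.
model = AttentionOverRecurrence(n_agents=3, obs_dim=10)
obs = torch.randn(4 * 3, 10)
hidden = torch.zeros(4 * 3, 64)
value, hidden = model(obs, hidden)  # value has shape (4, 1)

In the paper's actual method, such an attended recurrent representation is used during centralized training in place of the true state; the sketch only illustrates that representation swap, not the full training pipeline.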

Further information

Publication type: Article in a book
Peer-reviewed contribution: Yes
Keywords: Multi-Agent Reinforcement Learning; Stochastic Partial Observability; Self-Attention
University institutions: Faculties > Fakultät für Mathematik, Physik und Informatik > Institut für Informatik
Work originated at UBT: No
DDC subject areas: 000 Computer science, information & general works > 004 Computer science
Deposited on: 17 Nov 2025 11:28
Last modified: 17 Nov 2025 11:28
URI: https://eref.uni-bayreuth.de/id/eprint/95257