1. Kölle, M., Erpelding, Y., Ritz, F., Phan, T., Illium, S., and Linnhoff-Popien, C. 2024. Aquarium: A Comprehensive Framework for Exploring Predator-Prey Dynamics through Multi-Agent Reinforcement Learning Algorithms. arXiv preprint arXiv:2401.07056.

Figure: The multi-agent reinforcement learning cycle within the Aquarium environment.

The study of complex interactions using Multi-Agent Reinforcement Learning (MARL), particularly predator-prey dynamics, often requires specialized simulation environments. To streamline research and avoid redundant development efforts, we introduce Aquarium: a versatile, open-source MARL environment specifically designed for investigating predator-prey scenarios and related emergent behaviors.

Key Features of Aquarium:

  • Framework Integration: Built upon and seamlessly integrates with the popular PettingZoo API, allowing researchers to readily apply existing MARL algorithm implementations (e.g., from Stable-Baselines3, RLlib).
  • Physics-Based Movement: Simulates agent movement on a two-dimensional, continuous plane with edge-wrapping boundaries, incorporating basic physics for more realistic interactions.
  • High Customizability: Offers extensive configuration options for:
    • Agent-Environment Interactions: Observation spaces, action spaces, and reward functions can be tailored to specific research questions.
    • Environmental Parameters: Key dynamics like agent speeds, prey reproduction rates, predator starvation mechanisms, sensor ranges, and more are fully adjustable.
  • Visualization & Recording: Includes a resource-efficient visualizer and supports video recording of simulation runs, facilitating qualitative analysis and understanding of agent behaviors.
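Because Aquarium follows the PettingZoo API, training code interacts with it through the standard parallel-environment loop. The sketch below illustrates that loop shape with a trivial stand-in environment (the actual Aquarium class name, constructor arguments such as `n_prey`, and reward values here are assumptions for illustration, not the framework's real interface):

```python
import random

class StubParallelEnv:
    """Minimal stand-in mimicking PettingZoo's parallel API.
    The real Aquarium environment exposes the same reset/step loop,
    but with physics-based movement and configurable rewards."""
    def __init__(self, n_prey=3):
        self.agents = [f"prey_{i}" for i in range(n_prey)] + ["predator_0"]
        self._t = 0

    def reset(self, seed=None):
        self._t = 0
        obs = {a: [0.0, 0.0] for a in self.agents}
        infos = {a: {} for a in self.agents}
        return obs, infos

    def step(self, actions):
        self._t += 1
        done = self._t >= 5  # toy episode length
        obs = {a: [random.random(), random.random()] for a in self.agents}
        # Illustrative rewards only: prey are rewarded for surviving a step.
        rewards = {a: (1.0 if a.startswith("prey") else -1.0) for a in self.agents}
        terminations = {a: done for a in self.agents}
        truncations = {a: False for a in self.agents}
        infos = {a: {} for a in self.agents}
        return obs, rewards, terminations, truncations, infos

env = StubParallelEnv()
obs, infos = env.reset(seed=42)
totals = {a: 0.0 for a in env.agents}
while True:
    # A real agent would map obs -> action; here we act randomly.
    actions = {a: random.choice([0, 1, 2, 3]) for a in env.agents}
    obs, rewards, terms, truncs, infos = env.step(actions)
    for a, r in rewards.items():
        totals[a] += r
    if all(terms.values()) or all(truncs.values()):
        break
```

Any library that speaks this dictionary-keyed protocol (e.g., Stable-Baselines3 via wrappers, or RLlib) can be dropped in without environment-specific glue code.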
Figure: Construction of the agent observation vector.
Figure: Performance metrics (average captures/rewards per prey agent) comparing training strategies.

To demonstrate its capabilities, we conducted preliminary studies using Proximal Policy Optimization (PPO) to train multiple prey agents to evade a predator within Aquarium. Consistent with findings in the existing MARL literature, training agents with individual policies led to suboptimal performance, whereas parameter sharing among prey agents significantly improved coordination, sample efficiency, and overall evasion success. [Kölle et al. 2024]
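The core idea of parameter sharing is that all prey agents hold references to a single set of policy weights, so experience collected by any one agent improves every agent. A minimal sketch of that mechanism (using a toy linear policy with a crude gradient-style update as a stand-in for the paper's full PPO actor, which this does not reproduce):

```python
import numpy as np

class LinearPolicy:
    """Toy linear policy: logits = W @ obs. A stand-in for a PPO
    actor network, used only to illustrate weight sharing."""
    def __init__(self, obs_dim=4, n_actions=3):
        self.W = np.zeros((n_actions, obs_dim))

    def act(self, obs):
        return int(np.argmax(self.W @ obs))

    def update(self, obs, action, advantage, lr=0.1):
        # Crude policy-gradient-style nudge on the chosen action's row.
        self.W[action] += lr * advantage * np.asarray(obs)

# Parameter sharing: every prey agent maps to the SAME policy object.
shared = LinearPolicy()
prey_policies = {f"prey_{i}": shared for i in range(3)}

# An update driven by prey_0's experience...
obs = np.array([1.0, 0.5, -0.5, 0.2])
prey_policies["prey_0"].update(obs, action=1, advantage=2.0)

# ...is immediately reflected in prey_2's behavior, since the
# weights are shared rather than duplicated per agent.
print(prey_policies["prey_2"] is shared)  # True
```

With individual policies, each agent would instead own its own `LinearPolicy` instance and learn only from its own trajectories, which fragments the available experience and slows learning.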