Areas of Research in Multi-Agent Reinforcement Learning
The properties of MARL systems that are key to modeling them, and how those properties branch the field into specific areas of research.
2 min read · Oct 4, 2022
Multi-agent reinforcement learning (MARL) addresses the sequential decision-making problem of multiple autonomous agents operating in a common environment. Each agent aims to optimize its own long-term return by interacting with the environment and with the other agents.
This blog post assumes you are already familiar with questions like: What is RL? What is multi-agent RL? Building on that, it explains how MARL algorithms are categorized.
TRAINING (Control of training and execution)
- Centralized
A central unit makes the decision for each agent at each time step. Policies are updated based on information exchanged between agents during training.
- Decentralized
Each agent makes its own decisions. It performs its own updates and develops an individual policy without using information from other agents.
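As a rough illustration of the two schemes above, here is a minimal sketch assuming tabular value estimates stored in plain dictionaries. The names `centralized_act`, `decentralized_act`, `joint_q` and `local_qs` are hypothetical and chosen only for this example; real systems typically use learned function approximators instead of tables.

```python
from itertools import product

ACTIONS = [0, 1]   # assumed: every agent has the same two actions
N_AGENTS = 2       # assumed: two agents


def centralized_act(joint_q, state):
    """A central unit picks one joint action (one entry per agent)."""
    joint_actions = list(product(ACTIONS, repeat=N_AGENTS))
    # The controller scores whole joint actions against the shared state.
    return max(joint_actions, key=lambda ja: joint_q.get((state, ja), 0.0))


def decentralized_act(local_qs, local_obs):
    """Each agent picks its own action from its own table and observation."""
    return tuple(
        max(ACTIONS, key=lambda a: q.get((obs, a), 0.0))
        for q, obs in zip(local_qs, local_obs)
    )


# Usage with made-up values
joint_q = {("s", (1, 0)): 1.0}
local_qs = [{("o0", 1): 0.5}, {("o1", 0): 0.2}]
central_choice = centralized_act(joint_q, "s")
local_choice = decentralized_act(local_qs, ["o0", "o1"])
```

Note that the centralized table grows with the number of joint actions (here 2 x 2), while each decentralized table only covers a single agent's actions.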
INFORMATION AVAILABLE (What the agents have about other agents)
- Independent Learner
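The agent ignores the existence of other agents. It receives no information about their rewards or actions.
- Joint-Action Learner
The agent observes the actions of the other agents a posteriori, that is, after they have been taken.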
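The difference between the two learner types shows up in what their value tables are keyed on. The following is a minimal sketch for a stateless, repeated two-agent game, assuming a simple running-average update with a fixed learning rate. The class names, `ALPHA`, and the `expected_value` helper are illustrative assumptions, not a specific published algorithm.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value)


class IndependentLearner:
    """Treats other agents as part of the environment."""

    def __init__(self):
        self.q = defaultdict(float)  # keyed by own action only

    def update(self, own_action, reward):
        self.q[own_action] += ALPHA * (reward - self.q[own_action])


class JointActionLearner:
    """Observes the other agent's action after each step."""

    def __init__(self):
        self.q = defaultdict(float)           # keyed by (own, other) action
        self.other_counts = defaultdict(int)  # how often each other-action was seen

    def update(self, own_action, other_action, reward):
        key = (own_action, other_action)
        self.q[key] += ALPHA * (reward - self.q[key])
        self.other_counts[other_action] += 1

    def expected_value(self, own_action):
        """Average the joint values over the observed frequencies of the other agent."""
        total = sum(self.other_counts.values())
        if total == 0:
            return 0.0
        return sum(
            self.q[(own_action, b)] * n / total
            for b, n in list(self.other_counts.items())
        )


# Usage with made-up rewards
il = IndependentLearner()
il.update(own_action=0, reward=1.0)

jal = JointActionLearner()
jal.update(own_action=0, other_action=1, reward=1.0)
jal.update(own_action=0, other_action=0, reward=0.0)
```

The joint-action learner can tell that action 0 paid off only when the other agent played 1, whereas the independent learner sees just a noisy reward for action 0.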
ENVIRONMENT (Conditions of the environment)
- Fully observable
Each agent can access the complete state of the environment, including all sensory information.
- Partially observable
Each agent can observe only its own local information.
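To make the distinction concrete, here is a small sketch assuming a grid world in which the state is just a dictionary of agent positions. The functions `full_observation` and `local_observation`, and the `radius` parameter, are hypothetical names introduced for this example.

```python
def full_observation(positions):
    """Fully observable: an agent sees every agent's position."""
    return dict(positions)


def local_observation(positions, agent, radius=1):
    """Partially observable: an agent sees only what lies within
    `radius` cells of its own position (itself included)."""
    ax, ay = positions[agent]
    return {
        name: (x, y)
        for name, (x, y) in positions.items()
        if abs(x - ax) <= radius and abs(y - ay) <= radius
    }


# Usage with made-up positions
positions = {"a": (0, 0), "b": (1, 1), "c": (5, 5)}
seen_fully = full_observation(positions)
seen_locally = local_observation(positions, "a")
```

In this example agent "a" sees "b" but not "c" under partial observability, so it has to act without knowing where "c" is.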
REWARD STRUCTURE (System Behaviour)
- Cooperative
All agents receive the same reward. They cooperate to achieve a common goal and avoid individual failure.
- Competitive
The rewards of all agents sum to zero. The agents compete against each other, each maximizing its own reward and minimizing that of the others.
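The two reward structures can be sketched as follows, assuming scalar rewards and, for the competitive case, exactly two agents. The helper names `cooperative_rewards`, `zero_sum_rewards` and `is_zero_sum` are made up for this illustration.

```python
def cooperative_rewards(team_reward, n_agents):
    """Cooperative: every agent receives the same team reward."""
    return [team_reward] * n_agents


def zero_sum_rewards(first_agent_reward):
    """Competitive (two agents): one agent's gain is the other's loss."""
    return [first_agent_reward, -first_agent_reward]


def is_zero_sum(rewards, tol=1e-9):
    """Check that the rewards of all agents sum to zero."""
    return abs(sum(rewards)) <= tol


# Usage with made-up values
coop = cooperative_rewards(2.0, n_agents=3)
comp = zero_sum_rewards(1.5)
```

Many practical settings fall between these two extremes, but the zero-sum check above is a quick way to confirm that an environment is purely competitive.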