Areas of Research in Multi-Agent Reinforcement Learning

The properties of a MARL system that are key to modeling it, and how those properties branch into specific areas of research.

Ankur Dhuriya
Oct 4, 2022

Multi-agent reinforcement learning (MARL) addresses the sequential decision-making problem of multiple autonomous agents operating in a common environment. Each agent aims to optimize its own long-term return by interacting with the environment and with the other agents.

This blog post assumes you are already familiar with questions such as: What is RL? What is multi-agent RL? Building on that, it explains how MARL algorithms are categorized.

TRAINING (Control of training and execution)

  1. Centralized
    A central unit makes the decision for every agent at each time step. Policies are updated based on information exchanged between agents during training.
  2. Decentralized
    Each agent makes its own decisions. It performs its own updates and develops an individual policy without using information from other agents.
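
To make the difference concrete, here is a minimal sketch of how actions might be selected in each case. It assumes two agents with three actions each and small hand-written value tables; the names centralized_act, decentralized_act, joint_q and local_qs are made up for this illustration and do not come from any library.

```python
import numpy as np

def centralized_act(joint_q):
    """A central unit picks one joint action for all agents at once.

    joint_q has one axis per agent (assumed layout for this sketch).
    """
    flat_index = int(np.argmax(joint_q))
    return tuple(int(a) for a in np.unravel_index(flat_index, joint_q.shape))

def decentralized_act(local_qs):
    """Each agent picks its own action from its own value estimates only."""
    return tuple(int(np.argmax(q)) for q in local_qs)

# Hypothetical value tables, chosen by hand for the example.
joint_q = np.array([[1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0],
                    [0.0, 0.0, 5.0]])
local_qs = [np.array([0.6, 0.1, 0.3]), np.array([0.2, 0.7, 0.1])]

print(centralized_act(joint_q))     # (2, 2)
print(decentralized_act(local_qs))  # (0, 1)
```

The central unit can find the jointly best pair (2, 2) because it sees values for joint actions, while the decentralized agents each follow their own table and may land on a different pair.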

INFORMATION AVAILABLE (What the agents have about other agents)

  1. Independent Learner
    Ignores the existence of the other agents. It receives no information about their rewards or actions.
  2. Joint-Action Learner
    Observes the actions of the other agents a posteriori, that is, after they have been taken.
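
One way to picture the two learner types is through the shape of the value table each one keeps. The sketch below assumes a stateless two-action game and a simple running-average update; ALPHA, independent_update and joint_action_update are illustrative names, not part of any standard API.

```python
import numpy as np

ALPHA = 0.5  # learning rate, an arbitrary value for this sketch

def independent_update(q, own_action, reward):
    """Independent learner: the table is indexed by the agent's own action only."""
    q = q.copy()
    q[own_action] += ALPHA * (reward - q[own_action])
    return q

def joint_action_update(q, own_action, other_action, reward):
    """Joint-action learner: the table is indexed by (own action, observed other action)."""
    q = q.copy()
    q[own_action, other_action] += ALPHA * (reward - q[own_action, other_action])
    return q

# Same experience, stored two different ways.
q_il = independent_update(np.zeros(2), own_action=1, reward=1.0)
q_jal = joint_action_update(np.zeros((2, 2)), own_action=1, other_action=0, reward=1.0)

print(q_il)   # one value per own action
print(q_jal)  # one value per (own, other) action pair
```

The joint-action table grows with the number of agents and actions, which is the price paid for modeling what the others did.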

ENVIRONMENT (Conditions of the environment)

  1. Fully observable
    The agents can access the complete state of the environment, including all sensory information.
  2. Partially observable
    Each agent can observe only its own local information.
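
As a rough illustration, assume the environment is a small grid. A fully observing agent sees the whole grid, while a partially observing agent sees only a window around its own position. The functions full_observation and local_observation, the radius parameter and the -1 padding value are assumptions made for this sketch.

```python
import numpy as np

def full_observation(grid):
    """Fully observable: the agent receives the entire grid."""
    return grid.copy()

def local_observation(grid, position, radius=1):
    """Partially observable: the agent receives a square window around itself.

    Cells outside the grid are filled with -1 (an arbitrary marker).
    """
    padded = np.pad(grid, radius, constant_values=-1)
    row, col = position
    # After padding, the window centred on (row, col) starts at index (row, col).
    size = 2 * radius + 1
    return padded[row:row + size, col:col + size]

grid = np.arange(16).reshape(4, 4)
obs = local_observation(grid, (0, 0))

print(full_observation(grid).shape)  # (4, 4)
print(obs.shape)                     # (3, 3)
```

With radius=1 the agent in the corner sees only four real cells; everything else in its window is padding.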

REWARD STRUCTURE (System Behaviour)

  1. Cooperative
    All agents receive the same reward. They cooperate to achieve a common goal and avoid individual failure.
  2. Competitive
    The rewards of all agents sum to zero. The agents compete against each other, each maximizing its own reward and minimizing the others'.
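
The two reward structures can be sketched in a few lines. This example assumes a shared team reward for the cooperative case and a two-player game for the competitive case; cooperative_rewards and zero_sum_rewards are hypothetical helper names used only here.

```python
def cooperative_rewards(team_reward, n_agents):
    """Cooperative: every agent receives the same reward."""
    return [team_reward] * n_agents

def zero_sum_rewards(first_agent_gain):
    """Competitive, two-player case: one agent's gain is the other's loss."""
    return [first_agent_gain, -first_agent_gain]

coop = cooperative_rewards(1.0, n_agents=3)
comp = zero_sum_rewards(2.5)

print(coop)       # [1.0, 1.0, 1.0]
print(sum(comp))  # 0.0
```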
