Learning Multi-Robot Coordination through Locality-Based Factorized Multi-Agent Actor-Critic Algorithm

Abstract

In this work, we present a novel cooperative multi-agent reinforcement learning method called \textbf{Loc}ality-based \textbf{Fac}torized \textbf{M}ulti-Agent \textbf{A}ctor-\textbf{C}ritic (Loc-FACMAC). Existing state-of-the-art algorithms, such as FACMAC, rely on global reward information, which may not accurately reflect the quality of individual robots' actions in decentralized systems. We integrate the concept of locality into critic learning, where strongly related robots form partitions during training. Robots within the same partition have a greater impact on each other, leading to more precise policy evaluation. Additionally, we construct a dependency graph to capture the relationships between robots, facilitating the partitioning process. This approach mitigates the curse of dimensionality and prevents robots from using irrelevant information. Our method improves on existing algorithms by focusing on local rewards and leveraging partition-based learning to enhance training efficiency and performance. We evaluate the performance of Loc-FACMAC in three environments: Hallway, Multi-cartpole, and Bounded-Cooperative-Navigation. We explore the impact of partition size on performance and compare the results with baseline MARL algorithms such as LOMAQ, FACMAC, and QMIX. The experiments reveal that, if the locality structure is defined properly, Loc-FACMAC outperforms these baseline algorithms by up to 108\%, indicating that exploiting the locality structure in the actor-critic framework improves MARL performance.
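
To make the idea of partitioning robots from a dependency graph and scoring each partition with a local reward more concrete, the following is a minimal sketch and not the Loc-FACMAC implementation. The abstract does not state the partitioning rule, so this sketch assumes one: robots joined by an edge whose weight meets a threshold fall into the same partition (connected components of the thresholded graph). The names `partition_agents`, `local_rewards`, the edge weights, and the threshold are all hypothetical.

```python
from collections import defaultdict


def partition_agents(n_agents, weighted_edges, threshold):
    """Group agents into partitions.

    Assumed rule (not from the paper): keep only dependency edges whose
    weight is >= threshold, then take connected components.
    """
    parent = list(range(n_agents))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b, weight in weighted_edges:
        if weight >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Attach the larger root index under the smaller one.
                parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = defaultdict(list)
    for agent in range(n_agents):
        groups[find(agent)].append(agent)
    return sorted(groups.values())


def local_rewards(partitions, agent_rewards):
    """Sum individual rewards inside each partition (one value per partition)."""
    return [sum(agent_rewards[a] for a in part) for part in partitions]


# Hypothetical dependency graph over five robots: (robot_a, robot_b, weight).
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.1), (3, 4, 0.7)]
partitions = partition_agents(5, edges, threshold=0.5)
rewards = local_rewards(partitions, [1.0, 0.5, -0.25, 2.0, 0.0])
print(partitions)  # [[0, 1, 2], [3, 4]]
print(rewards)     # [1.25, 2.0]
```

In this toy example the weak edge between robots 2 and 3 is dropped, so each partition's critic signal would depend only on the rewards of its own strongly related members rather than on a single global reward.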