Abstract
Efficient communication can enhance the overall performance of collaborative multi-agent reinforcement learning. A common approach is to share observations through full communication, which incurs significant communication overhead. Existing work attempts to perceive the global state by building teammate models from local information. However, these methods overlook that the uncertainty introduced by prediction can make training difficult. To address this problem, we propose a Demand-aware Customized Multi-Agent Communication (DCMAC) protocol, which uses upper-bound training to obtain the ideal policy. With a demand parsing module, each agent can interpret the gain that sending its local message would bring to a teammate, and can generate customized messages by computing the correlation between demands and local observations with a cross-attention mechanism. Moreover, our method can adapt to the communication resources of agents and accelerate training by drawing on the ideal policy, which is trained with joint observations. Experimental results show that DCMAC significantly outperforms the baseline algorithms in both unconstrained and communication-constrained scenarios.
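To make the cross-attention step concrete, the following is a minimal sketch and not the DCMAC implementation. It assumes that each teammate's parsed demand acts as a query and that the sender's local observation, split into feature tokens, supplies the keys and values. The function names, dimensions, and random weight matrices are hypothetical and stand in for learned parameters.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(vec, weight):
    """Apply a linear map; `weight` is a list of output rows."""
    return [dot(vec, row) for row in weight]


def customized_messages(demands, obs_tokens, w_q, w_k, w_v):
    """Build one message per teammate via scaled dot-product cross-attention.

    Assumption: demands are queries, observation tokens are keys/values.
    """
    keys = [project(t, w_k) for t in obs_tokens]
    values = [project(t, w_v) for t in obs_tokens]
    messages, weights = [], []
    for demand in demands:
        query = project(demand, w_q)
        scale = math.sqrt(len(query))
        # Correlation between this teammate's demand and each observation token.
        attn = softmax([dot(query, k) / scale for k in keys])
        # The message is the attention-weighted sum of the value vectors.
        msg = [sum(a * v[j] for a, v in zip(attn, values))
               for j in range(len(values[0]))]
        messages.append(msg)
        weights.append(attn)
    return messages, weights


def random_matrix(rng, rows, cols):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
demands = random_matrix(rng, 3, 4)     # 3 teammates, demand dimension 4 (hypothetical)
obs_tokens = random_matrix(rng, 5, 6)  # 5 observation tokens, dimension 6 (hypothetical)
w_q = random_matrix(rng, 8, 4)         # stand-ins for learned projections
w_k = random_matrix(rng, 8, 6)
w_v = random_matrix(rng, 2, 6)         # message dimension 2

messages, weights = customized_messages(demands, obs_tokens, w_q, w_k, w_v)
```

Each row of `weights` sums to one, so every teammate receives a different convex combination of the sender's observation features, which is one way to read "customized" in this setting.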