Parseval Regularization for Continual Reinforcement Learning

Abstract

Loss of plasticity, trainability loss, and primacy bias have been identified as issues arising when training deep neural networks on sequences of tasks, all referring to the increased difficulty in training on new tasks. We propose to use Parseval regularization, which maintains orthogonality of weight matrices, to preserve useful optimization properties and improve training in a continual reinforcement learning setting. We show that it provides significant benefits to RL agents on a suite of gridworld, CARL and MetaWorld tasks. We conduct comprehensive ablations to identify the source of its benefits and investigate the effect of certain metrics associated with network trainability, including weight matrix rank, weight norms and policy entropy.
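
As an illustration of what "maintaining orthogonality of weight matrices" can look like in practice, the sketch below computes one commonly used form of an orthogonality penalty, the squared Frobenius norm of W W^T - sI, together with its gradient. The abstract does not state the exact objective, so this form, the function names `parseval_penalty` and `parseval_penalty_grad`, and the `scale` parameter are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def parseval_penalty(W, scale=1.0):
    """Squared Frobenius norm of (W W^T - scale * I).

    Assumed form of the regularizer: it is zero exactly when the rows of W
    are mutually orthogonal with squared norm `scale`.
    """
    gram = W @ W.T
    identity = np.eye(W.shape[0])
    return float(np.sum((gram - scale * identity) ** 2))


def parseval_penalty_grad(W, scale=1.0):
    """Analytic gradient of the penalty w.r.t. W: 4 (W W^T - scale * I) W."""
    residual = W @ W.T - scale * np.eye(W.shape[0])
    return 4.0 * residual @ W


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # A matrix with orthonormal rows incurs (numerically) no penalty.
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
    print("orthogonal matrix penalty:", parseval_penalty(Q))

    # Plain gradient descent on the penalty alone pushes a random
    # matrix toward having orthonormal rows.
    W = 0.5 * rng.standard_normal((3, 5))
    print("penalty before:", parseval_penalty(W))
    for _ in range(200):
        W = W - 0.01 * parseval_penalty_grad(W)
    print("penalty after: ", parseval_penalty(W))
```

In a training loop, a term like this would typically be added to the agent's loss for each regularized layer, multiplied by a coefficient; the coefficient and the choice of layers are hyperparameters that the abstract does not specify.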