Hierarchical Prompt Decision Transformer: Improving Few-Shot Policy Generalization with Global and Adaptive Guidance

  • 2024-12-13 03:31:51
  • Zhe Wang, Haozhu Wang, Yanjun Qi

Abstract

Decision transformers recast reinforcement learning as a conditional sequence generation problem, offering a simple but effective alternative to traditional value or policy-based methods. A recent key development in this area is the integration of prompting in decision transformers to facilitate few-shot policy generalization. However, current methods mainly use static prompt segments to guide rollouts, limiting their ability to provide context-specific guidance. Addressing this, we introduce a hierarchical prompting approach enabled by retrieval augmentation. Our method learns two layers of soft tokens as guiding prompts: (1) global tokens encapsulating task-level information about trajectories, and (2) adaptive tokens that deliver focused, timestep-specific instructions. The adaptive tokens are dynamically retrieved from a curated set of demonstration segments, ensuring context-aware guidance. Experiments across seven benchmark tasks in the MuJoCo and MetaWorld environments demonstrate the proposed approach consistently outperforms all baseline methods, suggesting that hierarchical prompting for decision transformers is an effective strategy to enable few-shot policy generalization.
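
To make the two prompt layers concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that demonstration steps are plain state and action vectors, that the global token can be approximated by averaging all demonstration features, and that the adaptive token can be approximated by averaging the k demonstration steps whose states are nearest to the current state. In the paper both layers are learned soft tokens; the averaging, the Euclidean distance, and every name here (mean_vector, global_token, adaptive_token, build_prompt) are hypothetical stand-ins chosen for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def global_token(demo_states, demo_actions):
    # Assumption: task-level summary = mean of all [state, action] features.
    features = [s + a for s, a in zip(demo_states, demo_actions)]
    return mean_vector(features)


def adaptive_token(query_state, demo_states, demo_actions, k=2):
    # Assumption: timestep-specific guidance = mean of the k demonstration
    # steps whose states are closest to the current (query) state.
    ranked = sorted(
        range(len(demo_states)),
        key=lambda i: euclidean(query_state, demo_states[i]),
    )
    nearest = ranked[:k]
    return mean_vector([demo_states[i] + demo_actions[i] for i in nearest])


def build_prompt(query_state, demo_states, demo_actions, k=2):
    # Hierarchical prompt: one global token followed by one adaptive token.
    # The global token is fixed per task; the adaptive token would be
    # recomputed at every timestep as the query state changes.
    return [
        global_token(demo_states, demo_actions),
        adaptive_token(query_state, demo_states, demo_actions, k),
    ]


# Toy demonstration set: three steps with 2-D states and 1-D actions.
demo_states = [[0.0, 0.0], [1.0, 0.0], [4.0, 4.0]]
demo_actions = [[1.0], [0.0], [-1.0]]

prompt = build_prompt([0.9, 0.1], demo_states, demo_actions, k=2)
print(prompt)
```

In this toy run the adaptive token is built only from the two demonstration steps near the query state, while the global token reflects all three, which is the contrast between context-aware and task-level guidance that the abstract describes. A real system would feed learned embeddings of these tokens to the decision transformer ahead of the rollout context.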