Abstract
Robots need task planning methods to achieve goals that require more than individual actions. Recently, large language models (LLMs) have demonstrated impressive performance in task planning: given a description of the available actions and the goal, an LLM can generate a step-by-step solution. Despite the successes of LLM-based task planning, there is limited research on the security aspects of these systems. In this paper, we develop Robo-Troj, the first multi-trigger backdoor attack for LLM-based task planners, which is the main contribution of this work. As a multi-trigger attack, Robo-Troj is trained to accommodate the diversity of robot application domains. For instance, one can use a unique trigger word, e.g., "herical", to activate a specific malicious behavior, e.g., cutting a hand on a kitchen robot. In addition, we develop an optimization method for selecting the most effective trigger words. By demonstrating the vulnerability of LLM-based planners, we aim to promote the development of secure robot systems.