debug-gym: A Text-Based Environment for Interactive Debugging

Abstract

Large Language Models (LLMs) are increasingly relied upon for coding tasks, yet in most scenarios it is assumed that all relevant information can be either accessed in context or matches their training data. We posit that LLMs can benefit from the ability to interactively explore a codebase to gather the information relevant to their task. To achieve this, we present a textual environment, namely debug-gym, for developing LLM-based agents in an interactive coding setting. Our environment is lightweight and provides a preset of useful tools, such as a Python debugger (pdb), designed to facilitate an LLM-based agent's interactive debugging. Beyond coding and debugging tasks, this approach can be generalized to other tasks that would benefit from information-seeking behavior by an LLM agent.