GREEN-CODE: Learning to Optimize Energy Efficiency in LLM-based Code Generation

Abstract

Large Language Models (LLMs) are becoming integral to daily life, showcasing their vast potential across various Natural Language Processing (NLP) tasks. Beyond NLP, LLMs are increasingly used in software development tasks, such as code completion, modification, bug fixing, and code translation. Software engineers widely use tools like GitHub Copilot and Amazon Q to streamline workflows and automate tasks with high accuracy. While the resource and energy intensity of LLM training is often highlighted, inference can be even more resource-intensive over time, as it is a continuous process with a high number of invocations. Therefore, developing resource-efficient alternatives for LLM inference is crucial for sustainability. This work proposes GREEN-CODE, a framework for energy-aware code generation in LLMs. GREEN-CODE performs dynamic early exit during LLM inference. We train a Reinforcement Learning (RL) agent that learns to balance the trade-offs between accuracy, latency, and energy consumption. Our approach is evaluated on two open-source LLMs, Llama 3.2 3B and OPT 2.7B, using the JavaCorpus and PY150 datasets. Results show that our method reduces energy consumption by 23-50% on average for code generation tasks without significantly affecting accuracy.
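
To make the idea of dynamic early exit concrete, the following is a minimal illustrative sketch, not the GREEN-CODE implementation. It assumes a hypothetical per-layer confidence score for the next token and uses a fixed confidence threshold as a stand-in for the learned RL agent; the names `early_exit_decode`, `threshold_policy`, and the confidence values are all made up for illustration.

```python
from typing import Callable, Sequence


def early_exit_decode(
    layer_confidences: Sequence[float],
    should_exit: Callable[[int, float], bool],
) -> int:
    """Return how many layers are executed before the policy decides to exit.

    layer_confidences[i] is a hypothetical confidence score for the token
    predicted from the hidden state after layer i + 1.
    """
    for layer_idx, confidence in enumerate(layer_confidences, start=1):
        # The policy sees the current depth and confidence and may stop here,
        # skipping all remaining layers for this token.
        if should_exit(layer_idx, confidence):
            return layer_idx
    # No early exit: every layer was executed.
    return len(layer_confidences)


def threshold_policy(threshold: float) -> Callable[[int, float], bool]:
    """Stand-in for a learned agent (assumption): exit once confidence is high enough."""
    return lambda layer_idx, confidence: confidence >= threshold


# Made-up confidences for a 6-layer model on a single token.
confidences = [0.20, 0.45, 0.80, 0.92, 0.97, 0.99]
layers_used = early_exit_decode(confidences, threshold_policy(0.9))
fraction_skipped = 1 - layers_used / len(confidences)
```

In this toy setting, a lower threshold skips more layers (less compute, and presumably less energy) at a higher risk of a wrong token, which is the kind of trade-off the RL agent described above is trained to balance.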