Abstract
Recent advances in graph learning have paved the way for innovative retrieval-augmented generation (RAG) systems that leverage the inherent relational structures in graph data. However, many existing approaches suffer from rigid, fixed settings and significant engineering overhead, limiting their adaptability and scalability. Additionally, the RAG community has largely overlooked the decades of research in the graph database community regarding the efficient retrieval of interesting substructures on large-scale graphs. In this work, we introduce the RAG-on-Graphs Library (RGL), a modular framework that seamlessly integrates the complete RAG pipeline, from efficient graph indexing and dynamic node retrieval to subgraph construction, tokenization, and final generation, into a unified system. RGL addresses key challenges by supporting a variety of graph formats and integrating optimized implementations for essential components, achieving speedups of up to 143x compared to conventional methods. Moreover, its flexible utilities, such as dynamic node filtering, allow for rapid extraction of pertinent subgraphs while reducing token consumption. Our extensive evaluations demonstrate that RGL not only accelerates the prototyping process but also enhances the performance and applicability of graph-based RAG systems across a range of tasks.
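To make the pipeline stages named above concrete, the following is a minimal, dependency-free sketch of node retrieval, subgraph construction, node filtering, and tokenization on a toy graph. It is an illustration only and does not reflect RGL's actual API: every function name, the brute-force cosine retrieval, the breadth-first expansion, the top-n filtering rule, and the text serialization format are assumptions chosen for brevity, and the final generation step is reduced to assembling a prompt string.

```python
import math
from collections import deque

# Hypothetical toy pipeline; none of these names come from RGL.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_seeds(query_vec, node_vecs, k):
    """Node retrieval: brute-force top-k nodes by similarity to the query."""
    ranked = sorted(node_vecs, key=lambda n: cosine(query_vec, node_vecs[n]),
                    reverse=True)
    return ranked[:k]

def expand_subgraph(adj, seeds, hops):
    """Subgraph construction: breadth-first expansion up to `hops` from the seeds."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adj.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen

def filter_nodes(nodes, query_vec, node_vecs, max_nodes):
    """Node filtering: keep the `max_nodes` most query-similar nodes to bound prompt size."""
    ranked = sorted(nodes, key=lambda n: cosine(query_vec, node_vecs[n]),
                    reverse=True)
    return ranked[:max_nodes]

def tokenize_subgraph(nodes, adj, node_text):
    """Tokenization: serialize kept nodes and the undirected edges among them as text."""
    keep = set(nodes)
    lines = [f"[{n}] {node_text[n]}" for n in nodes]
    edges = sorted((u, v) for u in keep for v in adj.get(u, ())
                   if v in keep and u < v)
    lines += [f"{u} -- {v}" for u, v in edges]
    return "\n".join(lines)

def build_prompt(question, context):
    """Generation stand-in: assemble the prompt a language model would receive."""
    return f"Context:\n{context}\n\nQuestion: {question}"

# Toy data: a four-node path graph a - b - c - d with 2-d embeddings.
node_text = {"a": "Graph neural networks", "b": "Retrieval-augmented generation",
             "c": "Graph databases", "d": "Cooking recipes"}
node_vecs = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.6, 0.8], "d": [0.0, 1.0]}
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
query_vec = [1.0, 0.0]

seeds = retrieve_seeds(query_vec, node_vecs, k=1)
nodes = expand_subgraph(adj, seeds, hops=2)
kept = filter_nodes(nodes, query_vec, node_vecs, max_nodes=2)
context = tokenize_subgraph(kept, adj, node_text)
prompt = build_prompt("What is graph-based RAG?", context)
```

In this sketch the filtering step is what bounds token consumption: the two-hop expansion reaches three nodes, but only the two most query-similar ones are serialized into the prompt. A practical system would replace the brute-force scan with an index and the prompt stub with a call to a language model.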