Abstract
Retrieval-Augmented Generation (RAG) systems using large language models (LLMs) often generate inaccurate responses because they retrieve irrelevant or loosely related information. Existing methods, which operate at the document level, fail to filter out such content effectively. We propose ChunkRAG, an LLM-driven chunk-filtering framework that enhances RAG systems by evaluating and filtering retrieved information at the chunk level. Our approach employs semantic chunking to divide documents into coherent sections and uses LLM-based relevance scoring to assess each chunk's alignment with the user's query. By filtering out less pertinent chunks before the generation phase, we significantly reduce hallucinations and improve factual accuracy. Experiments show that our method outperforms existing RAG models, achieving higher accuracy on tasks requiring precise information retrieval. This advancement enhances the reliability of RAG systems, making them particularly beneficial for applications such as fact-checking and multi-hop reasoning.
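
The abstract names three stages (semantic chunking, per-chunk relevance scoring, and filtering before generation) without giving implementation details. The following Python sketch illustrates one way these stages could fit together; it is not the authors' implementation. All function names and both thresholds are hypothetical, word-overlap similarity stands in for sentence embeddings, and overlap_score is a placeholder for the LLM-based relevance scorer.

```python
import re
from typing import Callable, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercased word set; a crude stand-in for a sentence embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1] (assumed similarity measure)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def semantic_chunks(
    document: str,
    similarity: Callable[[str, str], float] = jaccard,
    join_threshold: float = 0.2,  # hypothetical value
) -> List[str]:
    """Group consecutive sentences while they stay similar to the previous one."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    groups: List[List[str]] = []
    for sentence in sentences:
        if groups and similarity(groups[-1][-1], sentence) >= join_threshold:
            groups[-1].append(sentence)  # same topic: extend the current chunk
        else:
            groups.append([sentence])    # topic shift: start a new chunk
    return [" ".join(group) for group in groups]


def overlap_score(query: str, chunk: str) -> float:
    """Fraction of query words found in the chunk.

    Placeholder only: in a real system this would be replaced by a call
    that asks an LLM to rate the chunk's relevance to the query.
    """
    q = tokens(query)
    return len(q & tokens(chunk)) / len(q) if q else 0.0


def filter_chunks(
    query: str,
    chunks: List[str],
    scorer: Callable[[str, str], float] = overlap_score,
    keep_threshold: float = 0.5,  # hypothetical value
) -> List[str]:
    """Keep only chunks whose relevance score reaches the threshold."""
    return [chunk for chunk in chunks if scorer(query, chunk) >= keep_threshold]


if __name__ == "__main__":
    doc = (
        "Paris is the capital of France. "
        "The capital of France hosts the Louvre. "
        "Bananas are rich in potassium."
    )
    chunks = semantic_chunks(doc)
    print(filter_chunks("capital of France", chunks))
```

Passing the scorer as a parameter keeps the filtering step independent of any particular model, so the placeholder can be swapped for an LLM call without touching the chunking or filtering logic. Only the chunks that survive filter_chunks would be passed to the generation phase.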