Abstract
Large language models (LLMs) are susceptible to generating hallucinated information, even when combined with retrieval-augmented generation (RAG). Parallel context extension (PCE) is a line of research that attempts to effectively integrate parallel (unordered) contexts, but it still suffers from hallucinations when adapted to RAG scenarios. In this paper, we propose DePaC (Dehallucinating Parallel Context Extension), which alleviates the hallucination problem with context-aware negative training and information-calibrated aggregation. DePaC is designed to alleviate two types of in-context hallucination: fact fabrication (i.e., LLMs present claims that are not supported by the contexts) and fact omission (i.e., LLMs fail to present claims that can be supported by the contexts). Specifically, (1) for fact fabrication, we apply context-aware negative training, which fine-tunes the LLMs with negative supervision, thus explicitly guiding them to refuse to answer when the contexts are not related to the question; (2) for fact omission, we propose information-calibrated aggregation, which prioritizes context windows with a higher information increment from their contexts. Experimental results on nine RAG tasks demonstrate that DePaC significantly alleviates both types of hallucination and consistently achieves better performance on these tasks.
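As a rough illustration of the aggregation idea, the sketch below selects, among several parallel context windows, the one whose next-token distribution departs most from a context-free distribution. The abstract does not state how the information increment is measured; using KL divergence against a context-free distribution, taking a hard argmax over windows, and the names `kl_divergence` and `aggregate` are all assumptions made for illustration, not the paper's stated formulation.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def aggregate(window_dists, context_free_dist):
    """Return the index of the window with the largest information
    increment, together with the increment of every window.

    Assumption: the increment is KL(with-context || without-context).
    """
    gains = [kl_divergence(p, context_free_dist) for p in window_dists]
    best = max(range(len(gains)), key=gains.__getitem__)
    return best, gains


# Toy next-token distributions over a 4-token vocabulary (made-up numbers).
context_free = [0.25, 0.25, 0.25, 0.25]
windows = [
    [0.25, 0.25, 0.25, 0.25],  # unrelated context: distribution unchanged
    [0.85, 0.05, 0.05, 0.05],  # relevant context: sharp shift
    [0.40, 0.30, 0.20, 0.10],  # weakly relevant context: mild shift
]

best, gains = aggregate(windows, context_free)
print(best, [round(g, 3) for g in gains])
```

In this toy setup, the window whose context leaves the distribution unchanged contributes no increment, so it never wins the selection; a window that sharpens the prediction is preferred.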