Abstract
In many practical applications, large language models (LLMs) need to incorporate new knowledge not present in their pre-training data. The primary methods for this are fine-tuning and retrieval-augmented generation (RAG). While RAG has emerged as the industry standard for knowledge injection, fine-tuning has not yet achieved comparable success. In this paper, we propose a new fine-tuning technique for learning new knowledge and show that it can reach the performance of RAG. The proposed method, which we call prompt distillation, is based on self-distillation. First, we generate question-answer pairs about the new knowledge. Then, we fine-tune a student model on the question-answer pairs to imitate the output distributions of a teacher model, which additionally receives the new knowledge in its prompt. The student model is identical to the teacher, except that it is equipped with a LoRA adapter. This training procedure distills the new knowledge from the teacher's prompt into the student's weights.
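To make the training signal concrete, the following is a minimal sketch of a distribution-matching objective of the kind described above. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not name the loss, so the per-token KL divergence from teacher to student is assumed here, and the helper names (`softmax`, `kl_divergence`, `distillation_loss`) and the toy logits are hypothetical. In an actual setup, the teacher logits would come from the base model prompted with the new knowledge plus the question, and the student logits from the same model with a LoRA adapter prompted with the question only.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps))
        for pi, qi in zip(p, q)
    )


def distillation_loss(teacher_logits, student_logits):
    """Mean per-token KL(teacher || student) over one answer sequence.

    Assumption: KL divergence is the matching criterion; the abstract only
    says the student imitates the teacher's output distributions.
    """
    per_token = [
        kl_divergence(softmax(t), softmax(s))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(per_token) / len(per_token)


if __name__ == "__main__":
    # Toy example: 2 answer tokens, vocabulary of size 3 (hypothetical values).
    teacher = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]  # saw knowledge in prompt
    student = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # question only, untrained
    print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's distributions exactly and positive otherwise, so minimizing it with respect to the adapter parameters alone pushes the knowledge carried by the teacher's prompt into the student's weights.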