A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis

Abstract

While data synthesis and distillation are promising strategies to enhance small language models, current approaches rely heavily on Large Language Models (LLMs), which suffer from high computational costs, environmental inefficiency, and potential biases inherited from monolithic architectures. In contrast, smaller LLMs are more accessible and sustainable, but their individual capabilities often fall short in generating high-quality, diverse, and reliable data. Inspired by collaborative human processes (e.g., peer review), we propose GRA, a framework involving multiple small LLMs that aggregates specialized roles across them to achieve the iterative refinement and quality control typically achieved by a single large LLM. In this collaborative framework, multiple small LLMs assume distinct roles (Generator, Reviewer, and Adjudicator) to simulate a peer-review-inspired data synthesis pipeline. The Generator proposes initial data samples, the Reviewer critiques their quality and diversity, and the Adjudicator resolves conflicts to finalize the output. By decomposing the synthesis process into specialized sub-tasks, collaborative small LLMs can achieve data-level parity with large LLM-based distillation. Through experiments across multiple benchmarks, we demonstrate that GRA-produced data matches or exceeds the quality of outputs from a single large LLM, e.g., Qwen-2.5-72B-Instruct. Our results challenge the necessity of monolithic large models for high-quality data synthesis, advocating instead for strategic coordination of smaller agents. Our datasets, models, and code are publicly available at https://github.com/GX-XinGao/GRA.
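
The Generator, Reviewer, and Adjudicator roles described above can be illustrated with a minimal Python sketch. This is not the released GRA implementation: every name here (`gra_round`, `synthesize`, the `toy_*` functions) is hypothetical, the roles are plain callables standing in for small LLMs, and two details are assumptions the abstract does not state, namely that several Reviewers vote on each candidate and that the Adjudicator is consulted only when their votes disagree.

```python
from typing import Callable, List, Optional

# Hypothetical role signatures; in practice each would wrap a small LLM call.
Generator = Callable[[str], str]                 # seed -> candidate sample
Reviewer = Callable[[str], bool]                 # candidate -> accept?
Adjudicator = Callable[[str, List[bool]], bool]  # candidate, votes -> final verdict


def gra_round(seed: str, generator: Generator, reviewers: List[Reviewer],
              adjudicator: Adjudicator) -> Optional[str]:
    """Run one generate/review/adjudicate round; return the sample or None."""
    candidate = generator(seed)
    votes = [review(candidate) for review in reviewers]
    if all(votes):        # unanimous accept: keep without adjudication
        return candidate
    if not any(votes):    # unanimous reject: drop without adjudication
        return None
    # Assumption: the Adjudicator only resolves conflicting reviews.
    return candidate if adjudicator(candidate, votes) else None


def synthesize(seeds: List[str], generator: Generator, reviewers: List[Reviewer],
               adjudicator: Adjudicator) -> List[str]:
    """Collect every sample that survives review for the given seeds."""
    kept = []
    for seed in seeds:
        sample = gra_round(seed, generator, reviewers, adjudicator)
        if sample is not None:
            kept.append(sample)
    return kept


# Toy stand-ins for the three roles (illustrative rules, not real model behavior).
def toy_generator(seed: str) -> str:
    return f"Explain {seed}."


def toy_reviewer_length(candidate: str) -> bool:
    return len(candidate) >= 15


def toy_reviewer_no_digits(candidate: str) -> bool:
    return not any(ch.isdigit() for ch in candidate)


def toy_adjudicator(candidate: str, votes: List[bool]) -> bool:
    return len(candidate.split()) >= 3


if __name__ == "__main__":
    seeds = ["recursion", "x", "HTTP2 caching", "7"]
    reviewers = [toy_reviewer_length, toy_reviewer_no_digits]
    print(synthesize(seeds, toy_generator, reviewers, toy_adjudicator))
```

With these toy rules, the first seed is accepted unanimously, the second and third trigger adjudication (one rejected, one accepted), and the fourth is rejected unanimously. Keeping the roles as interchangeable callables reflects the idea of decomposing synthesis into specialized sub-tasks, so each role could in principle be backed by a different small model.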