Abstract
Few-shot Chain-of-Thought (CoT) significantly enhances the reasoning capabilities of large language models (LLMs), functioning as a whole to guide these models in generating reasoning steps toward final answers. However, we observe that isolated segments, words, or tokens within CoT demonstrations can unexpectedly disrupt the generation process of LLMs. The model may overly concentrate on certain local information present in the demonstration, introducing irrelevant noise into the reasoning process and potentially leading to incorrect answers. In this paper, we investigate the underlying mechanism of CoT through dynamically tracing and manipulating the inner workings of LLMs at each output step, which demonstrates that tokens exhibiting specific attention characteristics are more likely to induce the model to take things out of context; these tokens directly attend to the hidden states tied with prediction, without substantial integration of non-local information. Building upon these insights, we propose a Few-shot Attention Intervention method (FAI) that dynamically analyzes the attention patterns of demonstrations to accurately identify these tokens and subsequently make targeted adjustments to the attention weights to effectively suppress their distracting effect on LLMs. Comprehensive experiments across multiple benchmarks demonstrate consistent improvements over baseline methods, with a remarkable 5.91% improvement on the AQuA dataset, further highlighting the effectiveness of FAI.
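
To make the idea of an attention intervention concrete, the following is a minimal sketch and not the FAI procedure itself: the abstract does not state the exact identification criterion or adjustment rule. The sketch assumes a single attention matrix whose rows sum to 1, flags demonstration tokens whose own attention mass stays mostly local (a hypothetical stand-in for "without substantial integration of non-local information"), and then scales down and renormalizes the weight that a generation step places on the flagged tokens. The names `flag_local_tokens`, `suppress_attention`, `window`, `threshold`, and `scale` are illustrative assumptions.

```python
from typing import List, Sequence, Set


def flag_local_tokens(
    attn: Sequence[Sequence[float]],
    demo_positions: Sequence[int],
    window: int = 0,
    threshold: float = 0.8,
) -> Set[int]:
    """Flag demonstration tokens whose attention row is mostly local.

    attn[i][j] is the weight token i places on token j (each row sums to 1).
    The locality criterion here is a hypothetical illustration, not the
    criterion used by the paper.
    """
    flagged: Set[int] = set()
    for i in demo_positions:
        lo = max(0, i - window)
        # Attention mass that token i keeps on itself and its `window` predecessors.
        local_mass = sum(attn[i][lo : i + 1])
        if local_mass >= threshold:
            flagged.add(i)
    return flagged


def suppress_attention(
    row: Sequence[float], flagged: Set[int], scale: float = 0.0
) -> List[float]:
    """Scale the weights on flagged positions by `scale`, then renormalize."""
    adjusted = [w * scale if j in flagged else w for j, w in enumerate(row)]
    total = sum(adjusted)
    if total == 0.0:
        # All mass was on flagged positions: leave the row unchanged.
        return list(row)
    return [w / total for w in adjusted]


# Toy causal attention matrix over four tokens; positions 1 and 2 play the
# role of demonstration tokens, position 3 is the current output step.
attn = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.05, 0.05, 0.9, 0.0],
    [0.2, 0.2, 0.4, 0.2],
]
flagged = flag_local_tokens(attn, demo_positions=[1, 2])
new_row = suppress_attention(attn[3], flagged)
```

In this toy setting the intervention removes the output step's weight on any flagged demonstration token and redistributes it proportionally over the remaining positions; a `scale` between 0 and 1 would dampen rather than remove that weight.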