On Verbalized Confidence Scores for LLMs

Abstract

The rise of large language models (LLMs) and their tight integration into our daily life make it essential to dedicate efforts towards their trustworthiness. Uncertainty quantification for LLMs can establish more human trust in their responses, and it also allows LLM agents to make more informed decisions based on each other's uncertainty. To estimate the uncertainty in a response, internal token logits, task-specific proxy models, or sampling of multiple responses are commonly used. This work focuses on asking the LLM itself to verbalize its uncertainty with a confidence score as part of its output tokens, which is a promising approach to prompt- and model-agnostic uncertainty quantification with low overhead. Using an extensive benchmark, we assess the reliability of verbalized confidence scores with respect to different datasets, models, and prompt methods. Our results reveal that the reliability of these scores strongly depends on how the model is asked, but also that it is possible to extract well-calibrated confidence scores with certain prompt methods. We argue that verbalized confidence scores can become a simple but effective and versatile uncertainty quantification method in the future. Our code is available at https://github.com/danielyxyang/llm-verbalized-uq.