Estimating the Self-Consistency of LLMs

2025-09-24 19:00 GMT

arXiv:2509.19489v1 Announce Type: new
Abstract: Systems often repeat the same prompt to large language models (LLMs) and aggregate responses to improve reliability. This short note analyzes an estimator of the self-consistency of LLMs and the tradeoffs it induces under a fixed compute budget $B = mn$, where $m$ is the number of prompts sampled from the task distribution and $n$ is the number of repeated LLM calls per prompt; the resulting analysis favors a rough split $m, n \propto \sqrt{B}$.
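The abstract's setup can be sketched as a simple estimator: for each of $m$ sampled prompts, make $n$ repeated calls and measure the fraction of agreeing response pairs, then average across prompts, with the budget split $m \approx n \approx \sqrt{B}$. The sketch below is illustrative, not the paper's estimator; the `call_llm` callable, the exact-match notion of agreement, and the helper names are assumptions for the example.

```python
import math
import random
from collections import Counter

def pairwise_agreement(responses):
    """Fraction of unordered response pairs that agree exactly.

    This is an unbiased estimate of the probability that two
    independent calls on the same prompt return the same answer.
    """
    n = len(responses)
    counts = Counter(responses)
    # Number of agreeing unordered pairs, over all unordered pairs.
    agree = sum(c * (c - 1) for c in counts.values())
    return agree / (n * (n - 1))

def estimate_self_consistency(prompts, call_llm, budget):
    """Estimate mean self-consistency under a fixed call budget B = m * n.

    `call_llm` is a hypothetical callable mapping a prompt to one
    sampled response. The budget is split roughly as m, n ~ sqrt(B),
    the allocation the note's analysis favors.
    """
    n = max(2, math.isqrt(budget))       # repeated calls per prompt
    m = max(1, budget // n)              # number of prompts sampled
    sampled = random.sample(prompts, min(m, len(prompts)))
    per_prompt = []
    for p in sampled:
        responses = [call_llm(p) for _ in range(n)]
        per_prompt.append(pairwise_agreement(responses))
    return sum(per_prompt) / len(per_prompt)
```

For a deterministic model every pair agrees, so the estimate is 1; noisier samplers drive it toward the base rate of accidental agreement.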