Biases in the Blind Spot: Detecting What LLMs Fail to Mention
arXiv:2602.10117v3 Announce Type: replace Abstract: Large Language Models (LLMs) often provide chain-of-thought (CoT) reasoning traces that appear plausible but may hide internal biases. We call these *unverbalized biases*. Monitoring models via their stated reasoning is therefore unreliable, and existing bias…
