Squish and Release: Exposing Hidden Hallucinations by Making Them Surface as Safety Signals
arXiv:2603.26829v1
Abstract: Language models detect false premises when asked directly but absorb them under conversational pressure, producing authoritative professional output built on errors they already identified. This failure – order-gap hallucination – is invisible to output inspection…
