Better Think Thrice: Learning to Reason Causally with Double Counterfactual Consistency
arXiv:2602.16787v1
Abstract: Despite their strong performance on reasoning benchmarks, large language models (LLMs) have proven brittle when presented with counterfactual questions, suggesting weaknesses in their causal reasoning ability. While recent work has demonstrated that labeled counterfactual tasks…
