Archives AI News

Physics Steering: Causal Control of Cross-Domain Concepts in a Physics Foundation Model

arXiv:2511.20798v1 Announce Type: new Abstract: Recent advances in mechanistic interpretability have revealed that large language models (LLMs) develop internal representations corresponding not only to concrete entities but also to distinct, human-understandable abstract concepts and behaviour. Moreover, these hidden features can be…

Asymmetric Duos: Sidekicks Improve Uncertainty

arXiv:2505.18636v2 Announce Type: replace Abstract: The go-to strategy to apply deep networks in settings where uncertainty informs decisions (ensembling multiple training runs with random initializations) is ill-suited for the extremely large-scale models and practical fine-tuning workflows of today. We introduce a new…

Effects of Initialization Biases on Deep Neural Network Training Dynamics

arXiv:2511.20826v1 Announce Type: new Abstract: Untrained large neural networks, just after random initialization, tend to favour a small subset of classes, assigning high predicted probabilities to these few classes and approximately zero probability to all others. This bias, termed Initial…

Lost in Serialization: Invariance and Generalization of LLM Graph Reasoners

arXiv:2511.10234v2 Announce Type: replace Abstract: While promising, graph reasoners based on Large Language Models (LLMs) lack built-in invariance to symmetries in graph representations. Operating on sequential graph serializations, LLMs can produce different outputs under node reindexing, edge reordering, or formatting…

Pre-train to Gain: Robust Learning Without Clean Labels

arXiv:2511.20844v1 Announce Type: new Abstract: Training deep networks with noisy labels leads to poor generalization and degraded accuracy due to overfitting to label noise. Existing approaches for learning with noisy labels often rely on the availability of a clean subset…