Archives AI News

Best Agent Identification for General Game Playing

arXiv:2507.00451v2 Announce Type: replace Abstract: We present an efficient and generalised procedure to accurately identify the best (or near-best) performing algorithm for each sub-task in a multi-problem domain. Our approach treats this as a set of best arm identification…
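
As a rough illustration of the best arm identification framing the abstract mentions, here is a minimal sketch: each "arm" is a candidate agent for one sub-task, and we pick the empirically best one. The agents, win rates, and budget below are hypothetical; this naive uniform-sampling baseline is not the paper's procedure.

```python
import random

def best_arm(pull_fns, budget):
    """Naive best arm identification: sample each arm uniformly and
    return the index with the highest empirical mean reward."""
    n = len(pull_fns)
    means = []
    for pull in pull_fns:
        samples = [pull() for _ in range(budget // n)]
        means.append(sum(samples) / len(samples))
    return max(range(n), key=lambda i: means[i])

# Hypothetical sub-task: three candidate agents with win rates 0.3/0.5/0.7.
random.seed(0)
agents = [lambda p=p: 1.0 if random.random() < p else 0.0
          for p in (0.3, 0.5, 0.7)]
print(best_arm(agents, budget=3000))  # identifies the 0.7 arm
```

Practical best-arm algorithms such as successive elimination allocate samples adaptively rather than uniformly, dropping clearly weak arms early.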

ParamBoost: Gradient Boosted Piecewise Cubic Polynomials

arXiv:2604.18864v1 Announce Type: new Abstract: Generalized Additive Models (GAMs) can be used to create non-linear glass-box (i.e. explicitly interpretable) models, where the predictive function is fully observable over the complete input space. However, glass-box interpretability itself does not allow for…
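
For context on the glass-box property the abstract refers to: a GAM's prediction decomposes into per-feature shape functions, each of which can be inspected in isolation over its whole input range. A toy sketch with hand-written shape functions (purely illustrative, not ParamBoost's boosted piecewise cubics):

```python
def make_gam(shape_fns, intercept=0.0):
    """Glass-box additive model: the prediction is a sum of per-feature
    shape functions, each inspectable over its whole input range."""
    def predict(x):
        return intercept + sum(f(xi) for f, xi in zip(shape_fns, x))
    return predict

# Hand-written shape functions, purely for illustration.
f_age = lambda a: 0.1 * a if a < 40 else 4.0   # piecewise in age
f_income = lambda m: 0.5 * m                   # linear in income
model = make_gam([f_age, f_income], intercept=1.0)
print(model((30, 2.0)))  # 1.0 + 3.0 + 1.0 = 5.0
```

Because each shape function is a plain one-dimensional curve, the full predictive function can be plotted and audited feature by feature.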

How does the optimizer implicitly bias the model merging loss landscape?

arXiv:2510.04686v2 Announce Type: replace Abstract: Model merging combines independent solutions with different capabilities into a single one while maintaining the same inference cost. Two popular approaches are linear interpolation, which simply averages multiple model weights, and task arithmetic, which combines…
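
The two merging schemes named in the abstract are easy to state concretely. A minimal sketch over toy weight dictionaries (real merging operates on full model state dicts):

```python
def interpolate(w_a, w_b, alpha=0.5):
    """Linear interpolation: entrywise weighted average of two weight dicts."""
    return {k: alpha * w_a[k] + (1 - alpha) * w_b[k] for k in w_a}

def task_arithmetic(w_base, finetuned, scale=1.0):
    """Task arithmetic: add scaled task vectors (finetuned minus base)
    back onto the base weights."""
    merged = dict(w_base)
    for w_ft in finetuned:
        for k in merged:
            merged[k] += scale * (w_ft[k] - w_base[k])
    return merged

base = {"w": 1.0}
a, b = {"w": 3.0}, {"w": 5.0}
print(interpolate(a, b))              # {'w': 4.0}
print(task_arithmetic(base, [a, b]))  # 1 + (3-1) + (5-1) = {'w': 7.0}
```

Either way, the merged model has the same shape (and hence the same inference cost) as any single input model.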

Subgraph Concept Networks: Concept Levels in Graph Classification

arXiv:2604.18868v1 Announce Type: new Abstract: The reasoning process of Graph Neural Networks is complex and considered opaque, limiting trust in their predictions. To alleviate this issue, prior work has proposed concept-based explanations, extracted from clusters in the model’s node embeddings.…
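
The concept-extraction setup the abstract builds on clusters node embeddings and reads each cluster as a candidate concept. A toy sketch with a tiny k-means on synthetic 2-D embeddings (illustrative of the prior work it cites, not the paper's own method):

```python
def kmeans(points, k, iters=10):
    """Tiny k-means over embedding tuples; each final cluster can be
    read as one candidate 'concept'."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: sum((a - b) ** 2
                    for a, b in zip(p, centroids[i])))
            clusters[j].append(p)
        centroids = [tuple(sum(col) / len(cl) for col in zip(*cl))
                     if cl else centroids[i]
                     for i, cl in enumerate(clusters)]
    return centroids, clusters

# Hypothetical 2-D node embeddings from two distinct structural roles.
emb = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]
cents, groups = kmeans(emb, k=2)
print([len(g) for g in groups])  # each role lands in its own cluster
```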

AC-SINDy: Compositional Sparse Identification of Nonlinear Dynamics

arXiv:2604.18889v1 Announce Type: new Abstract: We present AC-SINDy, a compositional extension of the Sparse Identification of Nonlinear Dynamics (SINDy) framework that replaces explicit feature libraries with a structured representation based on arithmetic circuits. Rather than enumerating candidate basis functions, the…
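
For reference, the classical SINDy recipe that the paper extends fits a sparse linear combination of library functions to observed derivatives. A minimal one-pass sketch (least squares plus hard thresholding, on toy data from dx/dt = -2x with exact derivatives; AC-SINDy replaces the explicit library below with arithmetic circuits):

```python
def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting (small systems)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def sindy_fit(xs, dxs, library, thresh=0.1):
    """One pass of least squares plus hard thresholding over a library."""
    Theta = [[f(x) for f in library] for x in xs]
    n = len(library)
    A = [[sum(row[i] * row[j] for row in Theta) for j in range(n)]
         for i in range(n)]
    b = [sum(row[i] * dx for row, dx in zip(Theta, dxs)) for i in range(n)]
    return [c if abs(c) > thresh else 0.0 for c in solve(A, b)]

# Toy data from dx/dt = -2x, with exact derivatives.
xs = [0.2, 0.5, 1.0, 1.5, 2.0]
dxs = [-2 * x for x in xs]
lib = [lambda x: x, lambda x: x ** 2, lambda x: x ** 3]
print(sindy_fit(xs, dxs, lib))  # recovers roughly [-2.0, 0.0, 0.0]
```

The full sequential thresholded least squares (STLSQ) algorithm repeats the fit-then-threshold step until the support stabilizes.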

MapPFN: Learning Causal Perturbation Maps in Context

arXiv:2601.21092v2 Announce Type: replace Abstract: Planning effective interventions in biological systems requires treatment-effect models that adapt to unseen biological contexts by identifying their specific underlying mechanisms. Yet single-cell perturbation datasets span only a handful of biological contexts, and existing methods…

Harmful Intent as a Geometrically Recoverable Feature of LLM Residual Streams

arXiv:2604.18901v1 Announce Type: new Abstract: Harmful intent is geometrically recoverable from large language model residual streams: as a linear direction in most layers, and as angular deviation in layers where projection methods fail. Across 12 models spanning four architectural families…
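
A common way to recover a linear feature like this is a difference-of-means direction between class activations. The sketch below uses synthetic 3-D "activations" and is illustrative only, not the paper's exact procedure:

```python
def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def probe_direction(pos, neg):
    """Difference-of-means direction between two activation classes."""
    mp, mn = mean(pos), mean(neg)
    return [a - b for a, b in zip(mp, mn)]

# Synthetic 3-D 'residual stream' activations for the two classes.
harmful = [[2.0, 0.1, 0.0], [1.8, -0.1, 0.2]]
benign = [[-2.1, 0.0, 0.1], [-1.9, 0.2, -0.1]]
d = probe_direction(harmful, benign)
score = lambda v: dot(v, d)  # project onto the recovered direction
print(score(harmful[0]) > 0 > score(benign[0]))  # classes separate
```

The abstract's point about angular deviation concerns layers where this kind of straight projection fails, so the sign of the score alone is not always enough.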

Towards Understanding the Robustness of Sparse Autoencoders

arXiv:2604.18756v1 Announce Type: new Abstract: Large Language Models (LLMs) remain vulnerable to optimization-based jailbreak attacks that exploit internal gradient structure. While Sparse Autoencoders (SAEs) are widely used for interpretability, their robustness implications remain underexplored. We present a study of integrating…
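
For background, the standard SAE used in interpretability encodes an activation into an overcomplete ReLU code and decodes it linearly. A hand-weighted sketch (weights chosen for illustration, not trained; the negative encoder bias is what induces sparsity):

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sae_forward(W_enc, b_enc, W_dec, x):
    """Sparse autoencoder forward pass: overcomplete ReLU code,
    linear decode. Weights here are hand-picked, not trained."""
    h = relu([a + b for a, b in zip(matvec(W_enc, x), b_enc)])
    return h, matvec(W_dec, h)

# 2-D activation, 4 latent features; negative bias induces sparsity.
W_enc = [[1, 0], [0, 1], [-1, 0], [0, -1]]
b_enc = [-0.5] * 4
W_dec = [[1, 0, -1, 0], [0, 1, 0, -1]]
x = [2.0, -1.0]
h, x_hat = sae_forward(W_enc, b_enc, W_dec, x)
print(h)      # only two of the four latent features fire
print(x_hat)  # reconstruction shrunk slightly by the bias
```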