Archives AI News

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

arXiv:2506.05233v2 Announce Type: replace Abstract: Sequence modeling is currently dominated by causal transformer architectures that use softmax self-attention. Although widely adopted, transformers require scaling memory and compute linearly during inference. A recent stream of work linearized the softmax operation, resulting…

June 5, 2026

Large Language Models Hack Rewards, and Society

arXiv:2606.04075v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a dominant post-training paradigm, enabling large language models (LLMs) to learn from rewards. We observe that societal regulations are structurally similar to reward functions. They define measurable outcomes, thresholds, and…

June 5, 2026

You Only Train Once: Differentiable Subset Selection for Omics Data

arXiv:2512.17678v2 Announce Type: replace Abstract: Selecting compact and informative gene subsets from single-cell transcriptomic data is essential for biomarker discovery, improving interpretability, and cost-effective profiling. However, most existing feature selection approaches either operate as multi-stage pipelines or rely on post…

June 5, 2026

Stein Kernelized Molecular Dynamics for Active Learning of Interatomic Potentials

arXiv:2606.04100v1 Announce Type: new Abstract: Machine learning interatomic potentials (MLIPs) enable efficient and accurate atomistic simulations but depend critically on the quality and diversity of the training data. We introduce Stein kernelized molecular dynamics (SKMD), an enhanced sampling method that…

June 5, 2026

Making Expert Reasoning Learnable with Self-Distillation

arXiv:2602.02405v2 Announce Type: replace Abstract: Improving the reasoning capabilities of large language models (LLMs) typically relies either on the model’s ability to sample a correct solution to be reinforced or the existence of a stronger model able to solve the…

June 5, 2026

Building The Ph(ysical)AI Layer Of Machine Intelligence

arXiv:2606.04106v1 Announce Type: new Abstract: Foundation models achieve generalization through massive-scale training on diverse data, but have limitations with transfer to truly unseen domains without paired training data. We propose principle-driven foundation models that encode signal-theoretic principles (Fourier decomposition, energy…

June 5, 2026

Startup helps retailers track their products in real-time

Using technology invented at MIT, Cartesian’s system for locating objects could also find uses in manufacturing, logistics, and robotics.

June 5, 2026

Bayesian learning for the stochastic shortest path problem

arXiv:2606.04845v1 Announce Type: cross Abstract: Sequential decision-making problems are often modelled as a Markov decision process (MDP). We focus on the stochastic shortest path (SSP) problem, which is an infinite-horizon undiscounted MDP with absorbing terminal states. We develop a Bayesian…

June 5, 2026

Explaining a probabilistic prediction on the simplex with Shapley compositions

arXiv:2408.01382v3 Announce Type: replace Abstract: Originating in game theory, Shapley values are widely used for explaining a machine learning model’s prediction by quantifying the contribution of each feature’s value to the prediction. This requires a scalar prediction as in binary…

June 5, 2026

Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text

arXiv:2605.29076v2 Announce Type: replace-cross Abstract: LLMs have advanced text classification, yet existing paradigms face a trade-off: supervised (label only) fine-tuning is scalable but offers limited reasoning on complex text and lacks broader model transparency, while discrete prompt optimization offers human-readable…

June 5, 2026