Archives: AI News

dMX: Differentiable Mixed-Precision Assignment for Low-Precision Floating-Point Formats

arXiv:2606.04115v1 Announce Type: new Abstract: Quantizing large language models (LLMs) to low-precision floating-point representations is central to efficient deployment, yet applying a single bit-width uniformly across all layers is sub-optimal in terms of both performance and accuracy. This work introduces…

June 5, 2026

FoeGlass: Simple In-Context Learning Is Enough for Red Teaming Audio Deepfake Detectors

arXiv:2606.05101v1 Announce Type: cross Abstract: Audio deepfake detection (ADD) models are critical for countering the malicious use of text-to-speech (TTS) models. Evaluating and strengthening ADD models requires developing datasets that span the space of generated audio and highlight high-error regions.…

June 5, 2026

Making Expert Reasoning Learnable with Self-Distillation

arXiv:2602.02405v2 Announce Type: replace Abstract: Improving the reasoning capabilities of large language models (LLMs) typically relies either on the model’s ability to sample a correct solution to be reinforced or the existence of a stronger model able to solve the…

June 5, 2026

Scalable Uncertainty Quantification for Extreme Weather Forecasting via Empirical Neural Tangent Kernels

arXiv:2606.02886v2 Announce Type: replace Abstract: Deep learning weather models now match numerical weather prediction accuracy while running orders of magnitude faster, but produce deterministic forecasts without uncertainty estimates, a critical gap for high-stakes decisions during extreme weather events. This paper…

June 5, 2026

Adaptive Patching Is Harder Than It Looks For Time-Series Forecasting

arXiv:2606.04074v1 Announce Type: new Abstract: Adaptive patching is a recent and compelling proposal for time-series Transformers: allocate finer patches where the sequence looks locally informative. This paper asks under what conditions a content-adaptive patching operator should outperform a tuned uniform…

June 5, 2026

Stationarity-Aware Retrieval-Augmented Time Series Forecasting

arXiv:2606.04135v1 Announce Type: new Abstract: Time series forecasting relies on historical patterns, but real-world series often exhibit non-stationarity and regime shifts that challenge fully parametric forecasters. Inspired by Retrieval-Augmented Generation (RAG), recent work augments forecasters by retrieving relevant historical segments…

June 5, 2026

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

arXiv:2506.05233v2 Announce Type: replace Abstract: Sequence modeling is currently dominated by causal transformer architectures that use softmax self-attention. Although widely adopted, transformers require scaling memory and compute linearly during inference. A recent stream of work linearized the softmax operation, resulting…

June 5, 2026

Large Language Models Hack Rewards, and Society

arXiv:2606.04075v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a dominant post-training paradigm, enabling large language models (LLMs) to learn from rewards. We observe that societal regulations are structurally similar to reward functions. They define measurable outcomes, thresholds, and…

June 5, 2026

You Only Train Once: Differentiable Subset Selection for Omics Data

arXiv:2512.17678v2 Announce Type: replace Abstract: Selecting compact and informative gene subsets from single-cell transcriptomic data is essential for biomarker discovery, improving interpretability, and cost-effective profiling. However, most existing feature selection approaches either operate as multi-stage pipelines or rely on post…

June 5, 2026

Stein Kernelized Molecular Dynamics for Active Learning of Interatomic Potentials

arXiv:2606.04100v1 Announce Type: new Abstract: Machine learning interatomic potentials (MLIPs) enable efficient and accurate atomistic simulations but depend critically on the quality and diversity of the training data. We introduce Stein kernelized molecular dynamics (SKMD), an enhanced sampling method that…

June 5, 2026