Archives AI News

ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping

arXiv:2603.10088v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) are emerging as a promising alternative to autoregressive models (ARMs) due to their ability to capture bidirectional context and the potential for parallel generation. Despite the advantages, dLLM inference remains…

March 12, 2026

KV Cache Transform Coding for Compact Storage in LLM Inference

arXiv:2511.01815v2 Announce Type: replace-cross Abstract: Serving large language models (LLMs) at scale necessitates efficient key-value (KV) cache management. KV caches can be reused across conversation turns via shared-prefix prompts that are common in iterative code editing and chat. However, stale…

March 12, 2026

A Survey of Weight Space Learning: Understanding, Representation, and Generation

arXiv:2603.10090v1 Announce Type: new Abstract: Neural network weights are typically viewed as the end product of training, while most deep learning research focuses on data, features, and architectures. However, recent advances show that the set of all possible weight values…

March 12, 2026

GOT-JEPA: Generic Object Tracking with Model Adaptation and Occlusion Handling using Joint-Embedding Predictive Architecture

arXiv:2602.14771v2 Announce Type: replace-cross Abstract: The human visual system tracks objects by integrating current observations with previously observed information, adapting to target and scene changes, and reasoning about occlusion at fine granularity. In contrast, recent generic object trackers are often…

March 12, 2026

Equivariant Asynchronous Diffusion: An Adaptive Denoising Schedule for Accelerated Molecular Conformation Generation

arXiv:2603.10093v1 Announce Type: new Abstract: Recent 3D molecular generation methods primarily use asynchronous auto-regressive or synchronous diffusion models. While auto-regressive models build molecules sequentially, they’re limited by a short horizon and a discrepancy between training and inference. Conversely, synchronous diffusion…

March 12, 2026

Quantum entanglement provides a competitive advantage in adversarial games

arXiv:2603.10289v1 Announce Type: cross Abstract: Whether uniquely quantum resources confer advantages in fully classical, competitive environments remains an open question. Competitive zero-sum reinforcement learning is particularly challenging, as success requires modelling dynamic interactions between opposing agents rather than static state-action…

March 12, 2026

Rethinking Adam for Time Series Forecasting: A Simple Heuristic to Improve Optimization under Distribution Shifts

arXiv:2603.10095v1 Announce Type: new Abstract: Time-series forecasting often faces challenges from non-stationarity, particularly distributional drift, where the data distribution evolves over time. This dynamic behavior can undermine the effectiveness of adaptive optimizers, such as Adam, which are typically designed for…

March 12, 2026

Dual Space Preconditioning for Gradient Descent in the Overparameterized Regime

arXiv:2603.10485v1 Announce Type: cross Abstract: In this work we study the convergence properties of the Dual Space Preconditioned Gradient Descent, encompassing optimizers such as Normalized Gradient Descent, Gradient Clipping and Adam. We consider preconditioners of the form $\nabla K$, where…

March 12, 2026

Denoising the US Census: Succinct Block Hierarchical Regression

arXiv:2603.10099v1 Announce Type: new Abstract: The US Census Bureau Disclosure Avoidance System (DAS) balances confidentiality and utility requirements for the decennial US Census (Abowd et al., 2022). The DAS was used in the 2020 Census to produce demographic datasets critically…

March 12, 2026

Self-Scaled Broyden Family of Quasi-Newton Methods in JAX

arXiv:2603.10599v1 Announce Type: cross Abstract: We present a JAX implementation of the Self-Scaled Broyden family of quasi-Newton methods, fully compatible with JAX and building on the Optimistix (Rader et al., 2024) optimisation library. The implementation includes BFGS, DFP, Broyden and their Self-Scaled variants (SSBFGS, SSDFP,…

March 12, 2026