Archives AI News

Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning

arXiv:2511.14617v3 Announce Type: replace-cross Abstract: Reinforcement Learning (RL) has emerged as a critical technique for advancing modern Large Language Models (LLMs), yet existing synchronous RL systems face severe performance bottlenecks. The rollout phase, which dominates end-to-end iteration time, suffers from…

Generating Counterfactual Patient Timelines from Real-World Data

arXiv:2604.02337v1 Announce Type: new Abstract: Counterfactual simulation – exploring hypothetical consequences under alternative clinical scenarios – holds promise for transformative applications such as personalized medicine and in silico trials. However, it remains challenging due to methodological limitations. Here, we show…

Central Limit Theorems for Stochastic Gradient Descent Quantile Estimators

arXiv:2503.02178v2 Announce Type: replace-cross Abstract: This paper develops asymptotic theory for quantile estimation via stochastic gradient descent (SGD) with a constant learning rate. The quantile loss function is neither smooth nor strongly convex. Beyond conventional perspectives and techniques, we view…

Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains

arXiv:2604.02343v1 Announce Type: new Abstract: We study the compression of LLM-generated text across lossless and lossy regimes, characterizing a compression-compute frontier where more compression is possible at the cost of more compute. For lossless compression, domain-adapted LoRA adapters can improve…

Steering Autoregressive Music Generation with Recursive Feature Machines

arXiv:2510.19127v2 Announce Type: replace Abstract: Controllable music generation remains a significant challenge, with existing methods often requiring model retraining or introducing audible artifacts. We introduce MusicRFM, a framework that adapts Recursive Feature Machines (RFMs) to enable fine-grained, interpretable control over…

LLM Reasoning with Process Rewards for Outcome-Guided Steps

arXiv:2604.02341v1 Announce Type: new Abstract: Mathematical reasoning in large language models has improved substantially with reinforcement learning using verifiable rewards, where final answers can be checked automatically and converted into reliable training signals. Most such pipelines optimize outcome correctness only,…

On Data-Driven Koopman Representations of Nonlinear Delay Differential Equations

arXiv:2604.03086v1 Announce Type: cross Abstract: This work establishes a rigorous bridge between infinite-dimensional delay dynamics and finite-dimensional Koopman learning, with explicit and interpretable error guarantees. While Koopman analysis is well-developed for ordinary differential equations (ODEs) and partially for partial differential…