Archives AI News

Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning

arXiv:2511.14617v3 Announce Type: replace-cross Abstract: Reinforcement Learning (RL) has emerged as a critical technique for advancing modern Large Language Models (LLMs), yet existing synchronous RL systems face severe performance bottlenecks. The rollout phase, which dominates end-to-end iteration time, suffers from…

April 6, 2026

Generating Counterfactual Patient Timelines from Real-World Data

arXiv:2604.02337v1 Announce Type: new Abstract: Counterfactual simulation – exploring hypothetical consequences under alternative clinical scenarios – holds promise for transformative applications such as personalized medicine and in silico trials. However, it remains challenging due to methodological limitations. Here, we show…

April 6, 2026

The Geometry of Multi-Task Grokking: Transverse Instability, Superposition, and Weight Decay Phase Structure

arXiv:2602.18523v3 Announce Type: replace Abstract: Grokking — the abrupt transition from memorization to generalization long after near-zero training loss — has been studied mainly in single-task settings. We extend geometric analysis to multi-task modular arithmetic, training shared-trunk Transformers on dual-task…

April 6, 2026

Central Limit Theorems for Stochastic Gradient Descent Quantile Estimators

arXiv:2503.02178v2 Announce Type: replace-cross Abstract: This paper develops asymptotic theory for quantile estimation via stochastic gradient descent (SGD) with a constant learning rate. The quantile loss function is neither smooth nor strongly convex. Beyond conventional perspectives and techniques, we view…

April 6, 2026

Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains

arXiv:2604.02343v1 Announce Type: new Abstract: We study the compression of LLM-generated text across lossless and lossy regimes, characterizing a compression-compute frontier where more compression is possible at the cost of more compute. For lossless compression, domain-adapted LoRA adapters can improve…

April 6, 2026

Steering Autoregressive Music Generation with Recursive Feature Machines

arXiv:2510.19127v2 Announce Type: replace Abstract: Controllable music generation remains a significant challenge, with existing methods often requiring model retraining or introducing audible artifacts. We introduce MusicRFM, a framework that adapts Recursive Feature Machines (RFMs) to enable fine-grained, interpretable control over…

April 6, 2026

LLM Reasoning with Process Rewards for Outcome-Guided Steps

arXiv:2604.02341v1 Announce Type: new Abstract: Mathematical reasoning in large language models has improved substantially with reinforcement learning using verifiable rewards, where final answers can be checked automatically and converted into reliable training signals. Most such pipelines optimize outcome correctness only,…

April 6, 2026

Homophily-aware Supervised Contrastive Counterfactual Augmented Fair Graph Neural Network

arXiv:2604.02342v1 Announce Type: new Abstract: In recent years, Graph Neural Networks (GNNs) have achieved remarkable success in tasks such as node classification, link prediction, and graph representation learning. However, they remain susceptible to biases that can arise not only from…

April 6, 2026

Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models

arXiv:2604.02340v1 Announce Type: new Abstract: Recent advances in masked diffusion language models (MDLMs) narrow the quality gap to autoregressive LMs, but their sampling remains expensive because generation requires many full-sequence denoising passes with a large Transformer and, unlike autoregressive decoding,…

April 6, 2026

SIEVE: Sample-Efficient Parametric Learning from Natural Language

arXiv:2604.02339v1 Announce Type: new Abstract: Natural language context, such as instructions, knowledge, or feedback, contains rich signal for adapting language models. While in-context learning provides adaptation via the prompt, parametric learning persists into model weights and can improve performance further, though is…

April 6, 2026