Archives AI News

Rethinking Entropy Interventions in RLVR: An Entropy Change Perspective

arXiv:2510.10150v3 Announce Type: replace Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) serves as a cornerstone technique for enhancing the reasoning capabilities of Large Language Models (LLMs). However, its training is often plagued by entropy collapse, a rapid decline in policy…

Evaluating LLM Safety Under Repeated Inference via Accelerated Prompt Stress Testing

arXiv:2602.11786v2 Announce Type: replace Abstract: Traditional benchmarks for large language models (LLMs), such as HELM and AIR-BENCH, primarily assess safety through breadth-oriented evaluation across diverse tasks and risk categories. However, real-world deployment often exposes a different class of risk: operational…

Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

arXiv:2604.24954v1 Announce Type: new Abstract: We introduce Nemotron 3 Nano Omni, the latest model in the Nemotron multimodal series and the first to natively support audio inputs alongside text, images, and video. Nemotron 3 Nano Omni delivers consistent accuracy improvements…

Sharp Capacity Scaling of Spectral Optimizers in Learning Associative Memory

arXiv:2603.26554v2 Announce Type: replace Abstract: Spectral optimizers such as Muon have recently shown strong empirical performance in large-scale language model training, but the source and extent of their advantage remain poorly understood. We study this question through the linear associative…

Compute Aligned Training: Optimizing for Test Time Inference

arXiv:2604.24957v1 Announce Type: new Abstract: Scaling test-time compute has emerged as a powerful mechanism for enhancing Large Language Model (LLM) performance. However, standard post-training paradigms, Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL), optimize the likelihood of individual samples under a…

CoreFlow: Low-Rank Matrix Generative Models

arXiv:2604.24959v1 Announce Type: new Abstract: Learning matrix-valued distributions from high-dimensional and possibly incomplete training data is challenging: ambient-space generative modeling is computationally expensive and statistically fragile when the matrix dimension is large but the sample size is limited. We propose…