Archives AI News

Rethinking Entropy Interventions in RLVR: An Entropy Change Perspective

arXiv:2510.10150v3 Announce Type: replace Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) serves as a cornerstone technique for enhancing the reasoning capabilities of Large Language Models (LLMs). However, its training is often plagued by entropy collapse, a rapid decline in policy…

April 29, 2026

Rethinking Layer Redundancy in Large Language Models: Calibration Objectives and Search for Depth Pruning

arXiv:2604.24938v1 Announce Type: new Abstract: Depth pruning improves the inference efficiency of large language models by removing Transformer blocks. Prior work has focused on importance criteria and search algorithms, often treating layer redundancy as an inherent structural property of pretrained…

April 29, 2026

Evaluating LLM Safety Under Repeated Inference via Accelerated Prompt Stress Testing

arXiv:2602.11786v2 Announce Type: replace Abstract: Traditional benchmarks for large language models (LLMs), such as HELM and AIR-BENCH, primarily assess safety through breadth-oriented evaluation across diverse tasks and risk categories. However, real-world deployment often exposes a different class of risk: operational…

April 29, 2026

Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

arXiv:2604.24954v1 Announce Type: new Abstract: We introduce Nemotron 3 Nano Omni, the latest model in the Nemotron multimodal series and the first to natively support audio inputs alongside text, images, and video. Nemotron 3 Nano Omni delivers consistent accuracy improvements…

April 29, 2026

Sharp Capacity Scaling of Spectral Optimizers in Learning Associative Memory

arXiv:2603.26554v2 Announce Type: replace Abstract: Spectral optimizers such as Muon have recently shown strong empirical performance in large-scale language model training, but the source and extent of their advantage remain poorly understood. We study this question through the linear associative…

April 29, 2026

Compute Aligned Training: Optimizing for Test Time Inference

arXiv:2604.24957v1 Announce Type: new Abstract: Scaling test-time compute has emerged as a powerful mechanism for enhancing Large Language Model (LLM) performance. However, standard post-training paradigms, Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL), optimize the likelihood of individual samples under a…

April 29, 2026

AutoPPA: Automated Circuit PPA Optimization via Contrastive Code-based Rule Library Learning

arXiv:2604.18445v2 Announce Type: replace Abstract: Performance, power, and area (PPA) optimization is a fundamental task in RTL design, requiring a precise understanding of circuit functionality and the relationship between circuit structures and PPA metrics. Recent studies attempt to automate this…

April 29, 2026

CoreFlow: Low-Rank Matrix Generative Models

arXiv:2604.24959v1 Announce Type: new Abstract: Learning matrix-valued distributions from high-dimensional and possibly incomplete training data is challenging: ambient-space generative modeling is computationally expensive and statistically fragile when the matrix dimension is large but the sample size is limited. We propose…

April 29, 2026

OptProver: Bridging Olympiad and Optimization through Continual Training in Formal Theorem Proving

arXiv:2604.23712v2 Announce Type: replace Abstract: Recent advances in formal theorem proving have focused on Olympiad-level mathematics, leaving undergraduate domains largely unexplored. Optimization, fundamental to machine learning, operations research, and scientific computing, remains underserved by existing provers. Its reliance on domain-specific…

April 29, 2026

Enabling privacy-preserving AI training on everyday devices

A new method could bring more accurate and efficient AI models to high-stakes applications like health care and finance, even in under-resourced settings.

April 29, 2026