Archives AI News

Reason in Chains, Learn in Trees: Self-Rectification and Grafting for Multi-turn Agent Policy Optimization

arXiv:2604.07165v1 Announce Type: cross Abstract: Reinforcement learning for Large Language Model agents is often hindered by sparse rewards in multi-step reasoning tasks. Existing approaches like Group Relative Policy Optimization treat sampled trajectories as independent chains, assigning uniform credit to all…

April 10, 2026

Gaussian Approximation for Asynchronous Q-learning

arXiv:2604.07323v1 Announce Type: cross Abstract: In this paper, we derive rates of convergence in the high-dimensional central limit theorem for Polyak-Ruppert averaged iterates generated by the asynchronous Q-learning algorithm with a polynomial stepsize $k^{-omega},, omega in (1/2, 1]$. Assuming that…

April 10, 2026

RAGEN-2: Reasoning Collapse in Agentic RL

arXiv:2604.06268v1 Announce Type: new Abstract: RL training of multi-turn LLM agents is inherently unstable, and reasoning quality directly determines task performance. Entropy is widely used to track reasoning stability. However, entropy only measures diversity within the same input, and cannot…

April 10, 2026

Evaluating PQC KEMs, Combiners, and Cascade Encryption via Adaptive IND-CPA Testing Using Deep Learning

arXiv:2604.06942v1 Announce Type: cross Abstract: Ensuring ciphertext indistinguishability is fundamental to cryptographic security, but empirically validating this property in real implementations and hybrid settings presents practical challenges. The transition to post-quantum cryptography (PQC), with its hybrid constructions combining classical and…

April 10, 2026

SMT-AD: a scalable quantum-inspired anomaly detection approach

arXiv:2604.06265v1 Announce Type: new Abstract: Quantum-inspired tensor networks algorithms have shown to be effective and efficient models for machine learning tasks, including anomaly detection. Here, we propose a highly parallelizable quantum-inspired approach which we call SMT-AD from Superposition of Multiresolution…

April 10, 2026

MO-RiskVAE: A Multi-Omics Variational Autoencoder for Survival Risk Modeling in Multiple MyelomaMO-RiskVAE

arXiv:2604.06267v1 Announce Type: new Abstract: Multimodal variational autoencoders (VAEs) have emerged as a powerful framework for survival risk modeling in multiple myeloma by integrating heterogeneous omics and clinical data. However, when trained under survival supervision, standard latent regularization strategies often…

April 10, 2026

$S^3$: Stratified Scaling Search for Test-Time in Diffusion Language Models

arXiv:2604.06260v1 Announce Type: new Abstract: Test-time scaling investigates whether a fixed diffusion language model (DLM) can generate better outputs when given more inference compute, without additional training. However, naive best-of-$K$ sampling is fundamentally limited because it repeatedly draws from the…

April 10, 2026

Spectral Edge Dynamics Reveal Functional Modes of Learning

arXiv:2604.06256v1 Announce Type: new Abstract: Training dynamics during grokking concentrate along a small number of dominant update directions — the spectral edge — which reliably distinguishes grokking from non-grokking regimes. We show that standard mechanistic interpretability tools (head attribution, activation…

April 10, 2026

FLeX: Fourier-based Low-rank EXpansion for multilingual transfer

arXiv:2604.06253v1 Announce Type: new Abstract: Cross-lingual code generation is critical in enterprise environments where multiple programming languages coexist. However, fine-tuning large language models (LLMs) individually for each language is computationally prohibitive. This paper investigates whether parameter-efficient fine-tuning methods and optimizer…

April 10, 2026

Asymptotic-Preserving Neural Networks for Viscoelastic Parameter Identification in Multiscale Blood Flow Modeling

arXiv:2604.06287v1 Announce Type: new Abstract: Mathematical models and numerical simulations offer a non-invasive way to explore cardiovascular phenomena, providing access to quantities that cannot be measured directly. In this study, we start with a one-dimensional multiscale blood flow model that…

April 10, 2026