Archives AI News

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

arXiv:2603.10101v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has significantly advanced the reasoning capacity of Large Language Models (LLMs). However, RLVR solely relies on final answers as outcome rewards, neglecting the correctness of intermediate reasoning steps. Training…

March 12, 2026

V2M-Zero: Zero-Pair Time-Aligned Video-to-Music Generation

arXiv:2603.11042v1 Announce Type: cross Abstract: Generating music that temporally aligns with video events is challenging for existing text-to-music models, which lack fine-grained temporal control. We introduce V2M-Zero, a zero-pair video-to-music generation approach that outputs time-aligned music for video. Our method…

March 12, 2026

Lost in the Middle at Birth: An Exact Theory of Transformer Position Bias

arXiv:2603.10123v1 Announce Type: new Abstract: The “Lost in the Middle” phenomenon — a U-shaped performance curve where LLMs retrieve well from the beginning and end of a context but fail in the middle — is widely attributed to learned Softmax…

March 12, 2026

Mamba Neural Operator: Who Wins? Transformers vs. State-Space Models for PDEs

arXiv:2410.02113v3 Announce Type: replace Abstract: Partial differential equations (PDEs) are widely used to model complex physical systems, but solving them efficiently remains a significant challenge. Recently, Transformers have emerged as the preferred architecture for PDEs due to their ability to…

March 12, 2026

3 Questions: Fortifying our planetary defenses

MIT astronomers are developing a new way to detect, monitor, and mitigate the threats posed by smaller asteroids to our critical space infrastructure.

March 12, 2026

Losing dimensions: Geometric memorization in generative diffusion

arXiv:2410.08727v2 Announce Type: replace-cross Abstract: Diffusion models power leading generative AI, but when and how they memorize training data, especially on low-dimensional manifolds, remains unclear. We find memorization emerges gradually, not abruptly: as data become scarce, diffusion models experience a…

March 12, 2026

Revisiting Value Iteration: Unified Analysis of Discounted and Average-Reward Cases

arXiv:2510.23914v2 Announce Type: replace Abstract: While Value Iteration (VI) is one of the most fundamental algorithms in Reinforcement Learning, its theoretical convergence guarantees still exhibit a persistent mismatch with empirical behavior. In the discounted-reward case, classical theory guarantees geometric convergence…

March 12, 2026

Latent Poincaré Shaping for Agentic Reinforcement Learning

arXiv:2602.09375v3 Announce Type: replace Abstract: We propose LaPha, a method for training AlphaZero-like LLM agents in a Poincaré latent space. Under LaPha, the search process can be visualized as a tree rooted at the prompt and growing outward from the…

March 12, 2026

Kernel Tests of Equivalence

arXiv:2603.10886v1 Announce Type: cross Abstract: We propose novel kernel-based tests for assessing the equivalence between distributions. Traditional goodness-of-fit testing is inappropriate for concluding the absence of distributional differences, because failure to reject the null hypothesis may simply be a result…

March 12, 2026

An Algorithm to perform Covariance-Adjusted Support Vector Classification in Non-Euclidean Spaces

arXiv:2504.04371v3 Announce Type: replace Abstract: Traditional Support Vector Machine (SVM) classification is carried out by finding the max-margin classifier for the training data that divides the margin space into two equal sub-spaces. This study demonstrates limitations of performing Support Vector…

March 12, 2026