Archives AI News

Evo: Autoregressive-Diffusion Large Language Models with Evolving Balance

arXiv:2603.06617v1 Announce Type: new Abstract: We introduce Evo, a duality latent trajectory model that bridges autoregressive (AR) and diffusion-based language generation within a continuous evolutionary generative framework. Rather than treating AR decoding and diffusion generation as separate paradigms, Evo reconceptualizes…

March 10, 2026

In-Run Data Shapley for Adam Optimizer

arXiv:2602.00329v3 Announce Type: replace Abstract: Reliable data attribution is essential for mitigating bias and reducing computational waste in modern machine learning, with the Shapley value serving as the theoretical gold standard. While recent “In-Run” methods bypass the prohibitive cost of…

March 10, 2026

Distilling and Adapting: A Topology-Aware Framework for Zero-Shot Interaction Prediction in Multiplex Biological Networks

arXiv:2603.06618v1 Announce Type: new Abstract: Multiplex Biological Networks (MBNs), which represent multiple interaction types between entities, are crucial for understanding complex biological systems. Yet, existing methods often inadequately model multiplexity, struggle to integrate structural and sequence information, and face difficulties…

March 10, 2026

When AI Levels the Playing Field: Skill Homogenization, Asset Concentration, and Two Regimes of Inequality

arXiv:2603.05565v2 Announce Type: replace Abstract: Generative AI compresses within-task skill differences while shifting economic value toward concentrated complementary assets, creating an apparent paradox: the technology that equalizes individual performance may widen aggregate inequality. We formalize this tension in a task-based…

March 10, 2026

Not all tokens are needed(NAT): token efficient reinforcement learning

arXiv:2603.06619v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a key driver of progress in large language models, but scaling RL to long chain-of-thought (CoT) trajectories is increasingly constrained by backpropagation over every generated token. Even with optimized rollout…

March 10, 2026

Stronger Enforcement of Instruction Hierarchy via Augmented Intermediate Representations

arXiv:2505.18907v2 Announce Type: replace-cross Abstract: Prompt injection attacks are a critical security vulnerability in large language models (LLMs), allowing attackers to hijack model behavior by injecting malicious instructions within the input context. Recent defense mechanisms have leveraged an Instruction Hierarchy…

March 10, 2026

Reward Under Attack: Analyzing the Robustness and Hackability of Process Reward Models

arXiv:2603.06621v1 Announce Type: new Abstract: Process Reward Models (PRMs) are rapidly becoming the backbone of LLM reasoning pipelines, yet we demonstrate that state-of-the-art PRMs are systematically exploitable under adversarial optimization pressure. To address this, we introduce a three-tiered diagnostic framework…

March 10, 2026

Crowdsourcing the Frontier: Advancing Hybrid Physics-ML Climate Simulation via a $50,000 Kaggle Competition

arXiv:2511.20963v4 Announce Type: replace-cross Abstract: Subgrid machine-learning (ML) parameterizations have the potential to introduce a new generation of climate models that incorporate the effects of higher-resolution physics without incurring the prohibitive computational cost associated with more explicit physics-based simulations. However,…

March 10, 2026

From ARIMA to Attention: Power Load Forecasting Using Temporal Deep Learning

arXiv:2603.06622v1 Announce Type: new Abstract: Accurate short-term power load forecasting is important to effectively manage, optimize, and ensure the robustness of modern power systems. This paper performs an empirical evaluation of a traditional statistical model and deep learning approaches for…

March 10, 2026

MEM: Multi-Scale Embodied Memory for Vision Language Action Models

arXiv:2603.03596v2 Announce Type: replace-cross Abstract: Conventionally, memory in end-to-end robotic learning involves inputting a sequence of past observations into the learned policy. However, in complex multi-stage real-world tasks, the robot’s memory must represent past events at multiple levels of granularity:…

March 10, 2026