Archives AI News

Low-Regret and Low-Complexity Learning for Hierarchical Inference

arXiv:2508.08985v3 Announce Type: replace Abstract: This work focuses on Hierarchical Inference (HI) in edge intelligence systems, where a compact Local-ML model on an end-device works in conjunction with a high-accuracy Remote-ML model on an edge-server. HI aims to reduce latency,…
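
The HI pattern described above can be sketched as a confidence-gated offload: the device accepts the Local-ML prediction when it is confident enough and forwards the sample to the Remote-ML model otherwise. The toy models, function names, and fixed threshold below are illustrative assumptions, not the paper's method.

```python
def local_model(x):
    # Toy compact on-device classifier: returns (label, confidence).
    return (1 if x > 0 else 0, min(1.0, abs(x)))

def remote_model(x):
    # Toy high-accuracy edge-server classifier.
    return 1 if x > 0 else 0

def hierarchical_infer(x, threshold=0.8):
    # Accept the local prediction when confidence clears the threshold;
    # offload uncertain samples to the remote model.
    label, conf = local_model(x)
    if conf >= threshold:
        return label, "local"
    return remote_model(x), "remote"

print(hierarchical_infer(2.0))   # confident sample served on-device
print(hierarchical_infer(0.1))   # uncertain sample offloaded
```

The learning problem the paper studies is, roughly, how to set such an offloading rule online with low regret; this sketch only shows the inference-time structure.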

Stable and Efficient Single-Rollout RL for Multimodal Reasoning

arXiv:2512.18215v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has become a key paradigm to improve the reasoning capabilities of Multimodal Large Language Models (MLLMs). However, prevalent group-based algorithms such as GRPO require multi-rollout sampling for each prompt.…
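
The group-based baseline that forces multi-rollout sampling can be sketched as follows. This is a generic GRPO-style group normalization, assumed for illustration rather than taken from the paper; it also shows why a single rollout degenerates, which motivates the single-rollout design above.

```python
from statistics import mean, pstdev

def group_advantages(rewards):
    # GRPO-style group-relative advantage: normalize each rollout's
    # reward by the group's mean and standard deviation. With one
    # rollout the std is zero and the learning signal vanishes, which
    # is why group-based methods sample several rollouts per prompt.
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0.0:            # degenerate group (e.g. one rollout)
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

print(group_advantages([1.0, 0.0, 1.0, 0.0]))  # -> [1.0, -1.0, 1.0, -1.0]
print(group_advantages([0.5]))                 # single rollout -> [0.0]
```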

Offline Behavioral Data Selection

arXiv:2512.18246v1 Announce Type: new Abstract: Behavioral cloning is a widely adopted approach for offline policy learning from expert demonstrations. However, the large scale of offline behavioral datasets often results in computationally intensive training when used in downstream tasks. In this…
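
Behavioral cloning itself reduces to supervised learning on expert (state, action) pairs; a minimal continuous-action MSE sketch is below (illustrative only, and distinct from the paper's data-selection method, which is elided by the truncation).

```python
def bc_loss(policy, demos):
    # Behavioral cloning as supervised regression: penalize the gap
    # between the policy's action and the expert's action on each
    # demonstrated state. Training cost scales with len(demos), which
    # is what data selection aims to cut.
    return sum((policy(s) - a) ** 2 for s, a in demos) / len(demos)

demos = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]   # toy expert: a = 2s
print(bc_loss(lambda s: 2.0 * s, demos))       # perfect fit -> 0.0
print(bc_loss(lambda s: 0.0, demos))           # poor policy -> positive loss
```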

On the Convergence Rate of LoRA Gradient Descent

arXiv:2512.18248v1 Announce Type: new Abstract: The low-rank adaptation (LoRA) algorithm for fine-tuning large models has grown popular in recent years due to its remarkable performance and low computational requirements. LoRA trains two “adapter” matrices that form a low-rank representation of…
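
The two adapter matrices can be sketched in plain Python: the frozen weight W is augmented by a rank-r product A @ B scaled by alpha/r, following the standard LoRA formulation. The dimensions and values below are toy assumptions; this sketches the parameterization whose gradient-descent convergence the paper analyzes, not the analysis itself.

```python
def matmul(X, Y):
    # Naive dense matrix product over nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(x, W, A, B, alpha=1.0, r=1):
    # Effective weight is W + (alpha / r) * (A @ B); W stays frozen
    # and only the low-rank adapters A (d_in x r) and B (r x d_out)
    # are trained.
    base = matmul(x, W)
    delta = matmul(matmul(x, A), B)
    s = alpha / r
    return [[b + s * d for b, d in zip(rb, rd)]
            for rb, rd in zip(base, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[1.0], [1.0]]             # 2x1 adapter (rank 1)
B = [[0.5, 0.5]]               # 1x2 adapter
print(lora_forward([[1.0, 2.0]], W, A, B))  # -> [[2.5, 3.5]]
```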

PEDESTRIAN: An Egocentric Vision Dataset for Obstacle Detection on Pavements

arXiv:2512.19190v1 Announce Type: cross Abstract: Walking has always been a primary mode of transportation and is recognized as an essential activity for maintaining good health. Despite the need for safe walking conditions in urban environments, sidewalks are frequently obstructed by…

Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning

arXiv:2301.11321v3 Announce Type: replace Abstract: Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, but counteracting off-policy bias without exacerbating variance is challenging. Classically, off-policy bias is corrected in a per-decision manner: past temporal-difference errors are re-weighted by…
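
The per-decision correction the abstract contrasts with its trajectory-aware approach can be sketched for an off-policy TD(λ) eligibility trace. The exact placement of the importance ratio varies across variants, so take this as one common form rather than the paper's method.

```python
def per_decision_trace(trace, grad, rho, gamma=0.99, lam=0.9):
    # One common per-decision off-policy trace update:
    #   e_t = rho_t * (gamma * lambda * e_{t-1} + grad_t)
    # where rho_t = pi(a_t|s_t) / mu(a_t|s_t) re-weights each
    # temporal-difference step by its own importance ratio.
    return [rho * (gamma * lam * e + g) for e, g in zip(trace, grad)]

# Two steps: an on-policy step (rho = 1) then a down-weighted one.
e = [0.0, 0.0]
for rho, grad in [(1.0, [1.0, 0.0]), (0.5, [0.0, 1.0])]:
    e = per_decision_trace(e, grad, rho)
print(e)   # earlier credit decayed by gamma*lambda and scaled by rho
```

A per-decision ratio near zero cuts off all earlier credit in the trace, which is one source of the bias/variance tension the abstract mentions.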

Enhancing Multi-Agent Collaboration with Attention-Based Actor-Critic Policies

arXiv:2507.22782v3 Announce Type: replace-cross Abstract: This paper introduces Team-Attention-Actor-Critic (TAAC), a reinforcement learning algorithm designed to enhance multi-agent collaboration in cooperative environments. TAAC employs a Centralized Training/Centralized Execution scheme incorporating multi-headed attention mechanisms in both the actor and critic. This…
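
Attention across teammates is the core mechanism here; below is a single-head scaled dot-product sketch over per-agent embeddings in pure Python. How TAAC wires this into its actor and critic (and its multi-headed form) is an assumption left to the paper; this only shows the attention primitive.

```python
import math

def attention(queries, keys, values):
    # Scaled dot-product attention: each agent's query attends over
    # all agents' keys, producing a convex combination of values.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(scores)                          # numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [ex / z for ex in exps]
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# One agent attending over two teammates' embeddings.
out = attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]],
                [[1.0, 0.0], [0.0, 1.0]])
print(out)   # weights favor the aligned teammate and sum to 1
```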