Archives AI News

Experiential Reflective Learning for Self-Improving LLM Agents

arXiv:2603.24639v2 Announce Type: replace Abstract: Recent advances in large language models (LLMs) have enabled the development of autonomous agents capable of complex reasoning and multi-step problem solving. However, these agents struggle to adapt to specialized environments and do not leverage…

April 2, 2026

Offline Constrained RLHF with Multiple Preference Oracles

arXiv:2604.00200v1 Announce Type: new Abstract: We study offline constrained reinforcement learning from human feedback with multiple preference oracles. Motivated by applications that trade off performance with safety or fairness, we aim to maximize target population utility subject to a minimum…

April 2, 2026

High Probability Complexity Bounds of Trust-Region Stochastic Sequential Quadratic Programming with Heavy-Tailed Noise

arXiv:2503.19091v3 Announce Type: replace-cross Abstract: In this paper, we consider nonlinear optimization problems with a stochastic objective and deterministic equality constraints. We propose a Trust-Region Stochastic Sequential Quadratic Programming (TR-SSQP) method and establish its high-probability iteration complexity bounds for identifying…

April 2, 2026

Unsupervised 4D Flow MRI Velocity Enhancement and Unwrapping Using Divergence-Free Neural Networks

arXiv:2604.00205v1 Announce Type: new Abstract: This work introduces an unsupervised Divergence and Aliasing-Free neural network (DAF-FlowNet) for 4D Flow Magnetic Resonance Imaging (4D Flow MRI) that jointly enhances noisy velocity fields and corrects phase wrapping artifacts. DAF-FlowNet parameterizes velocities as…

April 2, 2026

TempoControl: Temporal Attention Guidance for Text-to-Video Models

arXiv:2510.02226v3 Announce Type: replace-cross Abstract: Recent advances in generative video models have enabled the creation of high-quality videos based on natural language prompts. However, these models frequently lack fine-grained temporal control, meaning they do not allow users to specify when…

April 2, 2026

Lead Zirconate Titanate Reservoir Computing for Classification of Written and Spoken Digits

arXiv:2604.00207v1 Announce Type: new Abstract: In this paper we extend our earlier work of (Rietman et al. 2022) presenting an application of physical Reservoir Computing (RC) to the classification of handwritten and spoken digits. We utilize an unpoled cube of…

April 2, 2026

Exact Graph Learning via Integer Programming

arXiv:2601.20589v2 Announce Type: replace-cross Abstract: Learning the dependence structure among variables in complex systems is a central problem across medical, natural, and social sciences. These structures can be naturally represented by graphs, and the task of inferring such graphs from…

April 2, 2026

Measuring the Representational Alignment of Neural Systems in Superposition

arXiv:2604.00208v1 Announce Type: new Abstract: Comparing the internal representations of neural networks is a central goal in both neuroscience and machine learning. Standard alignment metrics operate on raw neural activations, implicitly assuming that similar representations produce similar activity patterns. However,…

April 2, 2026

Preference Guided Iterated Pareto Referent Optimisation for Accessible Route Planning

arXiv:2604.00795v1 Announce Type: cross Abstract: We propose the Preference Guided Iterated Pareto Referent Optimisation (PG-IPRO) for urban route planning for people with different accessibility requirements and preferences. With this algorithm the user can interact with the system by giving feedback…

April 2, 2026

Diversity-Aware Reverse Kullback-Leibler Divergence for Large Language Model Distillation

arXiv:2604.00223v1 Announce Type: new Abstract: Reverse Kullback-Leibler (RKL) divergence has recently emerged as the preferred objective for large language model (LLM) distillation, consistently outperforming forward KL (FKL), particularly in regimes with large vocabularies and significant teacher-student capacity mismatch, where RKL…

April 2, 2026