Archives AI News

Utilizing and Calibrating Hindsight Process Rewards via Reinforcement with Mutual Information Self-Evaluation

arXiv:2604.11611v1 Announce Type: cross Abstract: To overcome the sparse reward challenge in reinforcement learning (RL) for agents based on large language models (LLMs), we propose Mutual Information Self-Evaluation (MISE), an RL paradigm that utilizes hindsight generative self-evaluation as dense reward…

April 14, 2026

Regularized Entropy Information Adaptation with Temporal-Awareness Networks for Simultaneous Speech Translation

arXiv:2604.09916v1 Announce Type: new Abstract: Simultaneous Speech Translation (SimulST) requires balancing high translation quality with low latency. Recent work introduced REINA, a method that trains a Read/Write policy based on estimating the information gain of reading more audio. However, we…

April 14, 2026

CROP: Conservative Reward for Model-based Offline Policy Optimization

arXiv:2310.17245v2 Announce Type: replace Abstract: Offline reinforcement learning (RL) aims to optimize a policy using collected data without online interactions. Model-based approaches are particularly appealing for addressing offline RL challenges because of their capability to mitigate the limitations of data…

April 14, 2026

A Tale of Two Temperatures: Simple, Efficient, and Diverse Sampling from Diffusion Language Models

arXiv:2604.09921v1 Announce Type: new Abstract: Much work has been done on designing fast and accurate sampling for diffusion language models (dLLMs). However, these efforts have largely focused on the tradeoff between speed and quality of individual samples; how to additionally…

April 14, 2026

Quotation-Based Data Retention Mechanism for Data Privacy in LLM-Empowered Network Services

arXiv:2503.23001v5 Announce Type: replace Abstract: The deployment of large language models (LLMs) for next-generation network optimization introduces novel data governance challenges. Mobile network operators (MNOs) increasingly leverage generative artificial intelligence (AI) for traffic prediction, anomaly detection, and service personalization, requiring…

April 14, 2026

K-STEMIT: Knowledge-Informed Spatio-Temporal Efficient Multi-Branch Graph Neural Network for Subsurface Stratigraphy Thickness Estimation from Radar Data

arXiv:2604.09922v1 Announce Type: new Abstract: Subsurface stratigraphy contains important spatio-temporal information about accumulation, deformation, and layer formation in polar ice sheets. In particular, variations in internal ice layer thickness provide valuable constraints for snow mass balance estimation and projections of…

April 14, 2026

Teaching the Teacher: The Role of Teacher-Student Smoothness Alignment in Genetic Programming-based Symbolic Distillation

arXiv:2507.22767v3 Announce Type: replace Abstract: Obtaining human-readable symbolic formulas via genetic programming-based symbolic distillation of a deep neural network trained on a target dataset presents a promising yet underexplored pathway toward explainable artificial intelligence (XAI). However, the standard pipeline frequently…

April 14, 2026

A Hybrid Intelligent Framework for Uncertainty-Aware Condition Monitoring of Industrial Systems

arXiv:2604.09932v1 Announce Type: new Abstract: Hybrid approaches that combine data-driven learning with physics-based insight have shown promise for improving the reliability of industrial condition monitoring. This work develops a hybrid condition monitoring framework that integrates primary sensor measurements, lagged temporal…

April 14, 2026

Find Your Optimal Teacher: Personalized Data Synthesis via Router-Guided Multi-Teacher Distillation

arXiv:2510.10925v2 Announce Type: replace Abstract: Training student models on synthetic data generated by strong teacher models is a promising way to distill the capabilities of teachers. However, recent studies show that stronger models are not always optimal teachers, revealing a…

April 14, 2026

Vestibular reservoir computing

arXiv:2604.09943v1 Announce Type: new Abstract: Reservoir computing (RC) is a computational framework known for its training efficiency, making it ideal for physical hardware implementations. However, realizing the complex interconnectivity of traditional reservoirs in physical systems remains a significant challenge. This…

April 14, 2026