Archives AI News

IRIS: Interpolative Rényi Iterative Self-play for Large Language Model Fine-Tuning

arXiv:2604.20933v1 Announce Type: new Abstract: Self-play fine-tuning enables large language models to improve beyond supervised fine-tuning without additional human annotations by contrasting annotated responses with self-generated ones. Many existing methods rely on a fixed divergence regime. SPIN is closely related…

April 24, 2026

ILDR: Geometric Early Detection of Grokking

arXiv:2604.20923v1 Announce Type: new Abstract: Grokking describes a delayed generalization phenomenon in which a neural network achieves perfect training accuracy long before validation accuracy improves, followed by an abrupt transition to strong generalization. Existing detection signals are indirect: weight norm…

April 24, 2026

Neural surrogates for crystal growth dynamics with variable supersaturation: explicit vs. implicit conditioning

arXiv:2604.21753v1 Announce Type: cross Abstract: Simulations of crystal growth are performed by using Convolutional Recurrent Neural Network surrogate models, trained on a dataset of time sequences computed by numerical integration of Allen-Cahn dynamics including faceting via kinetic anisotropy. Two network…

April 24, 2026

ICNN-enhanced 2SP: Leveraging input convex neural networks for solving two-stage stochastic programming

arXiv:2505.05261v3 Announce Type: replace-cross Abstract: Two-stage stochastic programming (2SP) offers a basic framework for modelling decision-making under uncertainty, yet scalability remains a challenge due to the computational complexity of recourse function evaluation. Existing learning-based methods like Neural Two-Stage Stochastic Programming…

April 24, 2026

Beyond Expected Information Gain: Stable Bayesian Optimal Experimental Design with Integral Probability Metrics and Plug-and-Play Extensions

arXiv:2604.21849v1 Announce Type: cross Abstract: Bayesian Optimal Experimental Design (BOED) provides a rigorous framework for decision-making tasks in which data acquisition is often the critical bottleneck, especially in resource-constrained settings. Traditionally, BOED selects designs by maximizing expected information gain…

April 24, 2026

LAF-Based Evaluation and UTTL-Based Learning Strategies with MIATTs

arXiv:2604.20944v1 Announce Type: new Abstract: In many real-world machine learning (ML) applications, the true target cannot be precisely defined due to ambiguous or subjective information. To address this challenge, under the assumption that the true target for a given ML…

April 24, 2026

Data-Driven Open-Loop Simulation for Digital-Twin Operator Decision Support in Wastewater Treatment

arXiv:2604.20935v1 Announce Type: new Abstract: Wastewater treatment plants (WWTPs) need digital-twin-style decision support tools that can simulate plant response under prescribed control plans, tolerate irregular and missing sensing, and remain informative over 12-36 h planning horizons. Meeting these requirements with…

April 24, 2026

Revealing Geography-Driven Signals in Zone-Level Claim Frequency Models: An Empirical Study using Environmental and Visual Predictors

arXiv:2604.21893v1 Announce Type: cross Abstract: Geographic context is often considered relevant to motor insurance risk, yet public actuarial datasets provide limited location identifiers, constraining how this information can be incorporated and evaluated in claim-frequency models. This study examines how geographic…

April 24, 2026

Mitigating Lost in Multi-turn Conversation via Curriculum RL with Verifiable Accuracy and Abstention Rewards

arXiv:2510.18731v2 Announce Type: replace-cross Abstract: Large Language Models demonstrate strong capabilities in single-turn instruction following but suffer from Lost-in-Conversation (LiC), a degradation in performance as information is revealed progressively in multi-turn settings. Motivated by the current progress on Reinforcement Learning…

April 24, 2026

Early Detection of Latent Microstructure Regimes in Limit Order Books

arXiv:2604.20949v1 Announce Type: new Abstract: Limit order books can transition rapidly from stable to stressed conditions, yet standard early-warning signals such as order flow imbalance and short-term volatility are inherently reactive. We formalise this limitation via a three-regime causal data-generating…

April 24, 2026