Archives AI News

Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis

arXiv:2502.04416v3 Announce Type: replace Abstract: Scaling large language models (LLMs) improves performance but significantly increases inference costs, with feed-forward networks (FFNs) consuming the majority of computational resources. While Mixture-of-Experts (MoE) architectures can reduce this cost through sparse activation, restructuring existing…

April 24, 2026

ICNN-enhanced 2SP: Leveraging input convex neural networks for solving two-stage stochastic programming

arXiv:2505.05261v3 Announce Type: replace-cross Abstract: Two-stage stochastic programming (2SP) offers a basic framework for modelling decision-making under uncertainty, yet scalability remains a challenge due to the computational complexity of recourse function evaluation. Existing learning-based methods like Neural Two-Stage Stochastic Programming…

April 24, 2026

Beyond Expected Information Gain: Stable Bayesian Optimal Experimental Design with Integral Probability Metrics and Plug-and-Play Extensions

arXiv:2604.21849v1 Announce Type: cross Abstract: Bayesian Optimal Experimental Design (BOED) provides a rigorous framework for decision-making tasks in which data acquisition is often the critical bottleneck, especially in resource-constrained settings. Traditionally, BOED typically selects designs by maximizing expected information gain…

April 24, 2026

LAF-Based Evaluation and UTTL-Based Learning Strategies with MIATTs

arXiv:2604.20944v1 Announce Type: new Abstract: In many real-world machine learning (ML) applications, the true target cannot be precisely defined due to ambiguity or subjectivity information. To address this challenge, under the assumption that the true target for a given ML…

April 24, 2026

Differentially Private Model Merging

arXiv:2604.20985v1 Announce Type: new Abstract: In machine learning applications, privacy requirements during inference or deployment time could change constantly due to varying policies, regulations, or user experience. In this work, we aim to generate a magnitude of models to satisfy…

April 24, 2026

Revealing Geography-Driven Signals in Zone-Level Claim Frequency Models: An Empirical Study using Environmental and Visual Predictors

arXiv:2604.21893v1 Announce Type: cross Abstract: Geographic context is often consider relevant to motor insurance risk, yet public actuarial datasets provide limited location identifiers, constraining how this information can be incorporated and evaluated in claim-frequency models. This study examines how geographic…

April 24, 2026

HyperAdapt: Simple High-Rank Adaptation

arXiv:2509.18629v3 Announce Type: replace Abstract: Foundation models excel across diverse tasks, but adapting them to specialized applications often requires fine-tuning, an approach that is memory and compute-intensive. Parameter-efficient fine-tuning (PEFT) methods mitigate this by updating only a small subset of…

April 24, 2026

Mitigating Lost in Multi-turn Conversation via Curriculum RL with Verifiable Accuracy and Abstention Rewards

arXiv:2510.18731v2 Announce Type: replace-cross Abstract: Large Language Models demonstrate strong capabilities in single-turn instruction following but suffer from Lost-in-Conversation (LiC), a degradation in performance as information is revealed progressively in multi-turn settings. Motivated by the current progress on Reinforcement Learning…

April 24, 2026

Early Detection of Latent Microstructure Regimes in Limit Order Books

arXiv:2604.20949v1 Announce Type: new Abstract: Limit order books can transition rapidly from stable to stressed conditions, yet standard early-warning signals such as order flow imbalance and short-term volatility are inherently reactive. We formalise this limitation via a three-regime causal data-generating…

April 24, 2026

Droplet-LNO: Physics-Informed Laplace Neural Operators for Accurate Prediction of Droplet Spreading Dynamics on Complex Surfaces

arXiv:2604.20993v1 Announce Type: new Abstract: Spreading of liquid droplets on solid substrates constitutes a classic multiphysics problem with widespread applications ranging from inkjet printing, spray cooling, to biomedical microfluidic systems. Yet, accurate computational fluid dynamic (CFD) simulations are prohibitively expensive,…

April 24, 2026