Archives AI News

Differentially Private Model Merging

arXiv:2604.20985v1 Announce Type: new Abstract: In machine learning applications, privacy requirements during inference or deployment time could change constantly due to varying policies, regulations, or user experience. In this work, we aim to generate a magnitude of models to satisfy…

April 24, 2026

HyperAdapt: Simple High-Rank Adaptation

arXiv:2509.18629v3 Announce Type: replace Abstract: Foundation models excel across diverse tasks, but adapting them to specialized applications often requires fine-tuning, an approach that is memory and compute-intensive. Parameter-efficient fine-tuning (PEFT) methods mitigate this by updating only a small subset of…

April 24, 2026

Droplet-LNO: Physics-Informed Laplace Neural Operators for Accurate Prediction of Droplet Spreading Dynamics on Complex Surfaces

arXiv:2604.20993v1 Announce Type: new Abstract: Spreading of liquid droplets on solid substrates constitutes a classic multiphysics problem with widespread applications ranging from inkjet printing, spray cooling, to biomedical microfluidic systems. Yet, accurate computational fluid dynamic (CFD) simulations are prohibitively expensive,…

April 24, 2026

Tree Training: Accelerating Agentic LLMs Training via Shared Prefix Reuse

arXiv:2511.00413v5 Announce Type: replace Abstract: Agentic large language model (LLM) training often involves multi-turn interaction trajectories that branch into multiple execution paths due to concurrent tool use, think-mode, sub-agent, context management and other runtime designs. As a result, the tokens…

April 24, 2026

SGD at the Edge of Stability: The Stochastic Sharpness Gap

arXiv:2604.21016v1 Announce Type: new Abstract: When training neural networks with full-batch gradient descent (GD) and step size $\eta$, the largest eigenvalue of the Hessian, the sharpness $S(\boldsymbol{\theta})$, rises to $2/\eta$ and hovers there, a phenomenon termed the Edge…

April 24, 2026

BackPlay: Head-Only Look-Back Self-Correction for Diffusion Language Models

arXiv:2601.06428v3 Announce Type: replace Abstract: Diffusion Language Models (DLMs) decode multiple tokens in parallel, but aggressive multi-token decoding amplifies cross-token dependency errors and can sharply degrade generation quality. We propose BackPlay, a frozen-backbone self-correction framework that trains only a lightweight…

April 24, 2026

MCAP: Deployment-Time Layer Profiling for Memory-Constrained LLM Inference

arXiv:2604.21026v1 Announce Type: new Abstract: Deploying large language models to heterogeneous hardware is often constrained by memory, not compute. We introduce MCAP (Monte Carlo Activation Profiling), a load-time per-layer importance estimator that enables dynamic precision and memory placement decisions on…

April 24, 2026

Continuous-Utility Direct Preference Optimization

arXiv:2602.00931v2 Announce Type: replace Abstract: Large language model reasoning is often treated as a monolithic capability, relying on binary preference supervision that fails to capture partial progress or fine-grained reasoning quality. We introduce Continuous Utility Direct Preference Optimization (CU-DPO), a…

April 24, 2026

A Deep U-Net Framework for Flood Hazard Mapping Using Hydraulic Simulations of the Wupper Catchment

arXiv:2604.21028v1 Announce Type: new Abstract: The increasing frequency and severity of global flood events highlight the need for the development of rapid and reliable flood prediction tools. This process traditionally relies on computationally expensive hydraulic simulations. This research presents a…

April 24, 2026

BioTrain: Sub-MB, Sub-50mW On-Device Fine-Tuning for Edge-AI on Biosignals

arXiv:2604.13359v2 Announce Type: replace Abstract: Biosignals exhibit substantial cross-subject and cross-session variability, inducing severe domain shifts that degrade post-deployment performance for small, edge-oriented AI models. On-device adaptation is therefore essential to both preserve user privacy and ensure system reliability. However,…

April 24, 2026