Archives AI News

Exploring Time-Step Size in Reinforcement Learning for Sepsis Treatment

arXiv:2511.20913v1 Announce Type: new Abstract: Existing studies on reinforcement learning (RL) for sepsis management have mostly followed an established problem setup, in which patient data are aggregated into 4-hour time steps. Although concerns have been raised regarding the coarseness of…

November 27, 2025

Single- vs. Dual-Policy Reinforcement Learning for Dynamic Bike Rebalancing

arXiv:2402.03589v2 Announce Type: replace Abstract: Bike-sharing systems (BSS) provide a sustainable urban mobility solution, but ensuring their reliability requires effective rebalancing strategies to address stochastic demand and prevent station imbalances. This paper proposes reinforcement learning (RL) algorithms for dynamic rebalancing…

November 27, 2025

Operationalizing Quantized Disentanglement

arXiv:2511.20927v1 Announce Type: new Abstract: Recent theoretical work established the unsupervised identifiability of quantized factors under any diffeomorphism. The theory assumes that quantization thresholds correspond to axis-aligned discontinuities in the probability density of the latent factors. By constraining a learned…

November 27, 2025

No Request Left Behind: Tackling Heterogeneity in Long-Context LLM Inference with Medha

arXiv:2409.17264v5 Announce Type: replace Abstract: Deploying million-token Large Language Models (LLMs) is challenging because production workloads are highly heterogeneous, mixing short queries and long documents. This heterogeneity, combined with the quadratic complexity of attention, creates severe convoy effects where long-running…

November 27, 2025

A Gray-box Attack against Latent Diffusion Model-based Image Editing by Posterior Collapse

arXiv:2408.10901v4 Announce Type: replace-cross Abstract: Recent advancements in Latent Diffusion Models (LDMs) have revolutionized image synthesis and manipulation, raising significant concerns about data misappropriation and intellectual property infringement. While adversarial attacks have been extensively explored as a protective measure against…

November 27, 2025

QiMeng-SALV: Signal-Aware Learning for Verilog Code Generation

arXiv:2510.19296v3 Announce Type: replace Abstract: The remarkable progress of Large Language Models (LLMs) presents promising opportunities for Verilog code generation, which is important for automated circuit design. The lack of meaningful functional rewards hinders the preference optimization based on…

November 27, 2025

scipy.spatial.transform: Differentiable Framework-Agnostic 3D Transformations in Python

arXiv:2511.18157v2 Announce Type: replace Abstract: Three-dimensional rigid-body transforms, i.e. rotations and translations, are central to modern differentiable machine learning pipelines in robotics, vision, and simulation. However, numerically robust and mathematically correct implementations, particularly on SO(3), are error-prone due to issues…

November 27, 2025

A Unifying View of Linear Function Approximation in Off-Policy RL Through Matrix Splitting and Preconditioning

arXiv:2501.01774v3 Announce Type: replace Abstract: In off-policy policy evaluation (OPE) tasks within reinforcement learning, Temporal Difference Learning (TD) and Fitted Q-Iteration (FQI) have traditionally been viewed as differing in the number of updates toward the target value function: TD makes one…

November 27, 2025

Fair Algorithms with Probing for Multi-Agent Multi-Armed Bandits

arXiv:2506.14988v4 Announce Type: replace Abstract: We propose a multi-agent multi-armed bandit (MA-MAB) framework aimed at ensuring fair outcomes across agents while maximizing overall system performance. A key challenge in this setting is decision-making under limited information about arm rewards. To…

November 27, 2025

Differentiable Physics-Neural Models enable Learning of Non-Markovian Closures for Accelerated Coarse-Grained Physics Simulations

arXiv:2511.21369v1 Announce Type: cross Abstract: Numerical simulations provide key insights into many physical, real-world problems. However, while these simulations are solved on a full 3D domain, most analyses only require a reduced set of metrics (e.g. plane-level concentrations). This work…

November 27, 2025