Archives AI News

Zero-Order Optimization for LLM Fine-Tuning via Learnable Direction Sampling

arXiv:2602.13659v1 Announce Type: new Abstract: Fine-tuning large pretrained language models (LLMs) is a cornerstone of modern NLP, yet its growing memory demands (driven by backpropagation and large optimizer states) limit deployment in resource-constrained settings. Zero-order (ZO) methods bypass backpropagation by…

February 17, 2026

Evolution Strategies at the Hyperscale

arXiv:2511.16652v2 Announce Type: replace Abstract: Evolution Strategies (ES) is a class of powerful black-box optimisation methods that are highly parallelisable and can handle non-differentiable and noisy objectives. However, naïve ES becomes prohibitively expensive at scale on GPUs due to the…

February 17, 2026

Optimized Certainty Equivalent Risk-Controlling Prediction Sets

arXiv:2602.13660v1 Announce Type: new Abstract: In safety-critical applications such as medical image segmentation, prediction systems must provide reliability guarantees that extend beyond conventional expected loss control. While risk-controlling prediction sets (RCPS) offer probabilistic guarantees on the expected risk, they fail…

February 17, 2026

Efficient Tensor Completion Algorithms for Highly Oscillatory Operators

arXiv:2510.17734v3 Announce Type: replace-cross Abstract: This paper presents low-complexity tensor completion algorithms and their efficient implementation to reconstruct highly oscillatory operators discretized as $n \times n$ matrices. The underlying tensor decomposition is based on the reshaping of the input matrix and…

February 17, 2026

Deep Two-Way Matrix Reordering for Relational Data Analysis

arXiv:2103.14203v5 Announce Type: replace-cross Abstract: Matrix reordering is a task to permute the rows and columns of a given observed matrix such that the resulting reordered matrix shows meaningful or interpretable structural patterns. Most existing matrix reordering techniques share the…

February 17, 2026

Lorica: A Synergistic Fine-Tuning Framework for Advancing Personalized Adversarial Robustness

arXiv:2506.05402v3 Announce Type: replace-cross Abstract: The growing use of large pre-trained models in edge computing has made model inference on mobile clients both feasible and popular. Yet these devices remain vulnerable to adversarial attacks, threatening model robustness and security. Federated…

February 17, 2026

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

arXiv:2509.16117v2 Announce Type: replace Abstract: Online reinforcement learning (RL) has been central to post-training language models, but its extension to diffusion models remains challenging due to intractable likelihoods. Recent works discretize the reverse sampling process to enable GRPO-style training, yet…

February 17, 2026

Critic-Guided Reinforcement Unlearning in Text-to-Image Diffusion

arXiv:2601.03213v3 Announce Type: replace Abstract: Machine unlearning in text-to-image diffusion models aims to remove targeted concepts while preserving overall utility. Prior diffusion unlearning methods typically rely on supervised weight edits or global penalties; reinforcement-learning (RL) approaches, while flexible, often optimize…

February 17, 2026

Faster Molecular Dynamics with Neural Network Potentials via Distilled Multiple Time-Stepping and Non-Conservative Forces

arXiv:2602.14975v1 Announce Type: cross Abstract: Following our previous work (J. Phys. Chem. Lett., 2026, 17, 5, 1288-1295), we propose the DMTS-NC approach, a distilled multi-time-step (DMTS) strategy using non-conservative (NC) forces to further accelerate atomistic molecular dynamics simulations using…

February 17, 2026

Robust Multi-Objective Controlled Decoding of Large Language Models

arXiv:2503.08796v2 Announce Type: replace Abstract: We introduce Robust Multi-Objective Decoding (RMOD), a novel inference-time algorithm that robustly aligns Large Language Models (LLMs) to multiple human objectives (e.g., instruction-following, helpfulness, safety) by maximizing the worst-case rewards. RMOD formulates the robust decoding…

February 17, 2026