Learning Dynamics of VLM Finetuning
arXiv:2510.11978v1 Announce Type: new Abstract: Preference-based finetuning of vision–language models (VLMs) is brittle: trivially wrong negatives inject uninformative gradients that destabilize training. We recast alignment as learning-dynamics-aware optimization and introduce Cooling-Weighted DPO (CW-DPO), a two-stage recipe that explicitly models and…
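The abstract is truncated, but the named idea — down-weighting preference pairs whose negative is trivially wrong, so they stop injecting uninformative gradients — can be sketched on top of the standard DPO loss. This is a minimal illustration, not the paper's method: the function name `cw_dpo_loss`, the cooling-weight form `sigmoid(-margin / tau)`, and the parameter `tau` are all assumptions for exposition.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cw_dpo_loss(logp_chosen, logp_rejected,
                ref_chosen, ref_rejected,
                beta=0.1, tau=1.0):
    """Per-example DPO loss with a hypothetical "cooling" weight.

    The implicit-reward margin follows standard DPO:
        margin = beta * [(log pi(y_w|x) - log pi_ref(y_w|x))
                         - (log pi(y_l|x) - log pi_ref(y_l|x))]
    The cooling weight (an assumed form, not from the paper) is ~1 for
    hard pairs (margin near 0) and decays toward 0 as the negative
    becomes trivially wrong (large margin, little useful gradient).
    """
    margin = beta * ((logp_chosen - ref_chosen)
                     - (logp_rejected - ref_rejected))
    dpo = -np.log(sigmoid(margin))          # standard DPO loss term
    w = 2.0 * sigmoid(-margin / tau)        # cooling weight, w = 1 at margin 0
    return w * dpo

# A hard pair (margin 0) keeps its full loss; a pair with a trivially
# wrong negative (large margin) is cooled toward zero.
hard = cw_dpo_loss(-1.0, -1.0, -1.0, -1.0)   # margin = 0
easy = cw_dpo_loss(-1.0, -11.0, -1.0, -1.0)  # margin = 1.0
```

In this sketch the hard pair retains the usual `-log 0.5 ≈ 0.693` loss while the easy pair's contribution shrinks, which is one plausible way to keep trivially wrong negatives from dominating the gradient.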
