
Teaching Metric Distance to Discrete Autoregressive Language Models

arXiv:2503.02379v4 Announce Type: replace Abstract: As large language models expand beyond natural language to domains such as mathematics, multimodal understanding, and embodied agents, tokens increasingly reflect metric relationships rather than purely linguistic meaning. We introduce DIST2Loss, a distance-aware framework designed…

October 8, 2025

CMT-Benchmark: A Benchmark for Condensed Matter Theory Built by Expert Researchers

arXiv:2510.05228v1 Announce Type: new Abstract: Large language models (LLMs) have shown remarkable progress in coding and math problem-solving, but evaluation on advanced research-level problems in hard sciences remains scarce. To fill this gap, we present CMT-Benchmark, a dataset of 50…

October 8, 2025

Pref-GUIDE: Continual Policy Learning from Real-Time Human Feedback via Preference-Based Learning

arXiv:2508.07126v2 Announce Type: replace Abstract: Training reinforcement learning agents with human feedback is crucial when task objectives are difficult to specify through dense reward functions. While prior methods rely on offline trajectory comparisons to elicit human preferences, such data is…

October 8, 2025

Simultaneous Learning and Optimization via Misspecified Saddle Point Problems

arXiv:2510.05241v1 Announce Type: new Abstract: We study a class of misspecified saddle point (SP) problems, where the optimization objective depends on an unknown parameter that must be learned concurrently from data. Unlike existing studies that assume parameters are fully known…

October 8, 2025

ECLipsE-Gen-Local: Efficient Compositional Local Lipschitz Estimates for Deep Neural Networks

arXiv:2510.05261v1 Announce Type: new Abstract: The Lipschitz constant is a key measure for certifying the robustness of neural networks to input perturbations. However, computing the exact constant is NP-hard, and standard approaches to estimate the Lipschitz constant involve solving a…

October 8, 2025

A Generative Approach to LLM Harmfulness Mitigation with Red Flag Tokens

arXiv:2502.16366v4 Announce Type: replace-cross Abstract: Many safety post-training methods for large language models (LLMs) are designed to modify the model’s behaviour from producing unsafe answers to issuing refusals. However, such distribution shifts are often brittle and degrade performance on desirable…

October 8, 2025

Decoding Partial Differential Equations: Cross-Modal Adaptation of Decoder-only Models to PDEs

arXiv:2510.05278v1 Announce Type: new Abstract: Large language models have shown great success on natural language tasks in recent years, but they have also shown great promise when adapted to new modalities, e.g., for scientific machine learning tasks. Even though decoder-only…

October 8, 2025

A Fairness-Aware Strategy for B5G Physical-layer Security Leveraging Reconfigurable Intelligent Surfaces

arXiv:2506.06344v3 Announce Type: replace-cross Abstract: Reconfigurable Intelligent Surfaces are composed of physical elements that can dynamically alter electromagnetic wave properties to enhance beamforming and lead to improvements in areas with low coverage properties. Combined with Reinforcement Learning techniques, they have…

October 8, 2025

Adjusting the Output of Decision Transformer with Action Gradient

arXiv:2510.05285v1 Announce Type: new Abstract: Decision Transformer (DT), which integrates reinforcement learning (RL) with the transformer model, introduces a novel approach to offline RL. Unlike classical algorithms that take maximizing cumulative discounted rewards as objective, DT instead maximizes the likelihood…

October 8, 2025

Computing frustration and near-monotonicity in deep neural networks

arXiv:2510.05286v1 Announce Type: new Abstract: For the signed graph associated to a deep neural network, one can compute the frustration level, i.e., test how close or distant the graph is to structural balance. For all the pretrained deep convolutional neural…

October 8, 2025