Archives AI News

Jailbreaking LLMs Without Gradients or Priors: Effective and Transferable Attacks

arXiv:2601.03420v1 Announce Type: new Abstract: As Large Language Models (LLMs) are increasingly deployed in safety-critical domains, rigorously evaluating their robustness against adversarial jailbreaks is essential. However, current safety evaluations often overestimate robustness because existing automated attacks are limited by restrictive…

January 8, 2026

Compact Example-Based Explanations for Language Models

arXiv:2601.03786v1 Announce Type: cross Abstract: Training data influence estimation methods quantify the contribution of training documents to a model’s output, making them a promising source of information for example-based explanations. As humans cannot interpret thousands of documents, only a small…

January 8, 2026

Spectral Archaeology: The Causal Topology of Model Evolution

arXiv:2601.03424v1 Announce Type: new Abstract: Behavioral benchmarks tell us textit{what} a model does, but not textit{how}. We introduce a training-free mechanistic probe using attention-graph spectra. Treating each layer as a token graph, we compute algebraic connectivity ($lambda_2$), smoothness, and spectral…

January 8, 2026

Current Agents Fail to Leverage World Model as Tool for Foresight

arXiv:2601.03905v1 Announce Type: cross Abstract: Agents built on vision-language models increasingly face tasks that demand anticipating future states rather than relying on short-horizon reasoning. Generative world models offer a promising remedy: agents could use them as external simulators to foresee…

January 8, 2026

The Illusion of Specialization: Unveiling the Domain-Invariant “Standing Committee” in Mixture-of-Experts Models

arXiv:2601.03425v1 Announce Type: new Abstract: Mixture of Experts models are widely assumed to achieve domain specialization through sparse routing. In this work, we question this assumption by introducing COMMITTEEAUDIT, a post hoc framework that analyzes routing behavior at the level…

January 8, 2026

A Single-Loop Bilevel Deep Learning Method for Optimal Control of Obstacle Problems

arXiv:2601.04120v1 Announce Type: cross Abstract: Optimal control of obstacle problems arises in a wide range of applications and is computationally challenging due to its nonsmoothness, nonlinearity, and bilevel structure. Classical numerical approaches rely on mesh-based discretization and typically require solving…

January 8, 2026

VNU-Bench: A Benchmarking Dataset for Multi-Source Multimodal News Video Understanding

arXiv:2601.03434v1 Announce Type: new Abstract: News videos are carefully edited multimodal narratives that combine narration, visuals, and external quotations into coherent storylines. In recent years, there have been significant advances in evaluating multimodal large language models (MLLMs) for news video…

January 8, 2026

Tipping Point Forecasting in Non-Stationary Dynamics on Function Spaces

arXiv:2308.08794v2 Announce Type: replace Abstract: Tipping points are abrupt, drastic, and often irreversible changes in the evolution of non-stationary and chaotic dynamical systems. For instance, increased greenhouse gas concentrations are predicted to lead to drastic decreases in low cloud cover,…

January 8, 2026

Soft Contextualized Encoder For User Defined Text Classification

arXiv:2601.03450v1 Announce Type: new Abstract: User-Defined Text Classification (UDTC) considers the challenge of classifying input text to user-specified, previously unseen classes, a setting that arises frequently in real-world applications such as enterprise analytics, content moderation, and domain-specific information retrieval. We…

January 8, 2026

TimeDistill: Efficient Long-Term Time Series Forecasting with MLP via Cross-Architecture Distillation

arXiv:2502.15016v3 Announce Type: replace Abstract: Transformer-based and CNN-based methods demonstrate strong performance in long-term time series forecasting. However, their high computational and storage requirements can hinder large-scale deployment. To address this limitation, we propose integrating lightweight MLP with advanced architectures…

January 8, 2026