Mechanistic Interpretability with Sparse Autoencoder Neural Operators

arXiv:2509.03738v3 Announce Type: replace Abstract: We introduce sparse autoencoder neural operators (SAE-NOs), a new class of sparse autoencoders that operate directly in infinite-dimensional function spaces. We generalize the linear representation hypothesis to a functional representation hypothesis, enabling concept learning beyond…

Adaptive Time Series Reasoning via Segment Selection

arXiv:2602.18645v1 Announce Type: new Abstract: Time series reasoning tasks often start with a natural language question and require targeted analysis of a time series. Evidence may span the full series or appear in a few short intervals, so the model…

The Unseen Frontier: Pushing the Limits of LLM Sparsity with Surrogate-Free ADMM

arXiv:2510.01650v2 Announce Type: replace Abstract: Neural network pruning is a promising technique to mitigate the excessive computational and memory requirements of large language models (LLMs). Despite its promise, however, progress in this area has diminished, as conventional methods are seemingly…

Information-Guided Noise Allocation for Efficient Diffusion Training

arXiv:2602.18647v1 Announce Type: new Abstract: Training diffusion models typically relies on manually tuned noise schedules, which can waste computation on weakly informative noise regions and limit transfer across datasets, resolutions, and representations. We revisit noise schedule allocation through an information-theoretic…

Leak@$k$: Unlearning Does Not Make LLMs Forget Under Probabilistic Decoding

arXiv:2511.04934v2 Announce Type: replace Abstract: Unlearning in large language models (LLMs) is critical for regulatory compliance and for building ethical generative AI systems that avoid producing private, toxic, illegal, or copyrighted content. Despite rapid progress, in this work we show…

Efficient Discriminative Joint Encoders for Large Scale Vision-Language Reranking

arXiv:2510.06820v2 Announce Type: replace-cross Abstract: Multimodal retrieval still leans on embedding-based models like CLIP for fast vector search over pre-computed image embeddings. Yet, unlike text retrieval, where joint-encoder rerankers are standard, comparable vision-language rerankers are largely absent. We find that…

Interpretable Failure Analysis in Multi-Agent Reinforcement Learning Systems

arXiv:2602.08104v2 Announce Type: replace-cross Abstract: Multi-Agent Reinforcement Learning (MARL) is increasingly deployed in safety-critical domains, yet methods for interpretable failure detection and attribution remain underdeveloped. We introduce a two-stage gradient-based framework that provides interpretable diagnostics for three critical failure analysis…