Archives AI News

Federated Learning of Nonlinear Temporal Dynamics with Graph Attention-based Cross-Client Interpretability

arXiv:2602.13485v1 Announce Type: new Abstract: Networks of modern industrial systems are increasingly monitored by distributed sensors, where each system comprises multiple subsystems generating high dimensional time series data. These subsystems are often interdependent, making it important to understand how temporal…

February 17, 2026

Online Posterior Sampling with a Diffusion Prior

arXiv:2410.03919v2 Announce Type: replace Abstract: Posterior sampling in contextual bandits with a Gaussian prior can be implemented exactly or approximately using the Laplace approximation. The Gaussian prior is computationally efficient but it cannot describe complex distributions. In this work, we…

February 17, 2026

Preventing Rank Collapse in Federated Low-Rank Adaptation with Client Heterogeneity

arXiv:2602.13486v1 Announce Type: new Abstract: Federated low-rank adaptation (FedLoRA) has facilitated communication-efficient and privacy-preserving fine-tuning of foundation models for downstream tasks. In practical federated learning scenarios, client heterogeneity in system resources and data distributions motivates heterogeneous LoRA ranks across clients.…

February 17, 2026

Calibrated Predictive Lower Bounds on Time-to-Unsafe-Sampling in LLMs

arXiv:2506.13593v5 Announce Type: replace Abstract: We introduce time-to-unsafe-sampling, a novel safety measure for generative models, defined as the number of generations required by a large language model (LLM) to trigger an unsafe (e.g., toxic) response. While providing a new dimension…

February 17, 2026

TrasMuon: Trust-Region Adaptive Scaling for Orthogonalized Momentum Optimizers

arXiv:2602.13498v1 Announce Type: new Abstract: Muon-style optimizers leverage Newton-Schulz (NS) iterations to orthogonalize updates, yielding update geometries that often outperform Adam-series methods. However, this orthogonalization discards magnitude information, rendering training sensitive to step-size hyperparameters and vulnerable to high-energy bursts. To…

February 17, 2026

Discrete State Diffusion Models: A Sample Complexity Perspective

arXiv:2510.10854v2 Announce Type: replace Abstract: Diffusion models have demonstrated remarkable performance in generating high-dimensional samples across domains such as vision, language, and the sciences. Although continuous-state diffusion models have been extensively studied both empirically and theoretically, discrete-state diffusion models, essential…

February 17, 2026

$\gamma$-weakly $\theta$-up-concavity: Linearizable Non-Convex Optimization with Applications to DR-Submodular and OSS Functions

arXiv:2602.13506v1 Announce Type: new Abstract: Optimizing monotone non-convex functions is a fundamental challenge across machine learning and combinatorial optimization. We introduce and study $\gamma$-weakly $\theta$-up-concavity, a novel first-order condition that characterizes a broad class of such functions. This condition provides…

February 17, 2026

NeuroPareto: Calibrated Acquisition for Costly Many-Goal Search in Vast Parameter Spaces

arXiv:2602.03901v2 Announce Type: replace Abstract: The pursuit of optimal trade-offs in high-dimensional search spaces under stringent computational constraints poses a fundamental challenge for contemporary multi-objective optimization. We develop NeuroPareto, a cohesive architecture that integrates rank-centric filtering, uncertainty disentanglement, and history-conditioned…

February 17, 2026

Singular Vectors of Attention Heads Align with Features

arXiv:2602.13524v1 Announce Type: new Abstract: Identifying feature representations in language models is a central task in mechanistic interpretability. Several recent studies have made an implicit assumption that feature representations can be inferred in some cases from singular vectors of attention…

February 17, 2026

Guaranteed Nonconvex Low-Rank Tensor Estimation via Scaled Gradient Descent

arXiv:2501.01696v2 Announce Type: replace-cross Abstract: Tensors, which give a faithful and effective representation to deliver the intrinsic structure of multi-dimensional data, play a crucial role in an increasing number of signal processing and machine learning problems. However, tensor data are…

February 17, 2026