Archives AI News

The Speed-up Factor: A Quantitative Multi-Iteration Active Learning Performance Metric

arXiv:2602.13359v1 Announce Type: new Abstract: Machine learning models excel with abundant annotated data, but annotation is often costly and time-intensive. Active learning (AL) aims to improve the performance-to-annotation ratio by using query methods (QMs) to iteratively select the most informative…

February 17, 2026

Exploring the Performance of ML/DL Architectures on the MNIST-1D Dataset

arXiv:2602.13348v1 Announce Type: new Abstract: Small datasets like MNIST have historically been instrumental in advancing machine learning research by providing a controlled environment for rapid experimentation and model evaluation. However, their simplicity often limits their utility for distinguishing between advanced…

February 17, 2026

ShapBPT: Image Feature Attributions Using Data-Aware Binary Partition Trees

arXiv:2602.07047v2 Announce Type: replace-cross Abstract: Pixel-level feature attributions are an important tool in eXplainable AI for Computer Vision (XCV), providing visual insights into how image features influence model predictions. The Owen formula for hierarchical Shapley values has been widely used…

February 17, 2026

Finding Highly Interpretable Prompt-Specific Circuits in Language Models

arXiv:2602.13483v1 Announce Type: new Abstract: Understanding the internal circuits that language models use to solve tasks remains a central challenge in mechanistic interpretability. Most prior work identifies circuits at the task level by averaging across many prompts, implicitly assuming a…

February 17, 2026

Solving Inverse Parametrized Problems via Finite Elements and Extreme Learning Networks

arXiv:2602.14757v1 Announce Type: cross Abstract: We develop an interpolation-based reduced-order modeling framework for parameter-dependent partial differential equations arising in control, inverse problems, and uncertainty quantification. The solution is discretized in the physical domain using finite element methods, while the dependence…

February 17, 2026

Federated Learning of Nonlinear Temporal Dynamics with Graph Attention-based Cross-Client Interpretability

arXiv:2602.13485v1 Announce Type: new Abstract: Networks of modern industrial systems are increasingly monitored by distributed sensors, where each system comprises multiple subsystems generating high dimensional time series data. These subsystems are often interdependent, making it important to understand how temporal…

February 17, 2026

Online Posterior Sampling with a Diffusion Prior

arXiv:2410.03919v2 Announce Type: replace Abstract: Posterior sampling in contextual bandits with a Gaussian prior can be implemented exactly or approximately using the Laplace approximation. The Gaussian prior is computationally efficient but it cannot describe complex distributions. In this work, we…

February 17, 2026
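
For readers unfamiliar with the exact Gaussian case mentioned in the abstract above, here is a minimal sketch of posterior sampling (Thompson sampling) under a Gaussian prior. It assumes a linear reward model with Gaussian noise, which the truncated abstract does not spell out. The function names `gaussian_posterior` and `thompson_choose` are hypothetical, and nothing here reproduces the paper's diffusion-prior method.

```python
import numpy as np

def gaussian_posterior(X, y, prior_mean, prior_cov, noise_var):
    """Exact posterior over theta for y = X @ theta + noise (assumed linear-Gaussian model)."""
    prior_prec = np.linalg.inv(prior_cov)
    # Precisions add: the prior's plus the data's.
    post_prec = prior_prec + X.T @ X / noise_var
    post_cov = np.linalg.inv(post_prec)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
    return post_mean, post_cov

def thompson_choose(arms, post_mean, post_cov, rng):
    """Sample theta from the posterior and pick the arm with the highest sampled reward."""
    theta = rng.multivariate_normal(post_mean, post_cov)
    return int(np.argmax(arms @ theta))

# Toy data (illustrative only): two observed contexts and their rewards.
X = np.eye(2)
y = np.array([1.0, 2.0])
post_mean, post_cov = gaussian_posterior(X, y, np.zeros(2), np.eye(2), noise_var=1.0)

rng = np.random.default_rng(0)
choice = thompson_choose(np.eye(2), post_mean, post_cov, rng)
```

The closed-form update is what makes the Gaussian prior cheap to use. The abstract's point is that such a prior cannot describe complex distributions.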

Preventing Rank Collapse in Federated Low-Rank Adaptation with Client Heterogeneity

arXiv:2602.13486v1 Announce Type: new Abstract: Federated low-rank adaptation (FedLoRA) has facilitated communication-efficient and privacy-preserving fine-tuning of foundation models for downstream tasks. In practical federated learning scenarios, client heterogeneity in system resources and data distributions motivates heterogeneous LoRA ranks across clients.…

February 17, 2026

Calibrated Predictive Lower Bounds on Time-to-Unsafe-Sampling in LLMs

arXiv:2506.13593v5 Announce Type: replace Abstract: We introduce time-to-unsafe-sampling, a novel safety measure for generative models, defined as the number of generations required by a large language model (LLM) to trigger an unsafe (e.g., toxic) response. While providing a new dimension…

February 17, 2026
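
The definition in the abstract above (the number of generations needed to trigger an unsafe response) can be illustrated with a short sketch. The function name `time_to_unsafe` and the list of boolean flags are hypothetical. In practice the flags would come from some safety classifier, which is assumed here rather than taken from the paper.

```python
from typing import Iterable, Optional

def time_to_unsafe(flags: Iterable[bool]) -> Optional[int]:
    """Return the 1-based index of the first unsafe generation.

    Returns None when no unsafe generation is observed within the
    sampling budget (a censored observation).
    """
    for i, unsafe in enumerate(flags, start=1):
        if unsafe:
            return i
    return None

# Hypothetical flags for four consecutive generations from one prompt.
observed = time_to_unsafe([False, False, True, False])
censored = time_to_unsafe([False, False])
```

If one further assumes independent generations with a fixed unsafe probability p per generation, this quantity is geometrically distributed with mean 1/p. That independence assumption is made here only for illustration.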

TrasMuon: Trust-Region Adaptive Scaling for Orthogonalized Momentum Optimizers

arXiv:2602.13498v1 Announce Type: new Abstract: Muon-style optimizers leverage Newton-Schulz (NS) iterations to orthogonalize updates, yielding update geometries that often outperform Adam-series methods. However, this orthogonalization discards magnitude information, rendering training sensitive to step-size hyperparameters and vulnerable to high-energy bursts. To…

February 17, 2026
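
The Newton-Schulz orthogonalization mentioned in the TrasMuon abstract above can be sketched as follows. This uses the classical cubic Newton-Schulz iteration as a stand-in. Muon-style optimizers may use a different polynomial and coefficients, and this is not the TrasMuon method itself. The function name `newton_schulz_orthogonalize` and the example matrix are hypothetical.

```python
import numpy as np

def newton_schulz_orthogonalize(G: np.ndarray, steps: int = 30) -> np.ndarray:
    """Approximate the orthogonal polar factor of a full-rank matrix G."""
    # Divide by the Frobenius norm so every singular value is at most 1,
    # which lies inside the convergence region of the cubic iteration.
    X = G / np.linalg.norm(G)
    for _ in range(steps):
        # Each singular value s is mapped to 1.5*s - 0.5*s**3, pushing it toward 1.
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X

# Illustrative "update" matrix (not from the paper).
G = np.array([[3.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 4.0],
              [1.0, 0.0, 1.0]])
O = newton_schulz_orthogonalize(G)

# Rescaling the input leaves the output unchanged: magnitude is discarded.
O_scaled = newton_schulz_orthogonalize(10.0 * G)
```

Because the output has all singular values near 1 regardless of the input's scale, the step size alone sets the update magnitude, which is consistent with the sensitivity to step-size hyperparameters that the abstract describes.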