Archives AI News

Maximum Entropy Semi-Supervised Inverse Reinforcement Learning

arXiv:2604.20074v1 Announce Type: new Abstract: A popular approach to apprenticeship learning (AL) is to formulate it as an inverse reinforcement learning (IRL) problem. The MaxEnt-IRL algorithm successfully integrates the maximum entropy principle into IRL and unlike its predecessors, it resolves…

Statistics, Not Scale: Modular Medical Dialogue with Bayesian Belief Engine

arXiv:2604.20022v1 Announce Type: new Abstract: Large language models are increasingly deployed as autonomous diagnostic agents, yet they conflate two fundamentally different capabilities: natural-language communication and probabilistic reasoning. We argue that this conflation is an architectural flaw, not an engineering shortcoming.…

Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing

arXiv:2604.20704v1 Announce Type: cross Abstract: Adversarial robustness evaluation underpins every claim of trustworthy ML deployment, yet the field suffers from fragmented protocols and undetected gradient masking. We make two contributions. (1) Structured synthesis. We analyze nine peer-reviewed corpus sources (2020–2026)…

Analysis of Nyström method with sequential ridge leverage scores

arXiv:2604.20077v1 Announce Type: new Abstract: Large-scale kernel ridge regression (KRR) is limited by the need to store a large kernel matrix K_t. To avoid storing the entire matrix K_t, Nyström methods subsample a subset of columns of the kernel matrix,…
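To illustrate the column-subsampling idea behind Nyström methods, here is a minimal NumPy sketch. It uses plain uniform sampling of landmark columns (an assumption for brevity; the paper's contribution is selecting columns via sequential ridge leverage scores), and approximates the full kernel matrix as K ≈ K_nm · pinv(K_mm) · K_nmᵀ so the full n×n matrix never needs to be stored in a streaming setting:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.1):
    # Gaussian (RBF) kernel matrix between the rows of X and Y.
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def nystrom_approx(X, m, gamma=0.1, seed=0):
    # Uniformly subsample m landmark points, then build the Nystrom
    # approximation K ~ K_nm @ pinv(K_mm) @ K_nm.T from the n x m slab
    # of kernel columns, avoiding the full n x n matrix.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    K_nm = rbf_kernel(X, X[idx], gamma)   # n x m slab of sampled columns
    K_mm = K_nm[idx]                      # m x m landmark-landmark block
    return K_nm @ np.linalg.pinv(K_mm) @ K_nm.T

X = np.random.default_rng(1).normal(size=(200, 5))
K = rbf_kernel(X, X)
K_hat = nystrom_approx(X, m=50)
err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

Since K − K_hat is a Schur complement of a PSD matrix, the approximation error is controlled by how well the sampled columns cover the kernel's dominant subspace, which is exactly what leverage-score sampling targets.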

Replicable Bandits with UCB based Exploration

arXiv:2604.20024v1 Announce Type: new Abstract: We study replicable algorithms for stochastic multi-armed bandits (MAB) and linear bandits with UCB (Upper Confidence Bound) based exploration. A bandit algorithm is $\rho$-replicable if two executions using shared internal randomness but independent reward realizations,…
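For readers unfamiliar with the exploration rule being made replicable, a minimal sketch of vanilla UCB1 on Bernoulli arms (this is the standard baseline, not the paper's replicable variant): pull each arm once, then always pull the arm maximizing empirical mean plus a confidence bonus.

```python
import numpy as np

def ucb1(means, horizon, seed=0):
    # Vanilla UCB1: after one pull per arm, choose the arm maximizing
    # empirical_mean + sqrt(2 ln t / n_pulls); returns pull counts.
    rng = np.random.default_rng(seed)
    k = len(means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: pull each arm once
        else:
            bonus = np.sqrt(2.0 * np.log(t) / counts)
            arm = int(np.argmax(sums / counts + bonus))
        counts[arm] += 1
        sums[arm] += rng.random() < means[arm]  # Bernoulli reward
    return counts

counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

Replicability is a statement about this selection rule: two runs sharing internal randomness but seeing independent reward draws should, with probability at least $1-\rho$, produce the same sequence of arm pulls.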

On the Quantization Robustness of Diffusion Language Models in Coding Benchmarks

arXiv:2604.20079v1 Announce Type: new Abstract: Auto-regressive Large Language Models (LLMs) achieve strong performance on coding tasks, but incur high memory and inference costs. Diffusion-based language models (d-LLMs) offer bounded inference cost via iterative denoising, but their behavior under post-training quantization…
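As background for what "post-training quantization" means here, a minimal sketch of symmetric per-tensor int8 weight quantization (a generic PTQ scheme, not the specific methods the paper benchmarks on d-LLMs): scale by the largest absolute weight, round to int8, then dequantize for use in matmuls.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor post-training quantization: map the float
    # range [-max|w|, max|w|] onto int8 values in [-127, 127].
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximate float tensor from the int8 codes.
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(64, 64)).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
max_err = np.abs(w - w_hat).max()
```

Round-to-nearest bounds the per-weight error by half the scale step; the open question the abstract raises is how the iterative denoising loop of d-LLMs compounds or absorbs such errors compared with auto-regressive decoding.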