Archives AI News

$k$-server-bench: Automating Potential Discovery for the $k$-Server Conjecture

arXiv:2604.07240v1 Announce Type: cross Abstract: We introduce a code-based challenge for automated, open-ended mathematical discovery based on the $k$-server conjecture, a central open problem in competitive analysis. The task is to discover a potential function satisfying a large graph-structured system…

April 10, 2026

Limits of Difficulty Scaling: Hard Samples Yield Diminishing Returns in GRPO-Tuned SLMs

arXiv:2604.06298v1 Announce Type: new Abstract: Recent alignment work on Large Language Models (LLMs) suggests preference optimization can improve reasoning by shifting probability mass toward better solutions. We test this claim in a resource-constrained setting by applying GRPO with LoRA to…

April 10, 2026

Novel Interpretable and Robust Web-based AI Platform for Phishing Email Detection

arXiv:2405.11619v2 Announce Type: replace Abstract: Phishing emails continue to pose a significant threat, causing financial losses and security breaches. This study addresses limitations in existing research, such as reliance on proprietary datasets and lack of real-world application, by proposing a…

April 10, 2026

Drifting Fields are not Conservative

arXiv:2604.06333v1 Announce Type: new Abstract: Drifting models generate high-quality samples in a single forward pass by transporting generated samples toward the data distribution using a vector-valued drift field. We investigate whether this procedure is equivalent to optimizing a scalar…

April 10, 2026

Entropy After </Think> for reasoning model early exiting

arXiv:2509.26522v3 Announce Type: replace Abstract: Reasoning LLMs show improved performance with longer chains of thought. However, recent work has highlighted their tendency to overthink, continuing to revise answers even after reaching the correct solution. We quantitatively confirm this inefficiency from…

April 10, 2026

BiScale-GTR: Fragment-Aware Graph Transformers for Multi-Scale Molecular Representation Learning

arXiv:2604.06336v1 Announce Type: new Abstract: Graph Transformers have recently attracted attention for molecular property prediction by combining the inductive biases of graph neural networks (GNNs) with the global receptive field of Transformers. However, many existing hybrid architectures remain GNN-dominated, causing…

April 10, 2026

Low-Rank Key Value Attention

arXiv:2601.11471v3 Announce Type: replace Abstract: The key-value (KV) cache is a primary memory bottleneck in Transformers. We propose Low-Rank Key-Value (LRKV) attention, which reduces KV cache memory by exploiting redundancy across attention heads, while being compute efficient. Each layer uses…

April 10, 2026

Bi-Level Optimization for Single Domain Generalization

arXiv:2604.06349v1 Announce Type: new Abstract: Generalizing from a single labeled source domain to unseen target domains, without access to any target data during training, remains a fundamental challenge in robust machine learning. We address this underexplored setting, known as Single…

April 10, 2026

Algebraic Diversity: Group-Theoretic Spectral Estimation from Single Observations

arXiv:2604.03634v2 Announce Type: replace Abstract: We establish that temporal averaging over multiple observations is the degenerate case of algebraic group action with the trivial group $G=\{e\}$. A General Replacement Theorem proves that a group-averaged estimator from one snapshot achieves equivalent…

April 10, 2026

Stochastic Gradient Descent in the Saddle-to-Saddle Regime of Deep Linear Networks

arXiv:2604.06366v1 Announce Type: new Abstract: Deep linear networks (DLNs) are used as an analytically tractable model of the training dynamics of deep neural networks. While gradient descent in DLNs is known to exhibit saddle-to-saddle dynamics, the impact of stochastic gradient…

April 10, 2026