Archives: AI News

Sequential KV Cache Compression via Probabilistic Language Tries: Beyond the Per-Vector Shannon Limit

arXiv:2604.15356v1 Announce Type: new Abstract: Recent work on KV cache quantization, culminating in TurboQuant, has approached the Shannon entropy limit for per-vector compression of transformer key-value caches. We observe that this limit applies to a strictly weaker problem than the…

April 20, 2026

When Missing Becomes Structure: Intent-Preserving Policy Completion from Financial KOL Discourse

arXiv:2604.14333v2 Announce Type: replace Abstract: Key Opinion Leader (KOL) discourse on social media is widely consumed as investment guidance, yet turning it into executable trading strategies without injecting assumptions about unspecified execution decisions remains an open problem. We observe that…

April 20, 2026

Constant-Factor Approximations for Doubly Constrained Fair k-Center, k-Median and k-Means

arXiv:2604.16061v1 Announce Type: cross Abstract: We study discrete k-clustering problems in general metric spaces that are constrained by a combination of two different fairness conditions within the demographic fairness model. Given a metric space (P,d), where every point in P…

April 20, 2026

What Makes LLMs Effective Sequential Recommenders? A Study on Preference Intensity and Temporal Context

arXiv:2506.02261v3 Announce Type: replace-cross Abstract: What enables large language models (LLMs) to effectively model user preferences in sequential recommendation? Our investigation reveals that existing preference-alignment approaches largely rely on binary pairwise comparisons, overlooking two critical factors: preference intensity (the structured…

April 20, 2026

Dispatch-Aware Ragged Attention for Pruned Vision Transformers

arXiv:2604.15408v1 Announce Type: new Abstract: Token pruning methods for Vision Transformers (ViTs) promise quadratic reductions in attention FLOPs by dropping uninformative patches. Yet when pruned sequences are executed with state-of-the-art variable-length attention APIs — including FlashAttention-2’s varlen and PyTorch’s NestedTensor…

April 20, 2026

Revisiting Entropy Regularization: Adaptive Coefficient Unlocks Its Potential for LLM Reinforcement Learning

arXiv:2510.10959v3 Announce Type: replace Abstract: Reasoning ability has become a defining capability of Large Language Models (LLMs), with Reinforcement Learning with Verifiable Rewards (RLVR) emerging as a key paradigm to enhance it. However, RLVR training often suffers from policy entropy…

April 20, 2026

Jailbreak Scaling Laws for Large Language Models: Polynomial-Exponential Crossover

arXiv:2603.11331v2 Announce Type: replace Abstract: Adversarial attacks can reliably steer safety-aligned large language models toward unsafe behavior. Empirically, we find that strong adversarial prompt-injection attacks can amplify attack success rate from the slow polynomial growth observed without injection to exponential…

April 20, 2026

Exploitation Over Exploration: Unmasking the Bias in Linear Bandit Recommender Offline Evaluation

arXiv:2507.18756v2 Announce Type: replace Abstract: Multi-Armed Bandit (MAB) algorithms are widely used in recommender systems that require continuous, incremental learning. A core aspect of MABs is the exploration-exploitation trade-off: choosing between exploiting items likely to be enjoyed and exploring new…

April 20, 2026

PRL-Bench: A Comprehensive Benchmark Evaluating LLMs’ Capabilities in Frontier Physics Research

arXiv:2604.15411v1 Announce Type: new Abstract: The paradigm of agentic science requires AI systems to conduct robust reasoning and engage in long-horizon, autonomous exploration. However, current scientific benchmarks remain confined to domain knowledge comprehension and complex reasoning, failing to evaluate the…

April 20, 2026

Teaching Language Models Mechanistic Explainability Through MechSMILES

arXiv:2512.05722v2 Announce Type: replace Abstract: Chemical reaction mechanisms are the foundation of how chemists evaluate reactivity and feasibility, yet current Computer-Assisted Synthesis Planning (CASP) systems operate without this mechanistic reasoning. We introduce a computational framework that teaches language models to…

April 20, 2026