Archives AI News

Robust AI Evaluation through Maximal Lotteries

arXiv:2602.21297v1 Announce Type: new Abstract: The standard way to evaluate language models on subjective tasks is through pairwise comparisons: an annotator chooses the “better” of two responses to a prompt. Leaderboards aggregate these comparisons into a single Bradley-Terry (BT) ranking,…

February 26, 2026

Group Orthogonalized Policy Optimization:Group Policy Optimization as Orthogonal Projection in Hilbert Space

arXiv:2602.21269v1 Announce Type: new Abstract: We present Group Orthogonalized Policy Optimization (GOPO), a new alignment algorithm for large language models derived from the geometry of Hilbert function spaces. Instead of optimizing on the probability simplex and inheriting the exponential curvature…

February 26, 2026

AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression

arXiv:2602.21233v1 Announce Type: new Abstract: This technical report introduces AngelSlim, a comprehensive and versatile toolkit for large model compression developed by the Tencent Hunyuan team. By consolidating cutting-edge algorithms, including quantization, speculative decoding, token pruning, and distillation. AngelSlim provides a…

February 26, 2026

Urban Vibrancy Embedding and Application on Traffic Prediction

arXiv:2602.21232v1 Announce Type: new Abstract: Urban vibrancy reflects the dynamic human activity within urban spaces and is often measured using mobile data that captures floating population trends. This study proposes a novel approach to derive Urban Vibrancy embeddings from real-time…

February 26, 2026

Heuristic Adaptation of Potentially Misspecified Domain Support for Likelihood-Free Inference in Stochastic Dynamical Systems

arXiv:2510.26656v3 Announce Type: replace-cross Abstract: In robotics, likelihood-free inference (LFI) can provide the domain distribution that adapts a learnt agent in a parametric set of deployment conditions. LFI assumes an arbitrary support for sampling, which remains constant as the initial…

February 26, 2026

Uncertainty-Aware Diffusion Model for Multimodal Highway Trajectory Prediction via DDIM Sampling

arXiv:2602.21319v1 Announce Type: new Abstract: Accurate and uncertainty-aware trajectory prediction remains a core challenge for autonomous driving, driven by complex multi-agent interactions, diverse scene contexts and the inherently stochastic nature of future motion. Diffusion-based generative models have recently shown strong…

February 26, 2026

Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks

arXiv:2602.20156v3 Announce Type: replace-cross Abstract: LLM agents are evolving rapidly, powered by code execution, tools, and the recently introduced agent skills feature. Skills allow users to extend LLM applications with specialized third-party code, knowledge, and instructions. Although this can extend…

February 26, 2026

Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data

arXiv:2602.21320v1 Announce Type: new Abstract: Large language models (LLMs) are becoming the foundation for autonomous agents that can use tools to solve complex tasks. Reinforcement learning (RL) has emerged as a common approach for injecting such agentic capabilities, but typically…

February 26, 2026

Parallel Split Learning with Global Sampling

arXiv:2407.15738v5 Announce Type: replace Abstract: Distributed deep learning in resource-constrained environments faces scalability and generalization challenges due to large effective batch sizes and non-identically distributed client data. We introduce a server-driven sampling strategy that maintains a fixed global batch size…

February 26, 2026

Dynamic Symmetric Point Tracking: Tackling Non-ideal Reference in Analog In-memory Training

arXiv:2602.21321v1 Announce Type: new Abstract: Analog in-memory computing (AIMC) performs computation directly within resistive crossbar arrays, offering an energy-efficient platform to scale large vision and language models. However, non-ideal analog device properties make the training on AIMC devices challenging. In…

February 26, 2026