LATMiX: Learnable Affine Transformations for Microscaling Quantization of LLMs

arXiv:2602.17681v1 Announce Type: new Abstract: Post-training quantization (PTQ) is a widely used approach for reducing the memory and compute costs of large language models (LLMs). Recent studies have shown that applying invertible transformations to activations can significantly improve quantization robustness…

February 23, 2026

BioBridge: Bridging Proteins and Language for Enhanced Biological Reasoning with LLMs

arXiv:2602.17680v1 Announce Type: new Abstract: Existing Protein Language Models (PLMs) often suffer from limited adaptability to multiple tasks and exhibit poor generalization across diverse biological contexts. In contrast, general-purpose Large Language Models (LLMs) lack the capability to interpret protein sequences…

February 23, 2026

Uncertainty-Aware Vision-Language Segmentation for Medical Imaging

arXiv:2602.14498v2 Announce Type: replace-cross Abstract: We introduce a novel uncertainty-aware multimodal segmentation framework that leverages both radiological images and associated clinical text for precise medical diagnosis. We propose a Modality Decoding Attention Block (MoDAB) with a lightweight State Space Mixer…

February 23, 2026

Curriculum Learning for Efficient Chain-of-Thought Distillation via Structure-Aware Masking and GRPO

arXiv:2602.17686v1 Announce Type: new Abstract: Distilling Chain-of-Thought (CoT) reasoning from large language models into compact student models presents a fundamental challenge: teacher rationales are often too verbose for smaller models to faithfully reproduce. Existing approaches either compress reasoning into single-step,…

February 23, 2026

CityGuard: Graph-Aware Private Descriptors for Bias-Resilient Identity Search Across Urban Cameras

arXiv:2602.18047v1 Announce Type: cross Abstract: City-scale person re-identification across distributed cameras must handle severe appearance changes from viewpoint, occlusion, and domain shift while complying with data protection rules that prevent sharing raw imagery. We introduce CityGuard, a topology-aware transformer for…

February 23, 2026

AnCoder: Anchored Code Generation via Discrete Diffusion Models

arXiv:2602.17688v1 Announce Type: new Abstract: Diffusion language models offer a compelling alternative to autoregressive code generation, enabling global planning and iterative refinement of complex program logic. However, existing approaches fail to respect the rigid structure of programming languages and, as…

February 23, 2026

Clapeyron Neural Networks for Single-Species Vapor-Liquid Equilibria

arXiv:2602.18313v1 Announce Type: cross Abstract: Machine learning (ML) approaches have shown promising results for predicting molecular properties relevant for chemical process design. However, they are often limited by scarce experimental property data and lack thermodynamic consistency. As such, thermodynamics-informed ML,…

February 23, 2026

Robust Pre-Training of Medical Vision-and-Language Models with Domain-Invariant Multi-Modal Masked Reconstruction

arXiv:2602.17689v1 Announce Type: new Abstract: Medical vision-language models show strong potential for joint reasoning over medical images and clinical text, but their performance often degrades under domain shift caused by variations in imaging devices, acquisition protocols, and reporting styles. Existing…

February 23, 2026

Lean Formalization of Generalization Error Bound by Rademacher Complexity and Dudley’s Entropy Integral

arXiv:2503.19605v4 Announce Type: replace Abstract: Understanding and certifying the generalization performance of machine learning algorithms — i.e. obtaining theoretical estimates of the test error from a finite training sample — is a central theme of statistical learning theory. Among the…

February 23, 2026

Tethered Reasoning: Decoupling Entropy from Hallucination in Quantized LLMs via Manifold Steering

arXiv:2602.17691v1 Announce Type: new Abstract: Quantized language models face a fundamental dilemma: low sampling temperatures yield repetitive, mode-collapsed outputs, while high temperatures (T > 2.0) cause trajectory divergence and semantic incoherence. We present HELIX, a geometric framework that decouples output…

February 23, 2026