Archives AI News

Robust Multi-Objective Controlled Decoding of Large Language Models

arXiv:2503.08796v2 Announce Type: replace Abstract: We introduce Robust Multi-Objective Decoding (RMOD), a novel inference-time algorithm that robustly aligns Large Language Models (LLMs) to multiple human objectives (e.g., instruction-following, helpfulness, safety) by maximizing the worst-case rewards. RMOD formulates the robust decoding…

February 17, 2026

Text Has Curvature

arXiv:2602.13418v1 Announce Type: new Abstract: Does text have an intrinsic curvature? Language is increasingly modeled in curved geometries – hyperbolic spaces for hierarchy, mixed-curvature manifolds for compositional structure – yet a basic scientific question remains unresolved: what does curvature mean…

February 17, 2026

Comparing Classifiers: A Case Study Using PyCM

arXiv:2602.13482v1 Announce Type: new Abstract: Selecting an optimal classification model requires a robust and comprehensive understanding of the performance of the model. This paper provides a tutorial on the PyCM library, demonstrating its utility in conducting deep-dive evaluations of multi-class…

February 17, 2026

Why is Normalization Preferred? A Worst-Case Complexity Theory for Stochastically Preconditioned SGD under Heavy-Tailed Noise

arXiv:2602.13413v1 Announce Type: new Abstract: We develop a worst-case complexity theory for stochastically preconditioned stochastic gradient descent (SPSGD) and its accelerated variants under heavy-tailed noise, a setting that encompasses widely used adaptive methods such as Adam, RMSProp, and Shampoo. We…

February 17, 2026

High-Resolution Climate Projections Using Diffusion-Based Downscaling of a Lightweight Climate Emulator

arXiv:2602.13416v1 Announce Type: new Abstract: The proliferation of data-driven models in weather and climate sciences has marked a significant paradigm shift, with advanced models demonstrating exceptional skill in medium-range forecasting. However, these models are often limited by long-term instabilities, climatological…

February 17, 2026

Accelerated Discovery of Cryoprotectant Cocktails via Multi-Objective Bayesian Optimization

arXiv:2602.13398v1 Announce Type: new Abstract: Designing cryoprotectant agent (CPA) cocktails for vitrification is challenging because formulations must be concentrated enough to suppress ice formation yet non-toxic enough to preserve cell viability. This tradeoff creates a large, multi-objective design space in…

February 17, 2026

The Speed-up Factor: A Quantitative Multi-Iteration Active Learning Performance Metric

arXiv:2602.13359v1 Announce Type: new Abstract: Machine learning models excel with abundant annotated data, but annotation is often costly and time-intensive. Active learning (AL) aims to improve the performance-to-annotation ratio by using query methods (QMs) to iteratively select the most informative…

February 17, 2026

Exploring the Performance of ML/DL Architectures on the MNIST-1D Dataset

arXiv:2602.13348v1 Announce Type: new Abstract: Small datasets like MNIST have historically been instrumental in advancing machine learning research by providing a controlled environment for rapid experimentation and model evaluation. However, their simplicity often limits their utility for distinguishing between advanced…

February 17, 2026

ShapBPT: Image Feature Attributions Using Data-Aware Binary Partition Trees

arXiv:2602.07047v2 Announce Type: replace-cross Abstract: Pixel-level feature attributions are an important tool in eXplainable AI for Computer Vision (XCV), providing visual insights into how image features influence model predictions. The Owen formula for hierarchical Shapley values has been widely used…

February 17, 2026

Finding Highly Interpretable Prompt-Specific Circuits in Language Models

arXiv:2602.13483v1 Announce Type: new Abstract: Understanding the internal circuits that language models use to solve tasks remains a central challenge in mechanistic interpretability. Most prior work identifies circuits at the task level by averaging across many prompts, implicitly assuming a…

February 17, 2026