Archives AI News

Joint Denoising of Cryo-EM Projection Images using Polar Transformers

arXiv:2506.11283v2 (announce type: replace-cross). Abstract: Many imaging modalities involve reconstruction of unknown objects from collections of noisy projections related by random rotations. In one of these modalities, cryogenic electron microscopy (cryo-EM), the extremely low signal-to-noise ratio (SNR) makes integration of…

A Connection Between Score Matching and Local Intrinsic Dimension

arXiv:2510.12975v1 (announce type: new). Abstract: The local intrinsic dimension (LID) of data is a fundamental quantity in signal processing and learning theory, but quantifying the LID of high-dimensional, complex data has been a historically challenging task. Recent works have discovered…

Benchmarking Hindi LLMs: A New Suite of Datasets and a Comparative Analysis

arXiv:2508.19831v2 (announce type: replace-cross). Abstract: Evaluating instruction-tuned Large Language Models (LLMs) in Hindi is challenging due to a lack of high-quality benchmarks, as direct translation of English datasets fails to capture crucial linguistic and cultural nuances. To address this, we…

Reference-Specific Unlearning Metrics Can Hide the Truth: A Reality Check

arXiv:2510.12981v1 (announce type: new). Abstract: Current unlearning metrics for generative models evaluate success based on reference responses or classifier outputs rather than assessing the core objective: whether the unlearned model behaves indistinguishably from a model that never saw the unwanted…

Hard2Verify: A Step-Level Verification Benchmark for Open-Ended Frontier Math

arXiv:2510.13744v1 (announce type: cross). Abstract: Large language model (LLM)-based reasoning systems have recently achieved gold medal-level performance in the IMO 2025 competition, writing mathematical proofs where, to receive full credit, each step must be not only correct but also sufficiently…

Can DPO Learn Diverse Human Values? A Theoretical Scaling Law

arXiv:2408.03459v5 (announce type: replace). Abstract: Large language models (LLMs) have demonstrated remarkable capabilities but often struggle to align with human preferences, leading to harmful or undesirable outputs. Preference learning, which trains models to distinguish between preferred and non-preferred responses based…

Max It or Miss It: Benchmarking LLM On Solving Extremal Problems

arXiv:2510.12997v1 (announce type: new). Abstract: Test-time scaling has enabled Large Language Models (LLMs) with remarkable reasoning capabilities, particularly in mathematical domains, through intermediate chain-of-thought (CoT) reasoning before generating final answers. However, the specific sources and mechanisms underlying these reasoning capabilities…

AMORE: Adaptive Multi-Output Operator Network for Stiff Chemical Kinetics

arXiv:2510.12999v1 (announce type: new). Abstract: Time integration of stiff systems is a primary source of computational cost in combustion, hypersonics, and other reactive transport systems. This stiffness can introduce time scales significantly smaller than those associated with other physical processes,…