Archives AI News

Can DPO Learn Diverse Human Values? A Theoretical Scaling Law

arXiv:2408.03459v5 Announce Type: replace Abstract: Large language models (LLMs) have demonstrated remarkable capabilities but often struggle to align with human preferences, leading to harmful or undesirable outputs. Preference learning, which trains models to distinguish between preferred and non-preferred responses based…
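
To make the preference-learning objective alluded to above concrete, here is a minimal sketch of the standard DPO loss (Rafailov et al., 2023); the paper's own formulation and scaling analysis may differ, and the toy log-probabilities below are made up for illustration.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is a tensor of summed token log-probabilities for the
    preferred ("chosen") or non-preferred ("rejected") response, under
    the trainable policy or the frozen reference model.
    """
    # Implicit reward margins: how much more the policy prefers each
    # response relative to the reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the log-sigmoid of the chosen-vs-rejected reward margin.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with fabricated log-probabilities for two preference pairs.
lp = torch.tensor([-12.3, -9.8])
loss = dpo_loss(lp, lp - 1.5, lp + 0.2, lp - 1.0)
```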

Max It or Miss It: Benchmarking LLM On Solving Extremal Problems

arXiv:2510.12997v1 Announce Type: new Abstract: Test-time scaling has endowed Large Language Models (LLMs) with remarkable reasoning capabilities, particularly in mathematical domains, through intermediate chain-of-thought (CoT) reasoning before generating final answers. However, the specific sources and mechanisms underlying these reasoning capabilities…
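
As a rough illustration of CoT-based test-time scaling, here is a self-consistency sketch: sample several reasoning traces and majority-vote on the final answer. The `generate` callable and the "Answer:" output convention are assumptions for this example, not details from the paper.

```python
from collections import Counter

def self_consistency_answer(generate, prompt, n_samples=16):
    """Test-time scaling via self-consistency over CoT samples.

    `generate(prompt)` is a hypothetical stochastic sampler returning
    the model's full chain-of-thought output, assumed to end in a line
    like "Answer: 42".
    """
    answers = []
    for _ in range(n_samples):
        trace = generate(prompt)  # one sampled reasoning trace
        # Keep only the final answer; intermediate reasoning is
        # discarded for voting purposes.
        for line in reversed(trace.splitlines()):
            if line.startswith("Answer:"):
                answers.append(line.removeprefix("Answer:").strip())
                break
    most_common, _ = Counter(answers).most_common(1)[0]
    return most_common
```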

AMORE: Adaptive Multi-Output Operator Network for Stiff Chemical Kinetics

arXiv:2510.12999v1 Announce Type: new Abstract: Time integration of stiff systems is a primary source of computational cost in combustion, hypersonics, and other reactive transport systems. This stiffness can introduce time scales significantly smaller than those associated with other physical processes,…
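
For a sense of why stiff time integration is costly, here is a minimal sketch using Robertson's problem, a classic stiff chemical-kinetics benchmark, solved with an off-the-shelf implicit integrator; this illustrates the problem setting, not the paper's operator-network method.

```python
import numpy as np
from scipy.integrate import solve_ivp

def robertson(t, y):
    """Robertson's problem: three species whose rate constants span
    roughly nine orders of magnitude, a standard stiffness benchmark."""
    y1, y2, y3 = y
    return [-0.04 * y1 + 1e4 * y2 * y3,
            0.04 * y1 - 1e4 * y2 * y3 - 3e7 * y2**2,
            3e7 * y2**2]

# An implicit (BDF) integrator handles the fast time scales that would
# force an explicit method to take prohibitively small steps.
sol = solve_ivp(robertson, (0.0, 1e5), [1.0, 0.0, 0.0],
                method="BDF", rtol=1e-6, atol=1e-10)
print(sol.y[:, -1])  # species concentrations at t = 1e5
```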

A Brain-to-Population Graph Learning Framework for Diagnosing Brain Disorders

arXiv:2506.16096v2 Announce Type: replace Abstract: Recently developed graph-based methods for diagnosing brain disorders using functional connectivity rely heavily on predefined brain atlases but overlook the rich information embedded within atlases and the confounding effects of site and phenotype variability. To…
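
For readers unfamiliar with functional connectivity, here is a minimal sketch of how a connectivity graph is typically built from atlas-region time series via thresholded Pearson correlations; the input shape and threshold are illustrative assumptions, and the paper's pipeline may differ.

```python
import numpy as np

def functional_connectivity(timeseries, threshold=0.3):
    """Build a functional-connectivity graph from ROI time series.

    `timeseries` is a (T, R) array: T time points for R atlas regions
    (a hypothetical preprocessing output). Edges are Pearson
    correlations whose magnitude exceeds `threshold`.
    """
    corr = np.corrcoef(timeseries.T)          # (R, R) correlation matrix
    np.fill_diagonal(corr, 0.0)               # drop self-loops
    adjacency = np.where(np.abs(corr) >= threshold, corr, 0.0)
    return adjacency

# Toy usage: 200 time points over 90 atlas regions of random data.
adj = functional_connectivity(np.random.randn(200, 90))
```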

Escaping Local Optima in the Waddington Landscape: A Multi-Stage TRPO-PPO Approach for Single-Cell Perturbation Analysis

arXiv:2510.13018v1 Announce Type: new Abstract: Modeling cellular responses to genetic and chemical perturbations remains a central challenge in single-cell biology. Existing data-driven frameworks have advanced perturbation prediction through variational autoencoders, chemically conditioned autoencoders, and large-scale transformer pretraining. However, these models…
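
Since the title names a TRPO-PPO pipeline, here is a minimal sketch of the PPO clipped surrogate objective (Schulman et al., 2017) that such a stage would optimize; this is the generic loss, not necessarily the paper's multi-stage variant.

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective from PPO.

    `logp_new` / `logp_old` are per-action log-probabilities under the
    current and behavior policies; `advantages` are estimated returns
    minus a baseline. All are 1-D tensors of equal length.
    """
    ratio = torch.exp(logp_new - logp_old)     # importance weights
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # The pessimistic (min) bound keeps policy updates near the old
    # policy, approximating TRPO's trust region without a constraint.
    return -torch.min(unclipped, clipped).mean()
```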

How Well Can Preference Optimization Generalize Under Noisy Feedback?

arXiv:2510.01458v2 Announce Type: replace Abstract: As large language models (LLMs) advance their capabilities, aligning these models with human preferences has become crucial. Preference optimization, which trains models to distinguish between preferred and non-preferred responses based on human feedback, has become…
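
One common way to hedge a preference loss against label noise is to mix in the probability that a human label was flipped, in the spirit of conservative / label-smoothed DPO; this sketch is illustrative and not necessarily the paper's formulation, and the `flip_rate` value is an assumption.

```python
import torch
import torch.nn.functional as F

def noise_robust_pref_loss(margin, flip_rate=0.1):
    """Preference loss under an assumed label-flip rate.

    `margin` is the scaled implicit-reward margin
    beta * (r_chosen - r_rejected); `flip_rate` is the assumed
    probability that a recorded preference is reversed.
    """
    # With probability (1 - flip_rate) the recorded preference is
    # correct; with probability flip_rate it is flipped, so we also
    # credit the reversed margin.
    return -((1 - flip_rate) * F.logsigmoid(margin)
             + flip_rate * F.logsigmoid(-margin)).mean()

loss = noise_robust_pref_loss(torch.tensor([0.8, -0.3, 1.5]))
```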

Clustering with minimum spanning trees: How good can it be?

arXiv:2303.05679v4 Announce Type: replace-cross Abstract: Minimum spanning trees (MSTs) provide a convenient representation of datasets in numerous pattern recognition activities. Moreover, they are relatively fast to compute. In this paper, we quantify the extent to which they are meaningful in…
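
To make the MST-based clustering idea concrete, here is a classic single-linkage-style sketch: build the minimum spanning tree of the pairwise-distance graph, cut its heaviest edges, and read off the connected components as clusters. This is the textbook baseline, not necessarily any specific variant the paper evaluates.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components
from scipy.spatial.distance import pdist, squareform

def mst_clustering(points, n_clusters):
    """Cluster by cutting the n_clusters - 1 heaviest MST edges."""
    dist = squareform(pdist(points))             # dense distance matrix
    mst = minimum_spanning_tree(dist).toarray()  # n-1 nonzero edge weights
    # Zero out the heaviest edges so the tree splits into components.
    heaviest = np.argsort(mst, axis=None)[::-1][:n_clusters - 1]
    mst[np.unravel_index(heaviest, mst.shape)] = 0.0
    n_comp, labels = connected_components(mst, directed=False)
    return labels

# Toy usage: 50 random 2-D points partitioned into 3 clusters.
labels = mst_clustering(np.random.rand(50, 2), n_clusters=3)
```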