Archives AI News

Neural Diversity Regularizes Hallucinations in Language Models

arXiv:2510.20690v2 Announce Type: replace-cross Abstract: Language models continue to hallucinate despite increases in parameters, compute, and data. We propose neural diversity — decorrelated parallel representations — as a principled mechanism that reduces hallucination rates at fixed parameter and data budgets.…

A Minimalist Optimizer Design for LLM Pretraining

arXiv:2506.16659v2 Announce Type: replace Abstract: Training large language models (LLMs) typically relies on adaptive optimizers such as Adam, which introduce extra operations and require significantly more memory than SGD to maintain first- and second-order moments. While recent works such as…
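The memory overhead the abstract refers to comes from Adam persisting two moment buffers, each the size of the parameter vector, between steps. The sketch below (a generic textbook Adam update, not the paper's proposed optimizer) makes that overhead concrete; all function and variable names here are illustrative.

```python
import numpy as np

def sgd_step(w, g, lr=0.1):
    # Plain SGD: no optimizer state beyond the parameters and gradient.
    return w - lr * g

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # Adam keeps two extra buffers, each the size of the parameters:
    # m (first moment) and v (second moment), i.e. ~2x extra optimizer memory.
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

n = 4  # toy parameter count
w, g = np.zeros(n), np.ones(n)
m, v = np.zeros(n), np.zeros(n)      # state Adam must carry across steps
w, m, v = adam_step(w, g, m, v, t=1)
extra_floats = m.size + v.size       # 2 * n floats of extra state vs. SGD
```

For a model with billions of parameters, those two buffers alone add twice the parameter count in optimizer state, which motivates minimalist designs like the one this paper proposes.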