Archives AI News

An invertible generative model for forward and inverse problems

arXiv:2509.03910v1 Announce Type: new Abstract: We formulate the inverse problem in a Bayesian framework and aim to train a generative model that allows us to simulate (i.e., sample from the likelihood) and do inference (i.e., sample from the posterior). We review the use of triangular normalizing flows for conditional sampling in this context and show how to combine two such triangular maps (an upper and a lower one) into one invertible mapping that can be used for both simulation and inference. We work out several useful properties of this invertible generative model and propose a training loss for learning the map directly. We illustrate the workings of this new approach to conditional generative modeling numerically on a few stylized examples.
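To make the triangular-map idea concrete, here is a minimal linear-Gaussian sketch in Python: a lower-triangular factorisation of a toy joint over (x, y) yields a conditional sampler for y given x (simulation), while factorising in the reverse variable order yields a sampler for x given y (inference). The covariance, the linear maps, and the helper functions are illustrative assumptions; this is not the paper's invertible flow architecture or its training loss.

```python
# Minimal linear-Gaussian sketch of block-triangular transport maps for
# simulation (sample y | x) and inference (sample x | y).
# Illustrates the triangular-map idea only, not the paper's flow architecture.
import numpy as np

rng = np.random.default_rng(0)

# Toy joint Gaussian over (x, y) with x ~ N(0, 1), y = 2x + noise.
Sigma = np.array([[1.0, 2.0],
                  [2.0, 4.25]])          # covariance of (x, y)

# Lower-triangular factor: maps standard normal z to (x, y); the second row
# gives a conditional sampler for y given x (simulation).
L = np.linalg.cholesky(Sigma)

def simulate_y_given_x(x, n):
    z = rng.standard_normal(n)
    zx = x / L[0, 0]                     # z_x consistent with the observed x
    return L[1, 0] * zx + L[1, 1] * z

# "Upper-triangular" counterpart: factorise in the reverse variable order
# (y first), which yields a conditional sampler for x given y (inference).
P = np.array([[0, 1], [1, 0]])
L_rev = np.linalg.cholesky(P @ Sigma @ P.T)   # Cholesky of cov of (y, x)

def infer_x_given_y(y, n):
    z = rng.standard_normal(n)
    zy = y / L_rev[0, 0]
    return L_rev[1, 0] * zy + L_rev[1, 1] * z

print(simulate_y_given_x(1.0, 5))        # samples from y | x = 1
print(infer_x_given_y(2.0, 5))           # samples from x | y = 2
```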

Diffusion Generative Models Meet Compressed Sensing, with Applications to Image Data and Financial Time Series

arXiv:2509.03898v1 Announce Type: new Abstract: This paper develops dimension reduction techniques for accelerating diffusion model inference in the context of synthetic data generation. The idea is to integrate compressed sensing into diffusion models: (i) compress the data into a latent space, (ii) train a diffusion model in the latent space, and (iii) apply a compressed sensing algorithm to the samples generated in the latent space, improving the efficiency of both model training and inference. Under suitable sparsity assumptions on the data, the proposed algorithm is proved to enjoy faster convergence by combining diffusion model inference with sparse recovery. As a byproduct, we obtain an optimal value for the latent space dimension. We also conduct numerical experiments on a range of datasets, including image data (handwritten digits, medical images, and climate data) and financial time series for stress testing.
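A schematic of the three-step pipeline described above, under strong simplifying assumptions: the sensing matrix, sparsity level, and the `sample_latent` stand-in for a trained latent diffusion model are all placeholders, and a standard orthogonal matching pursuit from scikit-learn plays the role of the compressed-sensing decoder.

```python
# Schematic of the pipeline: (i) compress data to a latent space, (ii) sample
# from a generative model trained in that latent space, (iii) recover sparse
# high-dimensional samples via compressed sensing.
# The latent sampler below is a placeholder, not a trained diffusion model.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
d, m, k = 256, 64, 8            # ambient dim, latent dim, sparsity level

# (i) Compression with a random Gaussian sensing matrix A: z = A @ x.
A = rng.standard_normal((m, d)) / np.sqrt(m)

# (ii) Placeholder for a diffusion model trained on compressed data: here we
# simply generate latent vectors corresponding to k-sparse ambient signals.
def sample_latent(n):
    X = np.zeros((n, d))
    for i in range(n):
        support = rng.choice(d, size=k, replace=False)
        X[i, support] = rng.standard_normal(k)
    return X @ A.T                       # latent samples z = A x

# (iii) Compressed-sensing decoding of generated latents back to ambient space.
def decode(Z):
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k)
    return np.stack([omp.fit(A, z).coef_ for z in Z])

Z = sample_latent(4)
X_hat = decode(Z)
print(X_hat.shape)                       # (4, 256) sparse reconstructions
```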

Imputation-free Learning of Tabular Data with Missing Values using Incremental Feature Partitions in Transformer

arXiv:2504.14610v4 Announce Type: replace-cross Abstract: Tabular data sets with varying amounts of missing values are typically prepared for machine learning using an arbitrary imputation strategy. Synthetic values generated by imputation models often raise concerns about data quality and the reliability of data-driven outcomes. To address these concerns, this article proposes an imputation-free incremental attention learning (IFIAL) method for tabular data. A pair of attention masks is derived and retrofitted to a transformer to process tabular data directly, without imputing or initializing missing values. The proposed method incrementally learns partitions of overlapping and fixed-size feature sets to enhance the efficiency and performance of the transformer. The average classification performance rank order across 17 diverse tabular data sets highlights the superiority of IFIAL over 11 state-of-the-art learning methods with or without missing-value imputation. Further experiments substantiate the robustness of IFIAL against varying missing-value types and rates compared to methods involving missing-value imputation. Our analysis reveals that a feature partition size of half the original feature space is, both computationally and in terms of accuracy, the best choice for the proposed incremental learning. The proposed method is one of the first solutions to enable deep attention learning of tabular data without requiring missing-value imputation. The source code for this paper is publicly available.
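The core idea of attending only to observed features can be sketched as below; the `masked_attention` helper and its naive value embedding are hypothetical stand-ins, not the authors' IFIAL implementation, and the partition helper merely illustrates overlapping fixed-size feature partitions.

```python
# Hedged sketch: mask attention over missing features instead of imputing them,
# plus overlapping fixed-size feature partitions. This mirrors the description
# in the abstract only; it is not the authors' IFIAL code.
import torch
import torch.nn as nn

def feature_partitions(n_features, size, stride):
    """Overlapping, fixed-size index partitions over the feature space."""
    parts = [list(range(s, s + size))
             for s in range(0, n_features - size + 1, stride)]
    return parts or [list(range(n_features))]

def masked_attention(x, missing, attn):
    # x: (batch, n_features) tabular values treated as a token sequence.
    # missing: boolean (batch, n_features), True where the value is absent.
    tokens = x.masked_fill(missing, 0.0).unsqueeze(-1)     # (B, F, 1)
    tokens = tokens.expand(-1, -1, attn.embed_dim)         # naive embedding
    out, _ = attn(tokens, tokens, tokens, key_padding_mask=missing)
    return out

B, n_feat, D = 4, 12, 16
attn = nn.MultiheadAttention(embed_dim=D, num_heads=4, batch_first=True)
x = torch.randn(B, n_feat)
missing = torch.rand(B, n_feat) < 0.3
print(masked_attention(x, missing, attn).shape)            # (4, 12, 16)
print(feature_partitions(n_feat, size=n_feat // 2, stride=n_feat // 4))
```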

Sparse Autoencoder Neural Operators: Model Recovery in Function Spaces

arXiv:2509.03738v1 Announce Type: cross Abstract: We frame the problem of unifying representations in neural models as one of sparse model recovery and introduce a framework that extends sparse autoencoders (SAEs) to lifted spaces and infinite-dimensional function spaces, enabling mechanistic interpretability of large neural operators (NO). While the Platonic Representation Hypothesis suggests that neural networks converge to similar representations across architectures, the representational properties of neural operators remain underexplored despite their growing importance in scientific computing. We compare the inference and training dynamics of SAEs, lifted-SAE, and SAE neural operators. We highlight how lifting and operator modules introduce beneficial inductive biases, enabling faster recovery, improved recovery of smooth concepts, and robust inference across varying resolutions, a property unique to neural operators.
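For readers unfamiliar with SAEs, a minimal sparse autoencoder with an L1 penalty on the latent code is sketched below; the lifted-SAE and neural-operator variants studied in the paper are not reproduced here.

```python
# Minimal sparse autoencoder (SAE) sketch with an L1 sparsity penalty on the
# latent code.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_in, d_latent, l1=1e-3):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_latent)
        self.decoder = nn.Linear(d_latent, d_in)
        self.l1 = l1

    def forward(self, x):
        z = torch.relu(self.encoder(x))      # non-negative sparse code
        return self.decoder(z), z

    def loss(self, x):
        x_hat, z = self(x)
        return ((x_hat - x) ** 2).mean() + self.l1 * z.abs().mean()

sae = SparseAutoencoder(d_in=128, d_latent=512)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
x = torch.randn(64, 128)
for _ in range(10):                          # toy training loop
    opt.zero_grad()
    loss = sae.loss(x)
    loss.backward()
    opt.step()
print(float(loss))
```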

Testing for correlation between network structure and high-dimensional node covariates

arXiv:2509.03772v1 Announce Type: new Abstract: In many application domains, networks are observed with node-level features. In such settings, a common problem is to assess whether or not nodal covariates are correlated with the network structure itself. Here, we present four novel methods for addressing this problem. Two of these are based on a linear model relating node-level covariates to latent node-level variables that drive network structure. The other two are based on applying canonical correlation analysis to the node features and network structure, avoiding the linear modeling assumptions. We provide theoretical guarantees for all four methods when the observed network is generated according to a low-rank latent space model endowed with node-level covariates, which we allow to be high-dimensional. Our methods are computationally cheaper and require fewer modeling assumptions than previous approaches to network dependency testing. We demonstrate and compare the performance of our novel methods on both simulated and real-world data.
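A generic version of the CCA-based approach can be sketched as follows: embed the network spectrally, compute the leading canonical correlation with the node covariates, and calibrate it with a permutation test. This is an illustrative recipe only; the paper's test statistics and their theoretical calibration differ.

```python
# Generic sketch: embed the network, run CCA against node covariates, and
# assess the leading canonical correlation with a permutation test.
import numpy as np
from sklearn.cross_decomposition import CCA

def spectral_embedding(A, d):
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

def leading_canonical_corr(Xhat, Z):
    u, v = CCA(n_components=1).fit_transform(Xhat, Z)
    return abs(np.corrcoef(u[:, 0], v[:, 0])[0, 1])

def permutation_pvalue(A, Z, d=2, n_perm=200, seed=0):
    rng = np.random.default_rng(seed)
    Xhat = spectral_embedding(A, d)
    stat = leading_canonical_corr(Xhat, Z)
    null = [leading_canonical_corr(Xhat, Z[rng.permutation(len(Z))])
            for _ in range(n_perm)]
    return stat, (1 + sum(s >= stat for s in null)) / (n_perm + 1)

# Toy network: two communities, covariates aligned with community labels.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 50)
P = np.where(labels[:, None] == labels[None, :], 0.3, 0.05)
A = rng.binomial(1, P)
A = np.triu(A, 1); A = A + A.T
Z = labels[:, None] + 0.5 * rng.standard_normal((100, 3))
print(permutation_pvalue(A, Z))
```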

The Strong, Weak and Benign Goodhart’s law. An independence-free and paradigm-agnostic formalisation

arXiv:2505.23445v2 Announce Type: replace Abstract: Goodhart's law is a famous adage in policy-making that states that "When a measure becomes a target, it ceases to be a good measure". As machine learning models and the optimisation capacity to train them have grown, mounting empirical evidence has reinforced belief in the validity of this law, which has nevertheless remained unformalised. Recently, a few attempts were made to formalise Goodhart's law, either by categorising variants of it, or by looking at how optimising a proxy metric affects the optimisation of an intended goal. In this work, we relax the simplifying independence assumption made in previous works, as well as the assumption on the learning paradigm made in most of them, to study the effect of the coupling between the proxy metric and the intended goal on Goodhart's law. Our results show that, in the case of a light-tailed goal and light-tailed discrepancy, dependence does not change the nature of Goodhart's effect. However, in the case of a light-tailed goal and heavy-tailed discrepancy, we exhibit an example where over-optimisation occurs at a rate inversely proportional to the heavy-tailedness of the discrepancy between the goal and the metric.
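A toy simulation conveys the flavour of the over-optimisation phenomenon: select the candidate with the best proxy (goal plus discrepancy) and record the goal actually achieved, for light- versus heavy-tailed discrepancies. The independent-sample setup below is an assumption made purely for illustration; the paper's contribution is precisely the dependent case.

```python
# Toy Goodhart simulation: maximise a proxy metric (goal + discrepancy) and
# record the goal actually achieved, for light- vs heavy-tailed discrepancies.
import numpy as np

rng = np.random.default_rng(0)

def achieved_goal(n_candidates, discrepancy_sampler, n_trials=500):
    goals = rng.standard_normal((n_trials, n_candidates))
    proxy = goals + discrepancy_sampler((n_trials, n_candidates))
    best_by_proxy = np.argmax(proxy, axis=1)
    return goals[np.arange(n_trials), best_by_proxy].mean()

for n in (10, 100, 1000, 10000):
    light = achieved_goal(n, lambda s: rng.standard_normal(s))      # Gaussian
    heavy = achieved_goal(n, lambda s: rng.standard_t(df=2, size=s))  # heavy tails
    print(f"n={n:>6}  light-tailed: {light:5.2f}   heavy-tailed: {heavy:5.2f}")
```

With a light-tailed discrepancy the proxy-maximiser still achieves a high goal value as the number of candidates grows, whereas with a heavy-tailed discrepancy the selection is dominated by noise and the achieved goal stagnates.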

Simulation-based Inference via Langevin Dynamics with Score Matching

arXiv:2509.03853v1 Announce Type: cross Abstract: Simulation-based inference (SBI) enables Bayesian analysis when the likelihood is intractable but model simulations are available. Recent advances in statistics and machine learning, including Approximate Bayesian Computation and deep generative models, have expanded the applicability of SBI, yet these methods often face challenges in moderate to high-dimensional parameter spaces. Motivated by the success of gradient-based Monte Carlo methods in Bayesian sampling, we propose a novel SBI method that integrates score matching with Langevin dynamics to explore complex posterior landscapes more efficiently in such settings. Our approach introduces tailored score-matching procedures for SBI, including a localization scheme that reduces simulation costs and an architectural regularization that embeds the statistical structure of log-likelihood scores to improve score-matching accuracy. We provide theoretical analysis of the method and illustrate its practical benefits on benchmark tasks and on more challenging problems in moderate to high dimensions, where it performs favorably compared to existing approaches.
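The sampling backbone can be illustrated with unadjusted Langevin dynamics driven by a plug-in score; in the sketch below an analytic Gaussian score stands in for the score-matching estimate that would be trained from simulator output, and the step size and iteration counts are arbitrary choices.

```python
# Unadjusted Langevin dynamics driven by a (learned) score of the posterior.
# An analytic Gaussian score stands in for a score-matching estimate.
import numpy as np

rng = np.random.default_rng(0)

def langevin_sample(score, theta0, step=1e-2, n_steps=5000):
    theta = np.array(theta0, dtype=float)
    for _ in range(n_steps):
        theta = (theta + step * score(theta)
                 + np.sqrt(2 * step) * rng.standard_normal(theta.shape))
    return theta

# Stand-in posterior: N(mu, Sigma) with a known score function.
mu = np.array([1.0, -2.0])
Sigma_inv = np.linalg.inv(np.array([[1.0, 0.3], [0.3, 0.5]]))
score = lambda th: -Sigma_inv @ (th - mu)

samples = np.stack([langevin_sample(score, [0.0, 0.0]) for _ in range(200)])
print(samples.mean(axis=0))   # should be close to mu
```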

Prob-GParareal: A Probabilistic Numerical Parallel-in-Time Solver for Differential Equations

arXiv:2509.03945v1 Announce Type: cross Abstract: We introduce Prob-GParareal, a probabilistic extension of the GParareal algorithm designed to provide uncertainty quantification for the Parallel-in-Time (PinT) solution of (ordinary and partial) differential equations (ODEs, PDEs). The method employs Gaussian processes (GPs) to model the Parareal correction function, as GParareal does, further enabling the propagation of numerical uncertainty across time and yielding probabilistic forecasts of the system's evolution. Furthermore, Prob-GParareal accommodates probabilistic initial conditions and maintains compatibility with classical numerical solvers, ensuring its straightforward integration into existing Parareal frameworks. Here, we first conduct a theoretical analysis of the computational complexity and derive error bounds of Prob-GParareal. Then, we numerically demonstrate the accuracy and robustness of the proposed algorithm on five benchmark ODE systems, including chaotic, stiff, and bifurcation problems. To showcase the flexibility and potential scalability of the proposed algorithm, we also consider Prob-nnGParareal, a variant obtained by replacing the GPs with nearest-neighbor GPs, illustrating its increased performance on an additional PDE example. This work bridges a critical gap in the development of probabilistic counterparts to established PinT methods.
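In the spirit of GParareal, the sketch below runs a GP-corrected Parareal iteration on a scalar test ODE, with a Gaussian process learning the coarse-fine correction F - G from previously computed pairs. The solvers, kernel, and test problem are toy choices; the probabilistic outputs and error analysis of Prob-GParareal are not reproduced.

```python
# Schematic GP-corrected Parareal iteration for dy/dt = -y.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

f = lambda y: -y                                   # toy ODE right-hand side

def coarse(y, dt):                                 # one explicit Euler step
    return y + dt * f(y)

def fine(y, dt, substeps=100):                     # many small Euler steps
    h = dt / substeps
    for _ in range(substeps):
        y = y + h * f(y)
    return y

def gp_parareal(y0, T=2.0, N=10, K=4):
    dt = T / N
    U = np.zeros(N + 1)
    U[0] = y0
    for n in range(N):                             # initial coarse sweep
        U[n + 1] = coarse(U[n], dt)
    X, R = [], []                                  # training pairs for F - G
    for _ in range(K):                             # Parareal iterations
        F_vals = [fine(U[n], dt) for n in range(N)]     # parallelisable stage
        G_vals = [coarse(U[n], dt) for n in range(N)]
        X += list(U[:N])
        R += [fv - gv for fv, gv in zip(F_vals, G_vals)]
        gp = GaussianProcessRegressor(alpha=1e-8, normalize_y=True)
        gp.fit(np.array(X)[:, None], np.array(R))
        for n in range(N):                         # corrected sequential update
            corr = gp.predict(np.array([[U[n]]]))[0]
            U[n + 1] = coarse(U[n], dt) + corr
    return U

print(gp_parareal(1.0))       # compare with exp(-t) on the coarse time grid
```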

Bootstrapping the Cross-Validation Estimate

arXiv:2307.00260v2 Announce Type: replace-cross Abstract: Cross-validation is a widely used technique for evaluating the performance of prediction models, ranging from simple binary classification to complex precision medicine strategies. It helps correct for optimism bias in error estimates, which can be significant for models built using complex statistical learning algorithms. However, since the cross-validation estimate is a random value dependent on observed data, it is essential to accurately quantify the uncertainty associated with the estimate. This is especially important when comparing the performance of two models using cross-validation, as one must determine whether differences in estimated error are due to chance. Although various methods have been developed to make inferences on cross-validation estimates, they often have many limitations, such as requiring stringent model assumptions. This paper proposes a fast bootstrap method that quickly estimates the standard error of the cross-validation estimate and produces valid confidence intervals for a population parameter measuring average model performance. Our method overcomes the computational challenges inherent in bootstrapping a cross-validation estimate by estimating the variance component within a random-effects model. It is also as flexible as the cross-validation procedure itself. To showcase the effectiveness of our approach, we conducted comprehensive simulations and real-data analysis across two applications.
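As a point of reference, a naive bootstrap of per-observation cross-validated errors is sketched below; it ignores the dependence between held-out errors that the paper's random-effects variance decomposition is designed to handle, and the dataset and model are placeholders.

```python
# Naive illustration: compute per-observation cross-validated errors once and
# bootstrap their mean for a standard error and percentile interval.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
pred = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=10)
errors = (pred != y).astype(float)             # per-observation 0/1 loss
cv_estimate = errors.mean()

rng = np.random.default_rng(0)
boot = np.array([rng.choice(errors, size=len(errors), replace=True).mean()
                 for _ in range(2000)])
se = boot.std(ddof=1)
ci = np.percentile(boot, [2.5, 97.5])
print(f"CV error {cv_estimate:.3f}, naive bootstrap SE {se:.3f}, 95% CI {ci}")
```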

Reinforcement Learning for Robust Ageing-Aware Control of Li-ion Battery Systems with Data-Driven Formal Verification

arXiv:2509.04288v1 Announce Type: cross Abstract: Rechargeable lithium-ion (Li-ion) batteries are a ubiquitous element of modern technology. In recent decades, the production and design of such batteries and their associated embedded charging and safety protocols, denoted by Battery Management Systems (BMS), have taken center stage. A fundamental challenge to be addressed is the trade-off between the speed of charging and the ageing behavior that results in the loss of capacity of the battery cell. We rely on a high-fidelity physics-based battery model and propose an approach to data-driven charging and safety protocol design. Following a Counterexample-Guided Inductive Synthesis scheme, we combine Reinforcement Learning (RL) with recent developments in data-driven formal methods to obtain a hybrid control strategy: RL is used to synthesise the individual controllers, and a data-driven abstraction guides their partitioning into a switched structure, depending on the initial output measurements of the battery. The resulting discrete selection among RL-based controllers, coupled with the continuous battery dynamics, realises a hybrid system. When a design meets the desired criteria, the abstraction provides probabilistic guarantees on the closed-loop performance of the cell.
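The switched structure can be sketched as a lookup from the initial output measurement into a partition of controller modes; the thresholds and policies below are hypothetical placeholders, not controllers synthesised by RL or validated by the paper's data-driven abstraction.

```python
# Sketch of a switched-control structure: a partition of the initial output
# measurement selects which pre-synthesised controller drives the battery.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Mode:
    lower: float                 # partition cell on the initial measurement
    upper: float
    policy: Callable[[float], float]   # policy: state -> charging current

def switched_controller(modes: List[Mode], y0: float) -> Callable[[float], float]:
    """Select the controller whose cell contains the initial output y0."""
    for mode in modes:
        if mode.lower <= y0 < mode.upper:
            return mode.policy
    raise ValueError(f"initial measurement {y0} outside the abstraction domain")

# Placeholder policies for three hypothetical cells of the initial measurement.
modes = [
    Mode(0.0, 0.3, lambda soc: 2.0 * (1.0 - soc)),   # aggressive charging
    Mode(0.3, 0.7, lambda soc: 1.0 * (1.0 - soc)),
    Mode(0.7, 1.0, lambda soc: 0.5 * (1.0 - soc)),   # gentle charging
]

policy = switched_controller(modes, y0=0.45)
print(policy(0.5))   # charging current proposed at 50% state of charge
```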