
A pseudo-inverse of a line graph

arXiv:2508.09412v2 Announce Type: replace Abstract: Line graphs are an alternative representation of graphs where each vertex of the original (root) graph becomes an edge. However, not all graphs have a corresponding root graph, hence the transformation from graphs to line…

Beyond Real Data: Synthetic Data through the Lens of Regularization

arXiv:2510.08095v1 Announce Type: new Abstract: Synthetic data can improve generalization when real data is scarce, but excessive reliance may introduce distributional mismatches that degrade performance. In this paper, we present a learning-theoretic framework to quantify the trade-off between synthetic and…

Surrogate Graph Partitioning for Spatial Prediction

arXiv:2510.07832v1 Announce Type: new Abstract: Spatial prediction refers to the estimation of unobserved values from spatially distributed observations. Although recent advances have improved the capacity to model diverse observation types, adoption in practice remains limited in industries that demand interpretability.…

A Honest Cross-Validation Estimator for Prediction Performance

arXiv:2510.07649v1 Announce Type: new Abstract: Cross-validation is a standard tool for obtaining an honest assessment of the performance of a prediction model. The commonly used version repeatedly splits data, trains the prediction model on the training set, evaluates the model…

Latency-Aware Contextual Bandit: Application to Cryo-EM Data Collection

arXiv:2410.13109v3 Announce Type: replace Abstract: We introduce a latency-aware contextual bandit framework that generalizes the standard contextual bandit problem, where the learner adaptively selects arms and switches decision sets under action delays. In this setting, the learner observes the context…

High-dimensional Analysis of Synthetic Data Selection

arXiv:2510.08123v1 Announce Type: new Abstract: Despite the progress in the development of generative models, their usefulness in creating synthetic data that improve prediction performance of classifiers has been put into question. Besides heuristic principles such as “synthetic data should be…