Bottlenecked Transformers: Periodic KV Cache Consolidation for Generalised Reasoning
arXiv:2505.16950v3 Abstract: Transformer LLMs have been shown to exhibit strong reasoning ability that scales with inference-time compute, most prominently through token-space “thinking” chains of thought. A growing line of work pushes extra computation into the model’s latent…
