Archives AI News

Optimized Architectures for Kolmogorov-Arnold Networks

arXiv:2512.12448v2 Announce Type: replace Abstract: Efforts to improve Kolmogorov–Arnold networks (KANs) with architectural enhancements have been stymied by the complexity those enhancements bring, undermining the interpretability that makes KANs attractive in the first place. Here we study overprovisioned architectures combined…

HELM: Harness-Enhanced Long-horizon Memory for Vision-Language-Action Manipulation

arXiv:2604.18791v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models fail systematically on long-horizon manipulation tasks despite strong short-horizon performance. We show that this failure is not resolved by extending context length alone in the current reactive execution setting; instead, it stems…

GAIN: Multiplicative Modulation for Domain Adaptation

arXiv:2604.04516v2 Announce Type: replace Abstract: Adapting LLMs to new domains causes forgetting because standard methods (e.g., full fine-tuning, LoRA) inject new directions into the weight space. We show that forgetting is governed by one algebraic property: whether the update preserves…

Preserving Clusters in Error-Bounded Lossy Compression of Particle Data

arXiv:2604.18801v1 Announce Type: new Abstract: Lossy compression is widely used to reduce storage and I/O costs for large-scale particle datasets in scientific applications such as cosmology, molecular dynamics, and fluid dynamics, where clustering structures (e.g., single-linkage or Friends-of-Friends) are critical…

On the Generalizability of Foundation Models for Crop Type Mapping

arXiv:2409.09451v5 Announce Type: replace-cross Abstract: Foundation models pre-trained using self-supervised learning have shown powerful transfer learning capabilities on various downstream tasks, including language understanding, text generation, and image recognition. The Earth observation (EO) field has produced several foundation models pre-trained…

A PPA-Driven 3D-IC Partitioning Selection Framework with Surrogate Models

arXiv:2604.18806v1 Announce Type: new Abstract: 3D-IC netlist partitioning is commonly optimized using proxy objectives, while final PPA is treated as a costly evaluation rather than an optimization signal. This proxy-driven paradigm makes it difficult to reliably translate additional PPA evaluations…

Rethinking Dataset Distillation: Hard Truths about Soft Labels

arXiv:2604.18811v1 Announce Type: new Abstract: Despite the perceived success of large-scale dataset distillation (DD) methods, recent evidence finds that simple random image baselines perform on par with state-of-the-art DD methods like SRe2L due to the use of soft labels during downstream…

SAGE-32B: Agentic Reasoning via Iterative Distillation

arXiv:2601.04237v2 Announce Type: replace-cross Abstract: We demonstrate SAGE-32B, a 32-billion-parameter language model that focuses on agentic reasoning and long-range planning tasks. Unlike chat models that aim for general conversation fluency, SAGE-32B is designed to operate in an…