Archives AI News

Distributional Consistency Loss: Beyond Pointwise Data Terms in Inverse Problems

arXiv:2510.13972v1 Announce Type: new Abstract: Recovering true signals from noisy measurements is a central challenge in inverse problems spanning medical imaging, geophysics, and signal processing. Current solutions balance prior assumptions regarding the true signal (regularization) with agreement to noisy measured…

Efficient & Correct Predictive Equivalence for Decision Trees

arXiv:2509.17774v4 Announce Type: replace-cross Abstract: The Rashomon set of decision trees (DTs) has found important uses. Recent work showed that DTs computing the same classification function, i.e. predictively equivalent DTs, can represent a significant fraction of the Rashomon set. Such redundancy…
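For intuition on what predictive equivalence means, here is a minimal sketch: two structurally different decision trees are predictively equivalent if they assign the same label to every input. The trees, encoding, and brute-force check below are hypothetical illustrations over binary features, not the paper's efficient procedure.

```python
from itertools import product

# Two hypothetical decision trees over binary features x0, x1, encoded as
# nested (feature_index, left_subtree, right_subtree) tuples; leaves are labels.
t1 = (0, (1, 0, 1), (1, 1, 1))   # splits on x0, then x1 on both branches
t2 = (0, (1, 0, 1), 1)           # structurally smaller, same function

def predict(tree, x):
    # Walk the tree: go right when the tested feature is 1, left when 0.
    while isinstance(tree, tuple):
        feat, left, right = tree
        tree = right if x[feat] else left
    return tree

def predictively_equivalent(a, b, n_features):
    # Brute force: compare predictions over all 2^n binary inputs.
    return all(predict(a, x) == predict(b, x)
               for x in product([0, 1], repeat=n_features))

assert predictively_equivalent(t1, t2, n_features=2)
```

The brute-force check is exponential in the number of features; the point of the paper is precisely to decide this equivalence efficiently and correctly.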

BitNet Distillation

arXiv:2510.13998v1 Announce Type: new Abstract: In this paper, we present BitNet Distillation (BitDistill), a lightweight pipeline that fine-tunes off-the-shelf full-precision LLMs (e.g., Qwen) into 1.58-bit precision (i.e., ternary weights {-1, 0, 1}) for specific downstream tasks, achieving strong task-specific performance…
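To see what "1.58-bit precision with ternary weights {-1, 0, 1}" looks like concretely, here is a sketch of absmean ternary quantization in the style used by BitNet models: scale each tensor by its mean absolute value, round, and clip to {-1, 0, 1}. This illustrates only the weight format, not the BitDistill fine-tuning pipeline itself.

```python
import numpy as np

def ternary_quantize(w: np.ndarray):
    """Quantize a weight tensor to {-1, 0, 1} with a per-tensor scale.

    Absmean scheme: divide by the mean |w|, round, clip to [-1, 1].
    A sketch of the BitNet-style weight format, not the full method.
    """
    gamma = np.abs(w).mean() + 1e-8             # per-tensor scale factor
    w_q = np.clip(np.round(w / gamma), -1, 1)   # ternary codes
    return w_q.astype(np.int8), gamma

def dequantize(w_q: np.ndarray, gamma: float) -> np.ndarray:
    # Recover an approximate full-precision tensor for inference math.
    return w_q.astype(np.float32) * gamma

w = np.random.randn(4, 4).astype(np.float32)
w_q, gamma = ternary_quantize(w)
assert set(np.unique(w_q)).issubset({-1, 0, 1})
```

Storing only int8 ternary codes plus one scale per tensor is what yields the large memory and latency savings the abstract alludes to.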

From Loop Nests to Silicon: Mapping AI Workloads onto AMD NPUs with MLIR-AIR

arXiv:2510.14871v1 Announce Type: cross Abstract: General-purpose compilers abstract away parallelism, locality, and synchronization, limiting their effectiveness on modern spatial architectures. As modern computing architectures increasingly rely on fine-grained control over data movement, execution order, and compute placement for performance, compiler…

REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression

arXiv:2510.13999v1 Announce Type: new Abstract: Sparsely-activated Mixture-of-Experts (SMoE) models offer efficient pre-training and low latency but their large parameter counts create significant memory overhead, motivating research into expert compression. Contrary to recent findings favouring expert merging on discriminative benchmarks, we…
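As a rough illustration of one-shot expert pruning (as opposed to merging), the sketch below scores each expert by its total routing probability over a calibration batch and keeps the top-k. The utilization criterion here is a generic, hypothetical choice; REAP's actual saliency measure may differ.

```python
import numpy as np

def prune_experts(router_logits: np.ndarray, keep: int) -> np.ndarray:
    """One-shot expert pruning sketch (hypothetical criterion).

    router_logits: (n_tokens, n_experts) raw router scores from a
    calibration batch. Experts with the least total routing mass
    are dropped; the indices of kept experts are returned.
    """
    # Softmax over experts per token (numerically stabilized).
    probs = np.exp(router_logits - router_logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    utilization = probs.sum(axis=0)            # total mass per expert
    kept = np.argsort(utilization)[-keep:]     # top-`keep` experts
    return np.sort(kept)

logits = np.random.randn(128, 8)               # 128 tokens, 8 experts
kept = prune_experts(logits, keep=4)
assert len(kept) == 4
```

After pruning, the router is restricted to the kept experts, so the model's parameter count shrinks without any weight merging.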

RDD: Retrieval-Based Demonstration Decomposer for Planner Alignment in Long-Horizon Tasks

arXiv:2510.14968v1 Announce Type: cross Abstract: To tackle long-horizon tasks, recent hierarchical vision-language-action (VLA) frameworks employ vision-language model (VLM)-based planners to decompose complex manipulation tasks into simpler sub-tasks that low-level visuomotor policies can easily handle. Typically, the VLM planner is finetuned…

Conditional Clifford-Steerable CNNs with Complete Kernel Basis for PDE Modeling

arXiv:2510.14007v1 Announce Type: new Abstract: Clifford-Steerable CNNs (CSCNNs) provide a unified framework that allows incorporating equivariance to arbitrary pseudo-Euclidean groups, including isometries of Euclidean space and Minkowski spacetime. In this work, we demonstrate that the kernel basis of CSCNNs is…

REX: Causal discovery based on machine learning and explainability techniques

arXiv:2501.12706v2 Announce Type: replace Abstract: Explainable Artificial Intelligence (XAI) techniques hold significant potential for enhancing the causal discovery process, which is crucial for understanding complex systems in areas like healthcare, economics, and artificial intelligence. However, no causal discovery methods currently…

Uni-LoRA: One Vector is All You Need

arXiv:2506.00799v2 Announce Type: replace Abstract: Low-Rank Adaptation (LoRA) has become the de facto parameter-efficient fine-tuning (PEFT) method for large language models (LLMs) by constraining weight updates to low-rank matrices. Recent works such as Tied-LoRA, VeRA, and VB-LoRA push efficiency further…
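For readers new to the PEFT methods this abstract builds on, here is a minimal sketch of the standard LoRA update: the frozen weight W is augmented with a trainable low-rank product B @ A of rank r. The dimensions are hypothetical, and this shows plain LoRA, not Uni-LoRA's single-vector parameterization.

```python
import numpy as np

d_out, d_in, r = 64, 64, 4                          # rank r << d_in, d_out
W = np.random.randn(d_out, d_in) / np.sqrt(d_in)    # frozen base weight
A = np.random.randn(r, d_in) * 0.01                 # trainable down-projection
B = np.zeros((d_out, r))                            # trainable up-projection, init 0

def lora_forward(x: np.ndarray) -> np.ndarray:
    # Effective weight is W + B @ A; computed without forming the product.
    return W @ x + B @ (A @ x)

x = np.random.randn(d_in)
# With B initialized to zero, the adapter starts as an exact no-op:
assert np.allclose(lora_forward(x), W @ x)
```

Only A and B (2 * r * d parameters instead of d * d) are trained; methods like VeRA, VB-LoRA, and Uni-LoRA then compress even these low-rank factors further.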