Archives AI News

Share Your Attention: Transformer Weight Sharing via Matrix-based Dictionary Learning

arXiv:2508.04581v2 Announce Type: replace-cross Abstract: Large language models have revolutionized AI applications, yet their high computational and memory demands hinder their widespread deployment. Existing compression techniques focus on intra-block optimizations (e.g., low-rank approximation or attention pruning), while the repetitive layered…

February 23, 2026

Who Said Neural Networks Aren’t Linear?

arXiv:2510.08570v2 Announce Type: replace Abstract: Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f: X \to Y$. Leveraging the algebraic concept of transport of structure, we propose a method to explicitly identify non-standard…

February 23, 2026

Communication-Corruption Coupling and Verification in Cooperative Multi-Objective Bandits

arXiv:2601.11924v2 Announce Type: replace Abstract: We study cooperative stochastic multi-armed bandits with vector-valued rewards under adversarial corruption and limited verification. In each of $T$ rounds, each of $N$ agents selects an arm, the environment generates a clean reward vector, and…

February 23, 2026

SUNLayer: Stable denoising with generative networks

arXiv:1803.09319v2 Announce Type: replace Abstract: Deep neural networks are often used to implement powerful generative models for real-world data. Notable applications include image denoising, as well as other classical inverse problems like compressed sensing and super-resolution. To provide a rigorous…

February 23, 2026

Learning to Weight Parameters for Training Data Attribution

arXiv:2506.05647v4 Announce Type: replace Abstract: We study gradient-based data attribution, aiming to identify which training examples most influence a given output. Existing methods for this task either treat network parameters uniformly or rely on implicit weighting derived from Hessian approximations,…

February 23, 2026

Optimal Multi-Debris Mission Planning in LEO: A Deep Reinforcement Learning Approach with Co-Elliptic Transfers and Refueling

arXiv:2602.17685v1 Announce Type: new Abstract: This paper addresses the challenge of multi-target active debris removal (ADR) in Low Earth Orbit (LEO) by introducing a unified co-elliptic maneuver framework that combines Hohmann transfers, safety ellipse proximity operations, and explicit refueling…

February 23, 2026

BONNI: Gradient-Informed Bayesian and Interior Point Optimization for Efficient Inverse Design in Nanophotonics

arXiv:2602.18148v1 Announce Type: cross Abstract: Inverse design, particularly geometric shape optimization, provides a systematic approach for developing high-performance nanophotonic devices. While numerous optimization algorithms exist, previous global approaches exhibit slow convergence, and conversely, local search strategies frequently become trapped in…

February 23, 2026

Probabilistic NDVI Forecasting from Sparse Satellite Time Series and Weather Covariates

arXiv:2602.17683v1 Announce Type: new Abstract: Accurate short-term forecasting of vegetation dynamics is a key enabler for data-driven decision support in precision agriculture. Normalized Difference Vegetation Index (NDVI) forecasting from satellite observations, however, remains challenging due to sparse and irregular sampling…

February 23, 2026

CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models

arXiv:2602.17684v1 Announce Type: new Abstract: Reinforcement Learning from Verifiable Rewards (RLVR) has driven recent progress in code large language models by leveraging execution-based feedback from unit tests, but its scalability is fundamentally constrained by the availability and reliability of high-quality…

February 23, 2026

Duality Models: An Embarrassingly Simple One-step Generation Paradigm

arXiv:2602.17682v1 Announce Type: new Abstract: Consistency-based generative models like Shortcut and MeanFlow achieve impressive results via a target-aware design for solving the Probability Flow ODE (PF-ODE). Typically, such methods introduce a target time $r$ alongside the current time $t$ to…

February 23, 2026