Archives AI News

Share Your Attention: Transformer Weight Sharing via Matrix-based Dictionary Learning

arXiv:2508.04581v2 Announce Type: replace-cross Abstract: Large language models have revolutionized AI applications, yet their high computational and memory demands hinder their widespread deployment. Existing compression techniques focus on intra-block optimizations (e.g., low-rank approximation or attention pruning), while the repetitive layered…

Who Said Neural Networks Aren’t Linear?

arXiv:2510.08570v2 Announce Type: replace Abstract: Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f: X \to Y$. Leveraging the algebraic concept of transport of structure, we propose a method to explicitly identify non-standard…

SUNLayer: Stable denoising with generative networks

arXiv:1803.09319v2 Announce Type: replace Abstract: Deep neural networks are often used to implement powerful generative models for real-world data. Notable applications include image denoising, as well as other classical inverse problems like compressed sensing and super-resolution. To provide a rigorous…

Learning to Weight Parameters for Training Data Attribution

arXiv:2506.05647v4 Announce Type: replace Abstract: We study gradient-based data attribution, aiming to identify which training examples most influence a given output. Existing methods for this task either treat network parameters uniformly or rely on implicit weighting derived from Hessian approximations,…

Probabilistic NDVI Forecasting from Sparse Satellite Time Series and Weather Covariates

arXiv:2602.17683v1 Announce Type: new Abstract: Accurate short-term forecasting of vegetation dynamics is a key enabler for data-driven decision support in precision agriculture. Normalized Difference Vegetation Index (NDVI) forecasting from satellite observations, however, remains challenging due to sparse and irregular sampling…

Duality Models: An Embarrassingly Simple One-step Generation Paradigm

arXiv:2602.17682v1 Announce Type: new Abstract: Consistency-based generative models like Shortcut and MeanFlow achieve impressive results via a target-aware design for solving the Probability Flow ODE (PF-ODE). Typically, such methods introduce a target time $r$ alongside the current time $t$ to…