Archives AI News

MS-SSM: A Multi-Scale State Space Model for Efficient Sequence Modeling

arXiv:2512.23824v1 Announce Type: new Abstract: State-space models (SSMs) have recently gained attention as an efficient alternative to computationally expensive attention-based models for sequence modeling. They rely on linear recurrences to integrate information over time, enabling fast inference, parallelizable training, and control…

Exploiting the Prior of Generative Time Series Imputation

arXiv:2512.23832v1 Announce Type: new Abstract: Time series imputation, i.e., filling the missing values of a time recording, finds various applications in electricity, finance, and weather modelling. Previous methods have introduced generative models such as diffusion probabilistic models and Schrödinger bridge…

mHC: Manifold-Constrained Hyper-Connections

arXiv:2512.24880v1 Announce Type: cross Abstract: Recently, studies exemplified by Hyper-Connections (HC) have extended the ubiquitous residual connection paradigm established over the past decade by expanding the residual stream width and diversifying connectivity patterns. While yielding substantial performance gains, this diversification…

Trellis: Learning to Compress Key-Value Memory in Attention Models

arXiv:2512.23852v1 Announce Type: new Abstract: Transformers, while powerful, suffer from quadratic computational complexity and the ever-growing Key-Value (KV) cache of the attention mechanism. This paper introduces Trellis, a novel Transformer architecture with bounded memory that learns how to compress its…

Flow Matching Neural Processes

arXiv:2512.23853v1 Announce Type: new Abstract: Neural processes (NPs) are a class of models that learn stochastic processes directly from data and can be used for inference, sampling and conditional sampling. We introduce a new NP model based on flow matching,…

Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling

arXiv:2402.18508v3 Announce Type: replace Abstract: In the rapidly evolving field of deep learning, the demand for models that are both expressive and computationally efficient has never been more critical. This paper introduces Orchid, a novel architecture designed to address the…