Archives AI News

MoM: Linear Sequence Modeling with Mixture-of-Memories

arXiv:2502.13685v4 Announce Type: replace-cross Abstract: Linear sequence modeling methods, such as linear attention, state space modeling, and linear RNNs, offer significant efficiency improvements by reducing the complexity of training and inference. However, these methods typically compress the entire input sequence…

Compiling to linear neurons

arXiv:2511.13769v1 Announce Type: new Abstract: We don’t program neural networks directly. Instead, we rely on an indirect style where learning algorithms, like gradient descent, determine a neural network’s function by learning from data. This indirect style is often a virtue;…

O3SLM: Open Weight, Open Data, and Open Vocabulary Sketch-Language Model

arXiv:2511.14368v1 Announce Type: cross Abstract: While Large Vision Language Models (LVLMs) are increasingly deployed in real-world applications, their ability to interpret abstract visual inputs remains limited. Specifically, they struggle to comprehend hand-drawn sketches, a modality that offers an intuitive means…

DeepBlip: Estimating Conditional Average Treatment Effects Over Time

arXiv:2511.14545v1 Announce Type: cross Abstract: Structural nested mean models (SNMMs) are a principled approach to estimating treatment effects over time. A particular strength of SNMMs is that they break the joint effect of treatment sequences over time into localized, time-specific…