Archives AI News

Exploiting the Experts: Unauthorized Compression in MoE-LLMs

arXiv:2511.19480v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) architectures are increasingly adopted in large language models (LLMs) for their scalability and efficiency. However, their modular structure introduces a unique vulnerability: adversaries can attempt to compress or repurpose models by pruning experts…
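The expert-pruning vulnerability the abstract alludes to can be sketched in a toy setting: route inputs through a gated mixture, tally per-expert utilization, then compress the model by dropping the least-used experts. This is an illustrative assumption about the attack shape, not the paper's actual method; all names (`moe_forward`, `prune_experts`) are hypothetical.

```python
import numpy as np

def moe_forward(x, experts, gate_w, top_k=2):
    # Toy MoE layer: route each input to its top-k experts by gating score.
    scores = x @ gate_w                               # (batch, n_experts) gating logits
    top = np.argsort(scores, axis=1)[:, -top_k:]      # top-k expert indices per input
    out = np.zeros_like(x)
    for i, idx in enumerate(top):
        w = np.exp(scores[i, idx])                    # softmax over the selected experts
        w /= w.sum()
        out[i] = sum(wk * (x[i] @ experts[k]) for wk, k in zip(w, idx))
    return out, top

def prune_experts(experts, gate_w, usage, keep):
    # "Compress" the model by keeping only the `keep` most-utilized experts.
    kept = np.sort(np.argsort(usage)[-keep:])
    return [experts[k] for k in kept], gate_w[:, kept]

rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
gate_w = rng.normal(size=(d, n_experts))
x = rng.normal(size=(16, d))

_, routed = moe_forward(x, experts, gate_w)
usage = np.bincount(routed.ravel(), minlength=n_experts)   # expert utilization counts
small_experts, small_gate = prune_experts(experts, gate_w, usage, keep=2)
```

An adversary observing routing statistics could apply exactly this kind of utilization-based pruning to obtain a smaller, repurposed model without the owner's consent.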

WavefrontDiffusion: Dynamic Decoding Schedule for Improved Reasoning

arXiv:2511.19473v1 Announce Type: new Abstract: Diffusion Language Models (DLMs) have shown strong potential for text generation and are becoming a competitive alternative to autoregressive models. The denoising strategy plays an important role in determining the quality of their outputs. Mainstream…
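The denoising strategy the abstract refers to determines which masked positions get committed at each step. WavefrontDiffusion's own schedule is not shown here; the following is a generic confidence-based schedule of the kind used as a mainstream baseline in masked diffusion LMs, with all names and the dummy model being assumptions for illustration.

```python
import numpy as np

MASK = -1

def confidence_decode(logits_fn, seq_len, steps):
    # Toy masked-diffusion decoding: start fully masked, and at each step
    # commit the most confident masked positions to their argmax token.
    seq = np.full(seq_len, MASK)
    per_step = int(np.ceil(seq_len / steps))
    for _ in range(steps):
        masked = np.where(seq == MASK)[0]
        if masked.size == 0:
            break
        logits = logits_fn(seq)                                   # (seq_len, vocab)
        probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
        conf = probs.max(axis=1)                                  # model confidence
        chosen = masked[np.argsort(conf[masked])[-per_step:]]     # most confident masks
        seq[chosen] = probs[chosen].argmax(axis=1)
    return seq

# Dummy "model" standing in for a real DLM's denoiser.
rng = np.random.default_rng(0)
dummy_logits = lambda seq: rng.normal(size=(seq.size, 10))
out = confidence_decode(dummy_logits, seq_len=8, steps=4)
```

Varying which positions are unmasked per step (static left-to-right, confidence-based, or a dynamic "wavefront") is exactly the design axis such papers explore.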

Elucidated Rolling Diffusion Models for Probabilistic Weather Forecasting

arXiv:2506.20024v2 Announce Type: replace Abstract: Diffusion models are a powerful tool for probabilistic forecasting, yet most applications in high-dimensional complex systems predict future states individually. This approach struggles to model complex temporal dependencies and fails to explicitly account for the…

Softmax Transformers are Turing-Complete

arXiv:2511.20038v1 Announce Type: cross Abstract: Hard attention Chain-of-Thought (CoT) transformers are known to be Turing-complete. However, it is an open problem whether softmax attention Chain-of-Thought (CoT) transformers are Turing-complete. In this paper, we prove a stronger result that length-generalizable softmax…
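The distinction the abstract hinges on is hard (argmax) versus softmax attention. A minimal numeric sketch of both, with sharpened logits making softmax behave like hard attention, is below; this illustrates the two attention regimes, not the paper's Turing-completeness construction.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard scaled dot-product softmax attention: every query takes a
    # weighted average over all values.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def hard_attention(Q, K, V):
    # Hard attention: each query attends only to its single best-matching key.
    idx = (Q @ K.T).argmax(axis=-1)
    return V[idx]

rng = np.random.default_rng(1)
Q = rng.normal(size=(4, 16))
K = rng.normal(size=(6, 16))
V = rng.normal(size=(6, 16))

soft = softmax_attention(Q, K, V)
# Scaling the queries sharpens the softmax toward the hard-attention limit.
sharp = softmax_attention(1e5 * Q, K, V)
```

Softmax never exactly reaches the argmax limit at finite temperature, which is why extending hard-attention Turing-completeness results to softmax CoT transformers is nontrivial.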

STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flow

arXiv:2511.20462v1 Announce Type: cross Abstract: Normalizing flows (NFs) are end-to-end likelihood-based generative models for continuous data, and have recently regained attention with encouraging progress on image generation. Yet in the video generation domain, where spatiotemporal complexity and computational cost are…
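The "end-to-end likelihood-based" property of normalizing flows comes from the change-of-variables formula: log p(x) = log p_base(f(x)) + log|det J_f(x)|. A minimal sketch for a diagonal affine flow with a standard-normal base follows; the function name and parameterization are illustrative assumptions, not STARFlow-V's architecture.

```python
import numpy as np

def affine_flow_logprob(x, scale, shift):
    # Diagonal affine flow z = scale * x + shift with a standard-normal base.
    # Change of variables adds the log-Jacobian term sum(log|scale|).
    z = scale * x + shift
    base = -0.5 * (z**2 + np.log(2 * np.pi)).sum(axis=-1)   # log N(z; 0, I)
    log_det = np.log(np.abs(scale)).sum()                   # log|det J|
    return base + log_det

rng = np.random.default_rng(2)
x = rng.normal(size=(5, 3))
scale = np.array([0.5, 2.0, 1.5])
shift = np.array([0.1, -0.3, 0.0])
lp = affine_flow_logprob(x, scale, shift)
```

Deep flows stack many such invertible layers, and the per-layer log-determinants simply add, which is what makes exact likelihood training tractable end to end.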