Archives AI News

Prior Knowledge-enhanced Spatio-temporal Epidemic Forecasting

arXiv:2602.22270v1 Announce Type: new Abstract: Spatio-temporal epidemic forecasting is critical for public health management, yet existing methods often struggle with insensitivity to weak epidemic signals, over-simplified spatial relations, and unstable parameter estimation. To address these challenges, we propose the Spatio-Temporal…

Support Tokens, Stability Margins, and a New Foundation for Robust LLMs

arXiv:2602.22271v1 Announce Type: new Abstract: Self-attention is usually described as a flexible, content-adaptive way to mix a token with information from its past. We re-interpret causal self-attention transformers, the backbone of modern foundation models, within a probabilistic framework, much like…

Agentic Framework for Epidemiological Modeling

arXiv:2602.00299v2 Announce Type: replace Abstract: Epidemic modeling is essential for public health planning, yet traditional approaches rely on fixed model classes that require manual redesign as pathogens, policies, and scenario assumptions evolve. We introduce EPIAGENT, an agentic framework that automatically…

Positional-aware Spatio-Temporal Network for Large-Scale Traffic Prediction

arXiv:2602.22274v1 Announce Type: new Abstract: Traffic flow forecasting has become indispensable to daily life. It must exploit the spatiotemporal relationships among locations over a time period, under a graph structure, to predict future flow.…

Muon+: Towards Better Muon via One Additional Normalization Step

arXiv:2602.21545v2 Announce Type: replace Abstract: The Muon optimizer has demonstrated promising performance in pre-training large language models through gradient (or momentum) orthogonalization. In this work, we propose a simple yet effective enhancement to Muon, namely Muon+, which introduces an additional…

The Spacetime of Diffusion Models: An Information Geometry Perspective

arXiv:2505.17517v4 Announce Type: replace Abstract: We present a novel geometric perspective on the latent space of diffusion models. We first show that the standard pullback approach, utilizing the deterministic probability flow ODE decoder, is fundamentally flawed. It provably forces geodesics…

Simplex-to-Euclidean Bijections for Categorical Flow Matching

arXiv:2510.27480v2 Announce Type: replace Abstract: We propose a method for learning and sampling from probability distributions supported on the simplex. Our approach maps the open simplex to Euclidean space via smooth bijections, leveraging the Aitchison geometry to define the mappings,…

MoDora: Tree-Based Semi-Structured Document Analysis System

arXiv:2602.23061v1 Announce Type: cross Abstract: Semi-structured documents integrate diverse interleaved data elements (e.g., tables, charts, hierarchical paragraphs) arranged in various and often irregular layouts. These documents are widely observed across domains and account for a large portion of real-world data.…

Efficient Graph Coloring with Neural Networks: A Physics-Inspired Approach for Large Graphs

arXiv:2408.01503v2 Announce Type: replace Abstract: Combinatorial optimization problems near algorithmic phase transitions represent a fundamental challenge for both classical algorithms and machine learning approaches. Among them, graph coloring stands as a prototypical constraint satisfaction problem exhibiting sharp dynamical and satisfiability…