Archives: AI News

Judge Model for Large-scale Multimodality Benchmarks

arXiv:2601.06106v1 Announce Type: new
Abstract: We propose a dedicated multimodal Judge Model designed to provide reliable, explainable evaluation across a diverse suite of tasks. Our benchmark spans text, audio, image, and video modalities, drawing from carefully sampled public datasets with…

The Impact of Post-training on Data Contamination

arXiv:2601.06103v1 Announce Type: new
Abstract: We present a controlled study of how dataset contamination interacts with the post-training stages now standard in large language model training pipelines. Starting from clean checkpoints of Qwen2.5 (0.5B) and Gemma3 (1B/4B), we inject five…

Australian Bushfire Intelligence with AI-Driven Environmental Analytics

arXiv:2601.06105v1 Announce Type: new
Abstract: Bushfires are among the most destructive natural hazards in Australia, causing significant ecological, economic, and social damage. Accurate prediction of bushfire intensity is therefore essential for effective disaster preparedness and response. This study examines the…

Filtering Beats Fine Tuning: A Bayesian Kalman View of In Context Learning in LLMs

arXiv:2601.06100v1 Announce Type: new
Abstract: We present a theory-first framework that interprets inference-time adaptation in large language models (LLMs) as online Bayesian state estimation. Rather than modeling rapid adaptation as implicit optimization or meta-learning, we formulate task- and context-specific learning…

The Hessian of tall-skinny networks is easy to invert

arXiv:2601.06096v1 Announce Type: new
Abstract: We describe an exact algorithm for solving linear systems $Hx=b$ where $H$ is the Hessian of a deep net. The method computes Hessian-inverse-vector products without storing the Hessian or its inverse in time and storage…

Enabling Long FFT Convolutions on Memory-Constrained FPGAs via Chunking

arXiv:2601.06065v1 Announce Type: new
Abstract: The need for long-context reasoning has led to alternative neural network architectures besides Transformers and self-attention, a popular model being Hyena, which employs causal 1D-convolutions implemented with FFTs. Long convolutions enable efficient global context mixing,…

A Complete Decomposition of Stochastic Differential Equations

arXiv:2601.07834v1 Announce Type: cross
Abstract: We show that any stochastic differential equation with prescribed time-dependent marginal distributions admits a decomposition into three components: a unique scalar field governing marginal evolution, a symmetric positive-semidefinite diffusion matrix field and a skew-symmetric matrix…

DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models

arXiv:2503.04472v3 Announce Type: replace
Abstract: Recent advancements in slow thinking reasoning models have shown exceptional performance in complex reasoning tasks. However, these models often exhibit overthinking (generating redundant reasoning steps for simple problems), leading to excessive computational resource usage. While…