Archives AI News

FATE: A Formal Benchmark Series for Frontier Algebra of Multiple Difficulty Levels

arXiv:2511.02872v3 Announce Type: replace Abstract: Recent advances in large language models (LLMs) have demonstrated impressive capabilities in formal theorem proving, particularly on contest-based mathematical benchmarks like the IMO. However, these contests do not reflect the depth, breadth, and abstraction of…

Correction of Decoupled Weight Decay

arXiv:2512.08217v2 Announce Type: replace Abstract: Decoupled weight decay, solely responsible for the performance advantage of AdamW over Adam, has long been set proportional to the learning rate $\gamma$ without question. Some researchers have recently challenged this assumption and argued that…

Cluster Workload Allocation: Semantic Soft Affinity Using Natural Language Processing

arXiv:2601.09282v2 Announce Type: replace Abstract: Cluster workload allocation often requires complex configurations, creating a usability gap. This paper introduces a semantic, intent-driven scheduling paradigm for cluster systems using Natural Language Processing. The system employs a Large Language Model (LLM) integrated…

Share Your Attention: Transformer Weight Sharing via Matrix-based Dictionary Learning

arXiv:2508.04581v2 Announce Type: replace-cross Abstract: Large language models have revolutionized AI applications, yet their high computational and memory demands hinder their widespread deployment. Existing compression techniques focus on intra-block optimizations (e.g., low-rank approximation or attention pruning), while the repetitive layered…

Who Said Neural Networks Aren’t Linear?

arXiv:2510.08570v2 Announce Type: replace Abstract: Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f: X \to Y$. Leveraging the algebraic concept of transport of structure, we propose a method to explicitly identify non-standard…

SUNLayer: Stable denoising with generative networks

arXiv:1803.09319v2 Announce Type: replace Abstract: Deep neural networks are often used to implement powerful generative models for real-world data. Notable applications include image denoising, as well as other classical inverse problems like compressed sensing and super-resolution. To provide a rigorous…