Archives AI News

Library Liberation: Competitive Performance Matmul Through Compiler-composed Nanokernels

arXiv:2511.13764v1 Announce Type: new Abstract: The rapidly evolving landscape of AI and machine learning workloads has widened the gap between high-level domain operations and efficient hardware utilization. Achieving near-peak performance still demands deep hardware expertise: experts either handcraft target-specific kernels (e.g.,…

Clone Deterministic 3D Worlds

arXiv:2510.26782v2 Announce Type: replace Abstract: A world model is an internal model that simulates how the world evolves. Given past observations and actions, it predicts the future physical state of both the embodied agent and its environment. Accurate world models…
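The abstract defines a world model as a function that, given past observations and actions, predicts the future state of agent and environment. A minimal sketch of that interface, using a hypothetical toy linear dynamics (`A`, `B` and the class name are illustrative assumptions, not from the paper):

```python
import numpy as np

class DeterministicWorldModel:
    """Toy sketch of a deterministic world model: maps (state, action) to
    the next state. Here the dynamics are a fixed linear transition
    s' = A @ s + B @ a; in practice the transition function is learned."""

    def __init__(self, A, B):
        self.A = np.asarray(A, dtype=float)
        self.B = np.asarray(B, dtype=float)

    def step(self, state, action):
        # One-step prediction of the next physical state.
        return self.A @ np.asarray(state, float) + self.B @ np.asarray(action, float)

    def rollout(self, state, actions):
        # Autoregressive rollout: feed each prediction back as the next input.
        states = []
        for a in actions:
            state = self.step(state, a)
            states.append(state)
        return np.stack(states)

model = DeterministicWorldModel(A=[[1.0, 0.1], [0.0, 1.0]], B=[[0.0], [0.1]])
trajectory = model.rollout([0.0, 0.0], [[1.0], [1.0], [0.0]])
print(trajectory.shape)
```

Because the model is deterministic, identical observation/action histories always yield identical predicted trajectories, which is the property "Clone Deterministic 3D Worlds" appears to exploit.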

Credal Ensemble Distillation for Uncertainty Quantification

arXiv:2511.13766v1 Announce Type: new Abstract: Deep ensembles (DE) have emerged as a powerful approach for quantifying predictive uncertainty and distinguishing its aleatoric and epistemic components, thereby enhancing model robustness and reliability. However, their high computational and memory costs during inference…
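The abstract mentions splitting predictive uncertainty into aleatoric and epistemic components. For plain deep ensembles (before any distillation), the standard entropy-based decomposition looks like this sketch; function names are illustrative:

```python
import numpy as np

def entropy(p):
    # Shannon entropy along the last axis, with a small epsilon for stability.
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + 1e-12), axis=-1)

def decompose_uncertainty(member_probs):
    """member_probs: (M, C) class probabilities from M ensemble members.
    Returns (total, aleatoric, epistemic):
      total     = entropy of the averaged prediction,
      aleatoric = mean entropy of individual members,
      epistemic = their difference (mutual information / disagreement)."""
    member_probs = np.asarray(member_probs, dtype=float)
    mean_p = member_probs.mean(axis=0)
    total = entropy(mean_p)
    aleatoric = entropy(member_probs).mean()
    epistemic = total - aleatoric
    return total, aleatoric, epistemic
```

The inference cost the abstract flags comes from evaluating all M members per input; distilling the ensemble into a single (here, credal) model aims to keep this decomposition while paying for only one forward pass.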

Dynamic Temperature Scheduler for Knowledge Distillation

arXiv:2511.13767v1 Announce Type: new Abstract: Knowledge Distillation (KD) trains a smaller student model using a large, pre-trained teacher model, with temperature as a key hyperparameter controlling the softness of output probabilities. Traditional methods use a fixed temperature throughout training, which…
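The fixed-temperature baseline the abstract contrasts against is the standard Hinton-style distillation loss, where the temperature T softens both teacher and student distributions. A minimal sketch (the T² scaling and KL form follow the classic formulation; the paper's dynamic schedule is not reproduced here):

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax: larger T produces softer probabilities.
    z = np.asarray(z, dtype=float) / T
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(teacher_logits, student_logits, T=4.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T**2 so gradients stay comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return (T ** 2) * np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)))

# Higher T flattens the distribution, exposing "dark knowledge" in the tail classes.
print(softmax([4.0, 1.0, 0.5], T=1.0))
print(softmax([4.0, 1.0, 0.5], T=8.0))
```

A dynamic scheduler, as proposed here, would vary T over training rather than holding it fixed for the whole run.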

MoM: Linear Sequence Modeling with Mixture-of-Memories

arXiv:2502.13685v4 Announce Type: replace-cross Abstract: Linear sequence modeling methods, such as linear attention, state space modeling, and linear RNNs, offer significant efficiency improvements by reducing the complexity of training and inference. However, these methods typically compress the entire input sequence…
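The efficiency claim rests on linear sequence models maintaining a fixed-size recurrent state instead of attending over the full prefix. A sketch of the single-memory causal linear-attention recurrence that such methods share (MoM's mixture of multiple memories is not reproduced; the feature map `phi` is an illustrative choice):

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1.0):
    """Causal linear attention in recurrent form. The running state S
    (d_k x d_v) and normalizer z compress the entire prefix, so each
    step costs O(d_k * d_v) regardless of sequence length."""
    T, d_k = Q.shape
    d_v = V.shape[1]
    S = np.zeros((d_k, d_v))
    z = np.zeros(d_k)
    out = np.zeros((T, d_v))
    for t in range(T):
        q, k, v = phi(Q[t]), phi(K[t]), V[t]
        S += np.outer(k, v)   # accumulate key-value memory
        z += k                # accumulate normalizer
        out[t] = (q @ S) / (q @ z + 1e-12)
    return out
```

This recurrence is exactly equivalent to softmax-free attention with kernel `phi`, but it is the single compressed state `S` that the abstract identifies as the bottleneck, motivating a mixture of such memories.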

Compiling to linear neurons

arXiv:2511.13769v1 Announce Type: new Abstract: We don’t program neural networks directly. Instead, we rely on an indirect style where learning algorithms, like gradient descent, determine a neural network’s function by learning from data. This indirect style is often a virtue;…