Archives AI News

Second-Order, First-Class: A Composable Stack for Curvature-Aware Training

arXiv:2603.25976v1 Announce Type: new Abstract: Second-order methods promise improved stability and faster convergence, yet they remain underused due to implementation overhead, tuning brittleness, and the lack of composable APIs. We introduce Somax, a composable Optax-native stack that treats curvature-aware training…

March 30, 2026

On the Objective and Feature Weights of Minkowski Weighted k-Means

arXiv:2603.25958v1 Announce Type: new Abstract: The Minkowski weighted k-means (mwk-means) algorithm extends classical k-means by incorporating feature weights and a Minkowski distance. Despite its empirical success, its theoretical properties remain insufficiently understood. We show that the mwk-means objective can be…

March 30, 2026

Decoding Defensive Coverage Responsibilities in American Football Using Factorized Attention Based Transformer Models

arXiv:2603.25901v1 Announce Type: new Abstract: Defensive coverage schemes in the National Football League (NFL) represent complex tactical patterns requiring coordinated assignments among defenders who must react dynamically to the offense’s passing concept. This paper presents a factorized attention-based transformer model…

March 30, 2026

MAGNET: Autonomous Expert Model Generation via Decentralized Autoresearch and BitNet Training

arXiv:2603.25813v1 Announce Type: new Abstract: We present MAGNET (Model Autonomously Growing Network), a decentralized system for autonomous generation, training, and serving of domain-expert language models across commodity hardware. MAGNET integrates four components: (1) autoresearch, an autonomous ML research pipeline that…

March 30, 2026

DRiffusion: Draft-and-Refine Process Parallelizes Diffusion Models with Ease

arXiv:2603.25872v1 Announce Type: new Abstract: Diffusion models have achieved remarkable success in generating high-fidelity content but suffer from slow, iterative sampling, resulting in high latency that limits their use in interactive applications. We introduce DRiffusion, a parallel sampling framework that…

March 30, 2026

Cascading Bandits With Feedback

arXiv:2511.10938v2 Announce Type: replace Abstract: Motivated by the challenges of edge inference, we study a variant of the cascade bandit model in which each arm corresponds to an inference model with an associated accuracy and error probability. We analyse four…

March 30, 2026

Nonmyopic Global Optimisation via Approximate Dynamic Programming

arXiv:2412.04882v2 Announce Type: replace Abstract: Global optimisation aims to optimise expensive-to-evaluate black-box functions without gradient information. Bayesian optimisation, one of the most well-known techniques, typically employs Gaussian processes as surrogate models, leveraging their probabilistic nature to balance exploration and exploitation. However,…

March 30, 2026

NeST-BO: Fast Local Bayesian Optimization via Newton-Step Targeting of Gradient and Hessian Information

arXiv:2510.05516v2 Announce Type: replace Abstract: Bayesian optimization (BO) is effective for expensive black-box problems but remains challenging in high dimensions. We propose NeST-BO, a curvature-aware local BO method that targets a (modified) Newton step by jointly learning gradient and Hessian…

March 30, 2026

Task Tokens: A Flexible Approach to Adapting Behavior Foundation Models

arXiv:2503.22886v2 Announce Type: replace Abstract: Recent advancements in imitation learning have led to transformer-based behavior foundation models (BFMs) that enable multi-modal, human-like control for humanoid agents. While excelling at zero-shot generation of robust behaviors, BFMs often require meticulous prompt engineering…

March 30, 2026