Archives AI News

Super Apriel: One Checkpoint, Many Speeds

arXiv:2604.19877v1 Announce Type: new Abstract: We release Super Apriel, a 15B-parameter supernet in which every decoder layer provides four trained mixer choices — Full Attention (FA), Sliding Window Attention (SWA), Kimi Delta Attention (KDA), and Gated DeltaNet (GDN). A placement…

Graph-Theoretic Models for the Prediction of Molecular Measurements

arXiv:2604.19840v1 Announce Type: new Abstract: Graph-theoretic approaches offer simplicity, interpretability, and low computational cost for molecular property prediction. Among these, the model proposed by Mukwembi and Nyabadza, based on the external activity $D(G)$ and internal activity $\zeta(G)$ indices, achieved strong…

Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts

arXiv:2604.19835v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) has become the dominant architecture for scaling large language models: frontier models routinely decouple total parameters from per-token computation through sparse expert routing. Scaling laws show that under fixed active computation, model quality…

Epistemology gives a Future to Complementarity in Human-AI Interactions

arXiv:2601.09871v2 Announce Type: replace-cross Abstract: Human-AI complementarity is the claim that a human supported by an AI system can outperform either alone in a decision-making process. Since its introduction in the human-AI interaction literature, it has gained traction by generalizing…