Archives AI News

KANMixer: a minimal KAN-centered mixer for long-term time series forecasting

arXiv:2508.01575v2 Announce Type: replace Abstract: Long-term time series forecasting (LTSF) underpins critical applications from energy management to weather prediction, yet achieving reliable multi-step-ahead accuracy remains challenging. Existing LTSF approaches, dominated by MLP- and Transformer-based architectures, either rely on simple linear…

Fairness-Aware Multi-Group Target Detection in Online Discussion

arXiv:2407.11933v4 Announce Type: replace Abstract: Target-group detection is the task of detecting which group(s) a piece of content is “directed at or about”. Applications include targeted marketing, content recommendation, and group-specific content assessment. Key challenges include: 1) that a single…

Super Apriel: One Checkpoint, Many Speeds

arXiv:2604.19877v1 Announce Type: new Abstract: We release Super Apriel, a 15B-parameter supernet in which every decoder layer provides four trained mixer choices — Full Attention (FA), Sliding Window Attention (SWA), Kimi Delta Attention (KDA), and Gated DeltaNet (GDN). A placement…

Graph-Theoretic Models for the Prediction of Molecular Measurements

arXiv:2604.19840v1 Announce Type: new Abstract: Graph-theoretic approaches offer simplicity, interpretability, and low computational cost for molecular property prediction. Among these, the model proposed by Mukwembi and Nyabadza, based on the external activity $D(G)$ and internal activity $\zeta(G)$ indices, achieved strong…

Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts

arXiv:2604.19835v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) has become the dominant architecture for scaling large language models: frontier models routinely decouple total parameters from per-token computation through sparse expert routing. Scaling laws show that under fixed active computation, model quality…