Archives AI News

Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis

arXiv:2502.04416v3 Announce Type: replace Abstract: Scaling large language models (LLMs) improves performance but significantly increases inference costs, with feed-forward networks (FFNs) consuming the majority of computational resources. While Mixture-of-Experts (MoE) architectures can reduce this cost through sparse activation, restructuring existing…

Differentially Private Model Merging

arXiv:2604.20985v1 Announce Type: new Abstract: In machine learning applications, privacy requirements during inference or deployment time could change constantly due to varying policies, regulations, or user experience. In this work, we aim to generate a multitude of models to satisfy…

HyperAdapt: Simple High-Rank Adaptation

arXiv:2509.18629v3 Announce Type: replace Abstract: Foundation models excel across diverse tasks, but adapting them to specialized applications often requires fine-tuning, an approach that is memory and compute-intensive. Parameter-efficient fine-tuning (PEFT) methods mitigate this by updating only a small subset of…

Beyond Accuracy: A Stability-Aware Metric for Multi-Horizon Forecasting

arXiv:2601.10863v3 Announce Type: replace Abstract: Traditional time series forecasting methods optimize for accuracy alone. This objective neglects temporal consistency, that is, how consistently a model predicts the same future event as the forecast origin changes. We introduce the forecast…

Adaptive Soft Error Protection for Neural Network Processing

arXiv:2407.19664v3 Announce Type: replace Abstract: Previous research on selective protection for neural network components typically exploits only static vulnerability differences. Although these methods improve upon classical modular redundancy, they still incur substantial overhead for neural network workloads that are both…