Archives AI News

Per-Axis Weight Deltas for Frequent Model Updates

arXiv:2512.19720v1 Announce Type: new Abstract: Serving many task-specialized LLM variants is often limited by the large size of fine-tuned checkpoints and the resulting cold-start latency. Since fine-tuned weights differ from their base model by relatively small structured residuals, a natural…

Lossless Model Compression via Joint Low-Rank Factorization Optimization

arXiv:2412.06867v2 Announce Type: replace Abstract: Low-rank factorization is a popular model compression technique that minimizes the error $\delta$ between approximated and original weight matrices. Despite achieving performances close to the original models when $\delta$ is optimized, a performance discrepancy remains…

Multi-Scale Harmonic Encoding for Feature-Wise Graph Message Passing

arXiv:2505.15015v2 Announce Type: replace Abstract: Most Graph Neural Networks (GNNs) propagate messages by treating node embeddings as holistic feature vectors, implicitly assuming uniform relevance across feature dimensions. This limits their ability to selectively transmit informative components, especially when graph structures…