Archives AI News

Neural Diversity Regularizes Hallucinations in Language Models

arXiv:2510.20690v2 Announce Type: replace-cross Abstract: Language models continue to hallucinate despite increases in parameters, compute, and data. We propose neural diversity — decorrelated parallel representations — as a principled mechanism that reduces hallucination rates at fixed parameter and data budgets.…

A Minimalist Optimizer Design for LLM Pretraining

arXiv:2506.16659v2 Announce Type: replace Abstract: Training large language models (LLMs) typically relies on adaptive optimizers such as Adam, which introduce extra operations and require significant more memory to maintain first- and second-order moments than SGD. While recent works such as…

Resolving Conflicts in Lifelong Learning via Aligning Updates in Subspaces

arXiv:2512.08960v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) enables efficient Continual Learning but often suffers from catastrophic forgetting due to destructive interference between tasks. Our analysis reveals that this degradation is primarily driven by antagonistic directional updates where new task…

SEA: Spectral Edge Attacks on Graph Neural Networks

arXiv:2512.08964v1 Announce Type: new Abstract: Graph Neural Networks (GNNs) achieve strong performance on graph-structured data, but are notoriously vulnerable to small, carefully crafted perturbations of the graph structure. Most existing structure-based attacks rely on gradient-based heuristics or local connectivity patterns,…

LUMOS: Large User MOdels for User Behavior Prediction

arXiv:2512.08957v1 Announce Type: new Abstract: User behavior prediction at scale remains a critical challenge for online B2C platforms. Traditional approaches rely heavily on task-specific models and domain-specific feature engineering. This is time-consuming, computationally expensive, and requires domain expertise and therefore…