Archives AI News

A Minimalist Optimizer Design for LLM Pretraining

arXiv:2506.16659v2 Announce Type: replace Abstract: Training large language models (LLMs) typically relies on adaptive optimizers such as Adam, which introduce extra operations and require significantly more memory to maintain first- and second-order moments than SGD. While recent works such as…

Resolving Conflicts in Lifelong Learning via Aligning Updates in Subspaces

arXiv:2512.08960v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) enables efficient Continual Learning but often suffers from catastrophic forgetting due to destructive interference between tasks. Our analysis reveals that this degradation is primarily driven by antagonistic directional updates where new task…

SEA: Spectral Edge Attacks on Graph Neural Networks

arXiv:2512.08964v1 Announce Type: new Abstract: Graph Neural Networks (GNNs) achieve strong performance on graph-structured data, but are notoriously vulnerable to small, carefully crafted perturbations of the graph structure. Most existing structure-based attacks rely on gradient-based heuristics or local connectivity patterns,…

LUMOS: Large User MOdels for User Behavior Prediction

arXiv:2512.08957v1 Announce Type: new Abstract: User behavior prediction at scale remains a critical challenge for online B2C platforms. Traditional approaches rely heavily on task-specific models and domain-specific feature engineering. This is time-consuming, computationally expensive, and requires domain expertise and therefore…

EEG-Bench: A Benchmark for EEG Foundation Models in Clinical Applications

arXiv:2512.08959v1 Announce Type: new Abstract: We introduce a unified benchmarking framework focused on evaluating EEG-based foundation models in clinical applications. The benchmark spans 11 well-defined diagnostic tasks across 14 publicly available EEG datasets, including epilepsy, schizophrenia, Parkinson’s disease, OCD, and…

LLM4XCE: Large Language Models for Extremely Large-Scale Massive MIMO Channel Estimation

arXiv:2512.08955v1 Announce Type: new Abstract: Extremely large-scale massive multiple-input multiple-output (XL-MIMO) is a key enabler for sixth-generation (6G) networks, offering massive spatial degrees of freedom. Despite these advantages, the coexistence of near-field and far-field effects in hybrid-field channels presents significant…