Archives AI News

TawPipe: Topology-Aware Weight Pipeline Parallelism for Accelerating Long-Context Large Models Training

arXiv:2511.09741v1 Announce Type: new Abstract: Training large language models (LLMs) is fundamentally constrained by limited device memory and costly inter-device communication. Although pipeline parallelism alleviates memory pressure by partitioning models across devices, it incurs activation communication overhead that scales linearly…

November 14, 2025

Caption, Create, Continue: Continual Learning with Pre-trained Generative Vision-Language Models

arXiv:2409.17806v2 Announce Type: replace Abstract: Continual learning (CL) enables models to adapt to evolving data streams without catastrophic forgetting, a fundamental requirement for real-world AI systems. However, the current methods often depend on large replay buffers or heavily annotated datasets…

November 14, 2025

History Rhymes: Macro-Contextual Retrieval for Robust Financial Forecasting

arXiv:2511.09754v1 Announce Type: new Abstract: Financial markets are inherently non-stationary: structural breaks and macroeconomic regime shifts often cause forecasting models to fail when deployed out of distribution (OOD). Conventional multimodal approaches that simply fuse numerical indicators and textual sentiment rarely…

November 14, 2025

ELECTRA: A Cartesian Network for 3D Charge Density Prediction with Floating Orbitals

arXiv:2503.08305v3 Announce Type: replace Abstract: We present the Electronic Tensor Reconstruction Algorithm (ELECTRA) – an equivariant model for predicting electronic charge densities using floating orbitals. Floating orbitals are a long-standing concept in the quantum chemistry community that promises more compact…

November 14, 2025

Is nasty noise actually harder than malicious noise?

arXiv:2511.09763v1 Announce Type: new Abstract: We consider the relative abilities and limitations of computationally efficient algorithms for learning in the presence of noise, under two well-studied and challenging adversarial noise models for learning Boolean functions: malicious noise, in which an…

November 14, 2025

FlashKAT: Understanding and Addressing Performance Bottlenecks in the Kolmogorov-Arnold Transformer

arXiv:2505.13813v2 Announce Type: replace Abstract: The Kolmogorov-Arnold Network (KAN) has been gaining popularity as an alternative to the multi-layer perceptron (MLP) with its increased expressiveness and interpretability. Even so, the KAN suffers from being orders of magnitude slower due to…

November 14, 2025

NeuroLingua: A Language-Inspired Hierarchical Framework for Multimodal Sleep Stage Classification Using EEG and EOG

arXiv:2511.09773v1 Announce Type: new Abstract: Automated sleep stage classification from polysomnography remains limited by the lack of expressive temporal hierarchies, challenges in multimodal EEG and EOG fusion, and the limited interpretability of deep learning models. We propose NeuroLingua, a language-inspired…

November 14, 2025

HyperEvent: A Strong Baseline for Dynamic Link Prediction via Relative Structural Encoding

arXiv:2507.11836v3 Announce Type: replace Abstract: Learning representations for continuous-time dynamic graphs is critical for dynamic link prediction. While recent methods have become increasingly complex, the field lacks a strong and informative baseline to reliably gauge progress. This paper proposes HyperEvent,…

November 14, 2025

Hail to the Thief: Exploring Attacks and Defenses in Decentralised GRPO

arXiv:2511.09780v1 Announce Type: new Abstract: Group Relative Policy Optimization (GRPO) has demonstrated great utilization in post-training of Large Language Models (LLMs). In GRPO, prompts are answered by the model and, through reinforcement learning, preferred completions are learnt. Owing to the…

November 14, 2025

Inference Offloading for Cost-Sensitive Binary Classification at the Edge

arXiv:2509.15674v2 Announce Type: replace Abstract: We focus on a binary classification problem in an edge intelligence system where false negatives are more costly than false positives. The system has a compact, locally deployed model, which is supplemented by a larger,…

November 14, 2025