The Initialization Determines Whether In-Context Learning Is Gradient Descent
arXiv:2512.04268v1 Announce Type: new Abstract: In-context learning (ICL) in large language models (LLMs) is a striking phenomenon, yet its underlying mechanisms remain only partially understood. Previous work connects linear self-attention (LSA) to gradient descent (GD), but this connection has primarily been…
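The LSA-GD connection the abstract refers to can be illustrated numerically: with suitably constructed weights, an unnormalized linear self-attention readout on an in-context regression prompt coincides with the prediction after one gradient-descent step from a zero initialization. The sketch below is a minimal illustration of that known equivalence (in the style of prior constructions), not the method of this paper; all variable names and the learning rate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 32                      # input dimension, number of in-context examples

# In-context linear regression task: y_i = <w*, x_i>
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))       # context inputs x_1..x_n
y = X @ w_star                    # context targets y_1..y_n
x_q = rng.normal(size=d)          # query input

eta = 0.1                         # learning rate (arbitrary choice)

# One GD step from W0 = 0 on the in-context loss
# L(w) = 1/(2n) * sum_i (<w, x_i> - y_i)^2, giving w1 = (eta/n) * X^T y
w1 = (eta / n) * X.T @ y
pred_gd = w1 @ x_q

# Unnormalized linear self-attention with identity key/query maps:
# the query attends to context token i with score <x_i, x_q> and
# aggregates the values y_i, scaled by eta/n.
scores = X @ x_q
pred_lsa = (eta / n) * scores @ y

# The two predictions agree exactly (both equal (eta/n) * sum_i y_i <x_i, x_q>)
assert np.allclose(pred_gd, pred_lsa)
print(pred_gd, pred_lsa)
```

The equivalence here holds for the zero initialization; the paper's title suggests that the choice of initialization is exactly what the GD interpretation hinges on.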
