Archives AI News

Training Language Models to Reason Efficiently

arXiv:2502.04463v4 Announce Type: replace Abstract: Scaling model size and training data has led to great advances in the performance of Large Language Models (LLMs). However, the diminishing returns of this approach necessitate alternative methods to improve model capabilities, particularly in…

Predicting Microbial Interactions Using Graph Neural Networks

arXiv:2511.02038v1 Announce Type: new Abstract: Predicting interspecies interactions is a key challenge in microbial ecology, as these interactions are critical to determining the structure and activity of microbial communities. In this work, we used data on monoculture growth capabilities, interactions…

Noise-based reward-modulated learning

arXiv:2503.23972v3 Announce Type: replace Abstract: The pursuit of energy-efficient and adaptive artificial intelligence (AI) has positioned neuromorphic computing as a promising alternative to conventional computing. However, achieving learning on these platforms requires techniques that prioritize local information while enabling effective…

Quantum-Enhanced Generative Models for Rare Event Prediction

arXiv:2511.02042v1 Announce Type: new Abstract: Rare events such as financial crashes, climate extremes, and biological anomalies are notoriously difficult to model due to their scarcity and heavy-tailed distributions. Classical deep generative models often struggle to capture these rare occurrences, either…

Flashlight: PyTorch Compiler Extensions to Accelerate Attention Variants

arXiv:2511.02043v1 Announce Type: new Abstract: Attention is a fundamental building block of large language models (LLMs), so there have been many efforts to implement it efficiently. For example, FlashAttention leverages tiling and kernel fusion…

Learning to Steer: Input-dependent Steering for Multimodal LLMs

arXiv:2508.12815v2 Announce Type: replace-cross Abstract: Steering has emerged as a practical approach to enable post-hoc guidance of LLMs towards enforcing a specific behavior. However, it remains largely underexplored for multimodal LLMs (MLLMs); furthermore, existing steering techniques, such as mean steering,…