Archives AI News

Local LLM Ensembles for Zero-shot Portuguese Named Entity Recognition

arXiv:2512.10043v1 Announce Type: new Abstract: Large Language Models (LLMs) excel in many Natural Language Processing (NLP) tasks through in-context learning but often underperform in Named Entity Recognition (NER), especially for lower-resource languages like Portuguese. While open-weight LLMs enable local deployment,…

Detailed balance in large language model-driven agents

arXiv:2512.10047v1 Announce Type: new Abstract: Large language model (LLM)-driven agents are emerging as a powerful new paradigm for solving complex problems. Despite the empirical success of these practices, a theoretical framework to understand and unify their macroscopic dynamics remains lacking.…

Robust Gradient Descent via Heavy-Ball Momentum with Predictive Extrapolation

arXiv:2512.10033v1 Announce Type: new Abstract: Accelerated gradient methods like Nesterov’s Accelerated Gradient (NAG) achieve faster convergence on well-conditioned problems but often diverge on ill-conditioned or non-convex landscapes due to aggressive momentum accumulation. We propose Heavy-Ball Synthetic Gradient Extrapolation (HB-SGE), a…

Cluster-DAGs as Powerful Background Knowledge For Causal Discovery

arXiv:2512.10032v1 Announce Type: new Abstract: Finding cause-effect relationships is of key importance in science. Causal discovery aims to recover a graph from data that succinctly describes these cause-effect relationships. However, current methods face several challenges, especially when dealing with high-dimensional…

Latent Action World Models for Control with Unlabeled Trajectories

arXiv:2512.10016v1 Announce Type: new Abstract: Inspired by how humans combine direct interaction with action-free experience (e.g., videos), we study world models that learn from heterogeneous data. Standard world models typically rely on action-conditioned trajectories, which limits effectiveness when action labels…

What matters for Representation Alignment: Global Information or Spatial Structure?

arXiv:2512.10794v1 Announce Type: cross Abstract: Representation alignment (REPA) guides generative training by distilling representations from a strong, pretrained vision encoder to intermediate diffusion features. We investigate a fundamental question: what aspect of the target representation matters for generation, its global…

DB2-TransF: All You Need Is Learnable Daubechies Wavelets for Time Series Forecasting

arXiv:2512.10051v1 Announce Type: new Abstract: Time series forecasting requires models that can efficiently capture complex temporal dependencies, especially in large-scale and high-dimensional settings. While Transformer-based architectures excel at modeling long-range dependencies, their quadratic computational complexity poses limitations on scalability and…