Archives AI News

Leveraging Wikidata for Geographically Informed Sociocultural Bias Dataset Creation: Application to Latin America

arXiv:2603.10001v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) exhibit inequalities with respect to various cultural contexts. Most prominent open-weights models are trained on Global North data and show prejudicial behavior towards other cultures. Moreover, there is a notable lack…

March 13, 2026

High-resolution weather-guided surrogate modeling for data-efficient cross-location building energy prediction

arXiv:2603.11121v1 Announce Type: new Abstract: Building design optimization often depends on physics-based simulation tools such as EnergyPlus, which, although accurate, are computationally expensive and slow. Surrogate models provide a faster alternative, yet most are location-specific, and even weather-informed variants require…

March 13, 2026

Zero-Shot Cross-City Generalization in End-to-End Autonomous Driving: Self-Supervised versus Supervised Representations

arXiv:2603.11417v1 Announce Type: cross Abstract: End-to-end autonomous driving models are typically trained on multi-city datasets using supervised ImageNet-pretrained backbones, yet their ability to generalize to unseen cities remains largely unexamined. When training and evaluation data are geographically mixed, models may…

March 13, 2026

Beyond Barren Plateaus: A Scalable Quantum Convolutional Architecture for High-Fidelity Image Classification

arXiv:2603.11131v1 Announce Type: new Abstract: While Quantum Convolutional Neural Networks (QCNNs) offer a theoretical paradigm for quantum machine learning, their practical implementation is severely bottlenecked by barren plateaus — the exponential vanishing of gradients — and poor empirical accuracy compared…

March 13, 2026

Uncovering Locally Low-dimensional Structure in Networks by Locally Optimal Spectral Embedding

arXiv:2603.11965v1 Announce Type: cross Abstract: Standard Adjacency Spectral Embedding (ASE) relies on a global low-rank assumption often incompatible with the sparse, transitive structure of real-world networks, causing local geometric features to be ‘smeared’. To address this, we introduce Local Adjacency…

March 13, 2026

Higher-Order Modular Attention: Fusing Pairwise and Triadic Interactions for Protein Sequences

arXiv:2603.11133v1 Announce Type: new Abstract: Transformer self-attention computes pairwise token interactions, yet protein sequence to phenotype relationships often involve cooperative dependencies among three or more residues that dot product attention does not capture explicitly. We introduce Higher-Order Modular Attention, HOMA,…

March 13, 2026

drGT: Attention-Guided Gene Assessment of Drug Response Utilizing a Drug-Cell-Gene Heterogeneous Network

arXiv:2405.08979v3 Announce Type: replace Abstract: A challenge in drug response prediction is result interpretation compared to established knowledge. drGT is a graph deep learning model that predicts sensitivity and aids in biomarker identification using attention coefficients (ACs). drGT leverages a…

March 13, 2026

Scaling Reasoning Efficiently via Relaxed On-Policy Distillation

arXiv:2603.11137v1 Announce Type: new Abstract: On-policy distillation is pivotal for transferring reasoning capabilities to capacity-constrained models, yet remains prone to instability and negative transfer. We show that on-policy distillation can be interpreted, both theoretically and empirically, as a form of…

March 13, 2026

Busemann Functions in the Wasserstein Space: Existence, Closed-Forms, and Applications to Slicing

arXiv:2510.04579v2 Announce Type: replace Abstract: The Busemann function has recently found much interest in a variety of geometric machine learning problems, as it naturally defines projections onto geodesic rays of Riemannian manifolds and generalizes the notion of hyperplanes. As several…

March 13, 2026

H2LooP Spark Preview: Continual Pretraining of Large Language Models for Low-Level Embedded Systems Code

arXiv:2603.11139v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate strong code generation abilities in general-purpose programming languages but remain limited in specialized domains such as low-level embedded systems programming. This domain involves hardware register manipulation, vendor-specific SDKs, real-time operating…

March 13, 2026