Archives: AI News

Localist LLMs — A Mathematical Framework for Dynamic Locality Control

arXiv:2510.09338v1 Announce Type: cross Abstract: We present a novel framework for training large language models with continuously adjustable internal representations that span the full spectrum from localist (interpretable, rule-based) to distributed (generalizable, efficient) encodings. The key innovation is a locality…

Guiding Exploration in Reinforcement Learning Through LLM-Augmented Observations

arXiv:2510.08779v1 Announce Type: new Abstract: Reinforcement Learning (RL) agents often struggle in sparse-reward environments where traditional exploration strategies fail to discover effective action sequences. Large Language Models (LLMs) possess procedural knowledge and reasoning capabilities from text pretraining that could guide…

Hybrid Models for Natural Language Reasoning: The Case of Syllogistic Logic

arXiv:2510.09472v1 Announce Type: cross Abstract: Despite the remarkable progress in neural models, their ability to generalize, a cornerstone for applications like logical reasoning, remains a critical challenge. We delineate two fundamental aspects of this ability: compositionality, the capacity to abstract…

Weights initialization of neural networks for function approximation

arXiv:2510.08780v1 Announce Type: new Abstract: Neural network-based function approximation plays a pivotal role in the advancement of scientific computing and machine learning. Yet, training such models faces several challenges: (i) each target function often requires training a new model from…

Robustness in Both Domains: CLIP Needs a Robust Text Encoder

arXiv:2506.03355v2 Announce Type: replace Abstract: Adversarial input attacks can cause a significant shift of CLIP embeddings. This can affect the downstream robustness of models incorporating CLIP in the pipeline, such as text-to-image generative models or large vision language models. While…

Fair Graph Machine Learning under Adversarial Missingness Processes

arXiv:2311.01591v4 Announce Type: replace Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art results in many relevant tasks where decisions might disproportionately impact specific communities. However, existing work on fair GNNs often assumes that either sensitive attributes are fully observed or…

Active Model Selection for Large Language Models

arXiv:2510.09418v1 Announce Type: cross Abstract: We introduce LLM SELECTOR, the first framework for active model selection of Large Language Models (LLMs). Unlike prior evaluation and benchmarking approaches that rely on fully annotated datasets, LLM SELECTOR efficiently identifies the best LLM…

SWE-Arena: An Interactive Platform for Evaluating Foundation Models in Software Engineering

arXiv:2502.01860v5 Announce Type: replace-cross Abstract: Foundation models (FMs), particularly large language models (LLMs), have shown significant promise in various software engineering (SE) tasks, including code generation, debugging, and requirement refinement. Despite these advances, existing evaluation frameworks are insufficient for assessing…