Archives AI News

Agent-in-the-Loop: A Data Flywheel for Continuous Improvement in LLM-based Customer Support

arXiv:2510.06674v1 Announce Type: new Abstract: We introduce an Agent-in-the-Loop (AITL) framework that implements a continuous data flywheel for iteratively improving an LLM-based customer support system. Unlike standard offline approaches that rely on batch annotations, AITL integrates four key types of…

October 9, 2025

Inefficiencies of Meta Agents for Agent Design

arXiv:2510.06711v1 Announce Type: new Abstract: Recent works began to automate the design of agentic systems using meta-agents that propose and iteratively refine new agent architectures. In this paper, we examine three key challenges in a common class of meta-agents. First,…

October 9, 2025

MultiCNKG: Integrating Cognitive Neuroscience, Gene, and Disease Knowledge Graphs Using Large Language Models

arXiv:2510.06742v1 Announce Type: new Abstract: The advent of large language models (LLMs) has revolutionized the integration of knowledge graphs (KGs) in biomedical and cognitive sciences, overcoming limitations in traditional machine learning methods for capturing intricate semantic links among genes, diseases,…

October 9, 2025

NdLinear: Preserving Multi-Dimensional Structure for Parameter-Efficient Neural Networks

arXiv:2503.17353v3 Announce Type: replace-cross Abstract: In deep learning, processing multidimensional inputs (e.g., images, medical scans, and time series) is an important task that often requires flattening the inputs. We introduce $\mathit{NdLinear}$, a drop-in replacement for linear layers that operates directly…

October 9, 2025

Verifying Memoryless Sequential Decision-making of Large Language Models

arXiv:2510.06756v1 Announce Type: new Abstract: We introduce a tool for rigorous and automated verification of large language model (LLM)-based policies in memoryless sequential decision-making tasks. Given a Markov decision process (MDP) representing the sequential decision-making task, an LLM policy,…

October 9, 2025

Evolving and Executing Research Plans via Double-Loop Multi-Agent Collaboration

arXiv:2510.06761v1 Announce Type: new Abstract: Automating the end-to-end scientific research process poses a fundamental challenge: it requires both evolving high-level plans that are novel and sound, and executing these plans correctly amidst dynamic and uncertain conditions. To address this bilevel…

October 9, 2025

Valid Inference with Imperfect Synthetic Data

arXiv:2508.06635v2 Announce Type: replace-cross Abstract: Predictions and generations from large language models are increasingly being explored as an aid in limited data regimes, such as in computational social science and human subjects research. While prior technical work has mainly explored…

October 9, 2025

Autoformalizer with Tool Feedback

arXiv:2510.06857v1 Announce Type: new Abstract: Autoformalization addresses the scarcity of data for Automated Theorem Proving (ATP) by translating mathematical problems from natural language into formal statements. Efforts in recent work shift from directly prompting large language models to training an…

October 9, 2025

SMARTER: A Data-efficient Framework to Improve Toxicity Detection with Explanation via Self-augmenting Large Language Models

arXiv:2509.15174v2 Announce Type: replace-cross Abstract: WARNING: This paper contains examples of offensive materials. To address the proliferation of toxic content on social media, we introduce SMARTER, a data-efficient two-stage framework for explainable content moderation using Large Language…

October 9, 2025

TGPR: Tree-Guided Policy Refinement for Robust Self-Debugging of LLMs

arXiv:2510.06878v1 Announce Type: new Abstract: Iterative refinement has been a promising paradigm to enable large language models (LLMs) to resolve difficult reasoning and problem-solving tasks. One of the key challenges, however, is how to effectively search through the enormous search…

October 9, 2025