Archives AI News

Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts

arXiv:2509.21743v1 Announce Type: new Abstract: Large reasoning models improve accuracy by producing long reasoning traces, but this inflates latency and cost, motivating inference-time efficiency. We propose Retrieval-of-Thought (RoT), which reuses prior reasoning as composable “thought” steps to guide new problems.…
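The abstract does not specify how RoT retrieves reusable "thought" steps; as an illustrative sketch only, one could imagine a nearest-neighbor lookup over a store of prior reasoning traces, ranked by similarity to the new problem. Everything below (the bag-of-words stand-in for an embedding, the `thought_store` contents, the function names) is hypothetical and not from the paper:

```python
# Illustrative sketch (NOT the paper's implementation): guide a new problem
# by retrieving the most similar previously stored reasoning steps.
from collections import Counter
import math

def bow(text):
    """Bag-of-words vector as a Counter (a crude stand-in for an embedding)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse Counter vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical store mapping a prior problem sketch to its reasoning steps.
thought_store = {
    "solve linear equation for x": ["isolate x", "divide by coefficient"],
    "compute triangle area": ["recall area = base * height / 2", "substitute"],
}

def retrieve_thoughts(query, store, k=1):
    """Return the k most similar prior thought sequences for reuse."""
    q = bow(query)
    ranked = sorted(store.items(), key=lambda kv: cosine(q, bow(kv[0])), reverse=True)
    return [steps for _, steps in ranked[:k]]

guidance = retrieve_thoughts("solve a linear equation", thought_store)
print(guidance[0])  # → ['isolate x', 'divide by coefficient']
```

A real system would replace the bag-of-words vectors with learned embeddings and compose the retrieved steps into the model's prompt; the sketch only shows the retrieval shape.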

A critical review of methods and challenges in large language models

arXiv:2404.11973v2 Announce Type: replace Abstract: This critical review provides an in-depth analysis of Large Language Models (LLMs), encompassing their foundational principles, diverse applications, and advanced training methodologies. We critically examine the evolution from Recurrent Neural Networks (RNNs) to Transformer models,…

Lifelong Learning with Behavior Consolidation for Vehicle Routing

arXiv:2509.21765v1 Announce Type: new Abstract: Recent neural solvers have demonstrated promising performance in learning to solve routing problems. However, existing studies are primarily based on one-off training on one or a set of predefined problem distributions and scales, i.e., tasks.…

Diverse Subset Selection via Norm-Based Sampling and Orthogonality

arXiv:2406.01086v2 Announce Type: replace-cross Abstract: Large annotated datasets are crucial for the success of deep neural networks, but labeling data can be prohibitively expensive in domains such as medical imaging. This work tackles the subset selection problem: selecting a small…
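The truncated abstract names norm-based sampling and orthogonality but not the exact procedure; as a hypothetical illustration of how those two ingredients can combine, the sketch below greedily picks the point with the largest residual norm and then projects that direction out of all remaining points, so later picks are near-orthogonal to earlier ones. The function names and toy data are assumptions, not the paper's method:

```python
# Illustrative sketch: greedy subset selection that favors large norms
# while enforcing diversity via orthogonalization (hypothetical variant).
import math

def dot(u, v): return sum(a * b for a, b in zip(u, v))
def norm(u): return math.sqrt(dot(u, u))
def sub(u, v): return [a - b for a, b in zip(u, v)]
def scale(u, s): return [a * s for a in u]

def select_diverse(features, k):
    """Pick k indices: largest residual norm first, then remove the
    chosen direction from every residual (Gram-Schmidt-style)."""
    residual = [list(f) for f in features]
    chosen = []
    for _ in range(k):
        i = max(range(len(residual)), key=lambda j: norm(residual[j]))
        chosen.append(i)
        nd = norm(residual[i])
        if nd == 0:
            break
        d = scale(residual[i], 1.0 / nd)  # unit direction of this pick
        residual = [sub(r, scale(d, dot(r, d))) for r in residual]
    return chosen

pts = [[3.0, 0.0], [2.9, 0.1], [0.0, 2.0], [1.0, 1.0]]
print(select_diverse(pts, 2))  # → [0, 2]: big-norm point, then an orthogonal one
```

Note how index 1, despite its large norm, is skipped: after projecting out the first pick's direction its residual nearly vanishes, which is the diversity effect the title alludes to.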

UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios

arXiv:2509.21766v1 Announce Type: new Abstract: Autonomous agents have recently achieved remarkable progress across diverse domains, yet most evaluations focus on short-horizon, fully observable tasks. In contrast, many critical real-world tasks, such as large-scale software development, commercial investment, and scientific discovery,…

Can Diffusion Models Disentangle? A Theoretical Perspective

arXiv:2504.00220v2 Announce Type: replace-cross Abstract: This paper presents a novel theoretical framework for understanding how diffusion models can learn disentangled representations. Within this framework, we establish identifiability conditions for general disentangled latent variable models, analyze training dynamics, and derive sample…

Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and Safety

arXiv:2509.21782v1 Announce Type: new Abstract: Multimodal large language models (MLLMs) are increasingly positioned as AI collaborators for building complex web-related applications like GUI agents and front-end code generation. However, existing benchmarks largely emphasize visual perception or UI code generation, showing…

From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning

arXiv:2505.17117v5 Announce Type: replace-cross Abstract: Humans organize knowledge into compact categories that balance compression with semantic meaning preservation. Large Language Models (LLMs) demonstrate striking linguistic abilities, yet whether they achieve this same balance remains unclear. We apply the Information Bottleneck…
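The compression side of an Information Bottleneck analysis can be made concrete with a toy example. For a deterministic clustering of n equiprobable items, the information the category label carries about the item, I(X; C), reduces to the entropy of the cluster-size distribution; coarser categories transmit fewer bits. The animal/plant data below is invented for illustration and has nothing to do with the paper's experiments:

```python
# Toy sketch of the compression term in an Information Bottleneck tradeoff
# (illustrative only; hypothetical data, not the paper's setup).
import math
from collections import Counter

def mutual_information(assignments):
    """I(X; C) for a deterministic clustering of equiprobable items,
    which equals H(C), the entropy of the cluster sizes."""
    n = len(assignments)
    sizes = Counter(assignments).values()
    return -sum((s / n) * math.log2(s / n) for s in sizes)

items  = ["robin", "sparrow", "trout", "salmon", "oak", "pine"]
coarse = ["animal", "animal", "animal", "animal", "plant", "plant"]
fine   = ["bird", "bird", "fish", "fish", "tree", "tree"]

# Finer categories transmit more bits about the item (less compression);
# the IB framing asks what semantic fidelity that extra cost buys.
print(round(mutual_information(coarse), 3))  # → 0.918
print(round(mutual_information(fine), 3))    # → 1.585
```

The full IB objective also needs a relevance term (how much the category preserves about meaning), which requires a semantic target variable; this sketch shows only the compression half.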

D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents

arXiv:2509.21799v1 Announce Type: new Abstract: Graphical User Interface (GUI) agents aim to automate a wide spectrum of human tasks by emulating user interaction. Despite rapid advancements, current approaches are hindered by several critical challenges: data bottleneck in end-to-end training, high…