
Reasoning About Intent for Ambiguous Requests

arXiv:2511.10453v1 Announce Type: cross Abstract: Large language models often respond to ambiguous requests by implicitly committing to one interpretation. Intent misunderstandings can frustrate users and create safety risks. To address this, we propose generating multiple interpretation-answer pairs in a single…

Preview, Accept or Discard? A Predictive Low-Motion Interaction Paradigm

arXiv:2511.10532v1 Announce Type: cross Abstract: Repetitive strain injury (RSI) affects roughly one in five computer users and remains largely unresolved despite decades of ergonomic mouse redesign. All such devices share a fundamental limitation: they still require fine-motor motion to operate.…

ChEmREF: Evaluating Language Model Readiness for Chemical Emergency Response

arXiv:2511.10027v1 Announce Type: new Abstract: Emergency responders managing hazardous material (HAZMAT) incidents face critical, time-sensitive decisions, manually navigating extensive chemical guidelines. We investigate whether today’s language models can assist responders by rapidly and reliably understanding critical information, identifying hazards, and…

Towards an Agentic Workflow for Internet Measurement Research

arXiv:2511.10611v1 Announce Type: cross Abstract: Internet measurement research faces an accessibility crisis: complex analyses require custom integration of multiple specialized tools that demands specialized domain expertise. When network disruptions occur, operators need rapid diagnostic workflows spanning infrastructure mapping, routing analysis,…

Beyond ReAct: A Planner-Centric Framework for Complex Tool-Augmented LLM Reasoning

arXiv:2511.10037v1 Announce Type: new Abstract: Existing tool-augmented large language models (LLMs) encounter significant challenges when processing complex queries. Current frameworks such as ReAct are prone to local optimization traps due to their reliance on incremental decision-making processes. To address these…

Unlocking Efficient Vehicle Dynamics Modeling via Analytic World Models

arXiv:2502.10012v2 Announce Type: replace Abstract: Differentiable simulators represent an environment’s dynamics as a differentiable function. Within robotics and autonomous driving, this property is used in Analytic Policy Gradients (APG), which relies on backpropagating through the dynamics to train accurate policies…

Efficient Thought Space Exploration through Strategic Intervention

arXiv:2511.10038v1 Announce Type: new Abstract: While large language models (LLMs) demonstrate emerging reasoning capabilities, current inference-time expansion methods incur prohibitive computational costs by exhaustive sampling. Through analyzing decoding trajectories, we observe that most next-token predictions align well with the golden…

LiveResearchBench: A Live Benchmark for User-Centric Deep Research in the Wild

arXiv:2510.14240v3 Announce Type: replace Abstract: Deep research — producing comprehensive, citation-grounded reports by searching and synthesizing information from hundreds of live web sources — marks an important frontier for agentic systems. To rigorously evaluate this ability, four principles are essential:…