Audio2Tool: Speak, Call, Act — A Dataset for Benchmarking Speech Tool Use
arXiv:2604.22821v2
Abstract: Voice assistants increasingly rely on Speech Language Models (SpeechLMs) to interpret spoken queries and execute complex tasks, yet existing benchmarks lack the domain breadth, acoustic diversity, and compositional reasoning complexity needed to evaluate tool-calling performance. We introduce…
