Archives AI News

LLMs as Agentic Cooperative Players in Multiplayer UNO

arXiv:2509.09867v1. Announce Type: new. Abstract: LLMs promise to assist humans, not just by answering questions, but by offering useful guidance across a wide range of tasks. But how far does that assistance go? Can a large language model based…

Towards a Common Framework for Autoformalization

arXiv:2509.09810v1. Announce Type: new. Abstract: Autoformalization has emerged as a term referring to the automation of formalization – specifically, the formalization of mathematics using interactive theorem provers (proof assistants). Its rapid development has been driven by progress in deep learning,…

How well can LLMs provide planning feedback in grounded environments?

arXiv:2509.09790v1. Announce Type: new. Abstract: Learning to plan in grounded environments typically requires carefully designed reward functions or high-quality annotated demonstrations. Recent works show that pretrained foundation models, such as large language models (LLMs) and vision language models (VLMs), capture…

Can LLM Prompting Serve as a Proxy for Static Analysis in Vulnerability Detection

arXiv:2412.12039v3. Announce Type: replace-cross. Abstract: Despite their remarkable success, large language models (LLMs) have shown limited ability on safety-critical code tasks such as vulnerability detection. Typically, static analysis (SA) tools, like CodeQL, CodeGuru Security, etc., are used for vulnerability detection.…