Archives AI News

The Cream Rises to the Top: Efficient Reranking Method for Verilog Code Generation

arXiv:2509.20215v1 Announce Type: cross Abstract: LLMs face significant challenges in Verilog generation due to limited domain-specific knowledge. While sampling techniques improve pass@k metrics, hardware engineers need one trustworthy solution rather than uncertain candidates. To bridge this gap, we formulate it…

GAUSS: Benchmarking Structured Mathematical Skills for Large Language Models

arXiv:2509.18122v1 Announce Type: cross Abstract: We introduce GAUSS (General Assessment of Underlying Structured Skills in Mathematics), a benchmark that evaluates LLMs’ mathematical abilities across twelve core skill dimensions, grouped into three domains: knowledge and understanding, problem solving and communication, and…

Adaptive Event-Triggered Policy Gradient for Multi-Agent Reinforcement Learning

arXiv:2509.20338v1 Announce Type: cross Abstract: Conventional multi-agent reinforcement learning (MARL) methods rely on time-triggered execution, where agents sample and communicate actions at fixed intervals. This approach is often computationally expensive and communication-intensive. To address this limitation, we propose ET-MAPG (Event-Triggered…

LLMs as verification oracles for Solidity

arXiv:2509.19153v1 Announce Type: cross Abstract: Ensuring the correctness of smart contracts is critical, as even subtle flaws can lead to severe financial losses. While bug detection tools able to spot common vulnerability patterns can serve as a first line of…

CNS-Obsidian: A Neurosurgical Vision-Language Model Built From Scientific Publications

arXiv:2502.19546v4 Announce Type: replace Abstract: General-purpose vision-language models (VLMs) demonstrate impressive capabilities, but their opaque training on uncurated internet data poses critical limitations for high-stakes decision-making, such as in neurosurgery. We present CNS-Obsidian, a neurosurgical VLM trained on peer-reviewed neurosurgical…

Wavelet Fourier Diffuser: Frequency-Aware Diffusion Model for Reinforcement Learning

arXiv:2509.19305v1 Announce Type: cross Abstract: Diffusion probability models have shown significant promise in offline reinforcement learning by directly modeling trajectory sequences. However, existing approaches primarily focus on time-domain features while overlooking frequency-domain features, leading to frequency shift and degraded performance…

The 2020 United States Decennial Census Is More Private Than You (Might) Think

arXiv:2410.09296v3 Announce Type: replace-cross Abstract: The U.S. Decennial Census serves as the foundation for many high-profile policy decision-making processes, including federal funding allocation and redistricting. In 2020, the Census Bureau adopted differential privacy to protect the confidentiality of individual responses…