Archives AI News

Learning Degenerate Manifolds of Frustrated Magnets with Boltzmann Machines

arXiv:2511.19879v1 Announce Type: cross Abstract: We show that Restricted Boltzmann Machines (RBMs) provide a flexible generative framework for modeling spin configurations in disordered yet strongly correlated phases of frustrated magnets. As a benchmark, we first demonstrate that an RBM can…

CostNav: A Navigation Benchmark for Cost-Aware Evaluation of Embodied Agents

arXiv:2511.20216v1 Announce Type: cross Abstract: Existing navigation benchmarks focus on task success metrics while overlooking economic viability, which is critical for commercial deployment of autonomous delivery robots. We introduce CostNav, a Micro-Navigation Economic Testbed that evaluates embodied agents through comprehensive cost-revenue…

Automating Deception: Scalable Multi-Turn LLM Jailbreaks

arXiv:2511.19517v1 Announce Type: new Abstract: Multi-turn conversational attacks, which leverage psychological principles like Foot-in-the-Door (FITD), where a small initial request paves the way for a more significant one, to bypass safety alignments, pose a persistent threat to Large Language Models…

Learning to Solve Weighted Maximum Satisfiability with a Co-Training Architecture

arXiv:2511.19544v1 Announce Type: new Abstract: We propose SplitGNN, a graph neural network (GNN)-based approach that learns to solve the weighted maximum satisfiability (MaxSAT) problem. SplitGNN incorporates a co-training architecture consisting of a supervised message passing mechanism and an unsupervised solution boosting layer.…

On Evaluating LLM Alignment by Evaluating LLMs as Judges

arXiv:2511.20604v1 Announce Type: cross Abstract: Alignment with human preferences is an important evaluation aspect of LLMs, requiring them to be helpful, honest, safe, and to precisely follow human instructions. Evaluating large language models’ (LLMs) alignment typically involves directly assessing their…