Archives AI News

AniME: Adaptive Multi-Agent Planning for Long Animation Generation

arXiv:2508.18781v2. Announce Type: replace. Abstract: We present AniME, a director-oriented multi-agent system for automated long-form anime production, covering the full workflow from a story to the final video. The director agent keeps…

Skill-based Explanations for Serendipitous Course Recommendation

arXiv:2508.19569v1. Announce Type: new. Abstract: Academic choice is crucial in U.S. undergraduate education, allowing students significant freedom in course selection. However, navigating the complex academic environment is challenging due to limited information, guidance, and…

Caught in the Act: a mechanistic approach to detecting deception

arXiv:2508.19505v1. Announce Type: new. Abstract: Sophisticated instrumentation for AI systems might have indicators that signal misalignment from human values, not unlike a “check engine” light in cars. One such indicator…

Democracy-in-Silico: Institutional Design as Alignment in AI-Governed Polities

arXiv:2508.19562v1. Announce Type: new. Abstract: This paper introduces Democracy-in-Silico, an agent-based simulation where societies of advanced AI agents, imbued with complex psychological personas, govern themselves under different institutional frameworks. We explore what…

SLIM: Subtrajectory-Level Elimination for More Effective Reasoning

arXiv:2508.19502v1. Announce Type: new. Abstract: In recent months, substantial progress has been made in complex reasoning of Large Language Models, particularly through the application of test-time scaling. Notable examples include o1/o3/o4 series and…

Reliable Weak-to-Strong Monitoring of LLM Agents

arXiv:2508.19461v1. Announce Type: new. Abstract: We stress test monitoring systems for detecting covert misbehavior in autonomous LLM agents (e.g., secretly sharing private information). To this end, we systematize a monitor red teaming (MRT) workflow…

Quantized but Deceptive? A Multi-Dimensional Truthfulness Evaluation of Quantized LLMs

arXiv:2508.19432v1. Announce Type: new. Abstract: Quantization enables efficient deployment of large language models (LLMs) in resource-constrained environments by significantly reducing memory and computation costs. While quantized LLMs often maintain performance…

AI-Powered Detection of Inappropriate Language in Medical School Curricula

arXiv:2508.19883v1. Announce Type: cross. Abstract: The use of inappropriate language (such as outdated, exclusionary, or non-patient-centered terms) in medical instructional materials can significantly influence clinical training, patient interactions, and health…