Archives AI News

Boosting In-Silicon Directed Evolution with Fine-Tuned Protein Language Model and Tree Search

arXiv:2511.09900v1 Announce Type: new Abstract: Protein evolution through amino acid sequence mutations is a cornerstone of life sciences. While current in-silicon directed evolution algorithms focus on designing search strategies, they overlook how to utilize the transformative protein language models, which…

November 15, 2025

One-Shot Multi-Label Causal Discovery in High-Dimensional Event Sequences

arXiv:2509.23213v2 Announce Type: replace-cross Abstract: Understanding causality in event sequences with thousands of sparse event types is critical in domains such as healthcare, cybersecurity, or vehicle diagnostics, yet current methods fail to scale. We present OSCAR, a one-shot causal autoregressive…

November 15, 2025

CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D

arXiv:2511.09904v1 Announce Type: new Abstract: AI systems are increasingly able to autonomously conduct realistic software engineering tasks, and may soon be deployed to automate machine learning (ML) R&D itself. Frontier AI systems may be deployed in safety-critical settings, including to…

November 15, 2025

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

arXiv:2511.07250v2 Announce Type: replace-cross Abstract: The advent of Multimodal Large Language Models (MLLMs) has expanded AI capabilities to visual modalities, yet existing evaluation benchmarks remain limited to single-video understanding, overlooking the critical need for multi-video understanding in real-world scenarios (e.g.,…

November 15, 2025

Learning to Pose Problems: Reasoning-Driven and Solver-Adaptive Data Synthesis for Large Reasoning Models

arXiv:2511.09907v1 Announce Type: new Abstract: Data synthesis for training large reasoning models offers a scalable alternative to limited, human-curated datasets, enabling the creation of high-quality data. However, existing approaches face several challenges: (i) indiscriminate generation that ignores the solver’s ability…

November 15, 2025

RoboBenchMart: Benchmarking Robots in Retail Environment

arXiv:2511.10276v1 Announce Type: cross Abstract: Most existing robotic manipulation benchmarks focus on simplified tabletop scenarios, typically involving a stationary robotic arm interacting with various objects on a flat surface. To address this limitation, we introduce RoboBenchMart, a more challenging and…

November 15, 2025

Understanding Human-AI Trust in Education

arXiv:2506.09160v4 Announce Type: replace-cross Abstract: As AI chatbots become integrated in education, students are turning to these systems for guidance, feedback, and information. However, the anthropomorphic characteristics of these chatbots create ambiguity over whether students develop trust in them in…

November 15, 2025

ManipDreamer3D: Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory

arXiv:2509.05314v2 Announce Type: replace-cross Abstract: Data scarcity continues to be a major challenge in the field of robotic manipulation. Although diffusion models provide a promising solution for generating robotic manipulation videos, existing methods largely depend on 2D trajectories, which inherently…

November 15, 2025

WaterMod: Modular Token-Rank Partitioning for Probability-Balanced LLM Watermarking

arXiv:2511.07863v2 Announce Type: replace Abstract: Large language models now draft news, legal analyses, and software code with human-level fluency. At the same time, regulations such as the EU AI Act mandate that each synthetic passage carry an imperceptible, machine-verifiable mark…

November 15, 2025

CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning

arXiv:2502.02390v3 Announce Type: replace-cross Abstract: Research on LLM technologies is rapidly emerging, with most of them employing a ‘fast thinking’ approach to inference. Most LLMs generate the final result based solely on a single query and the LLM’s reasoning capabilities. However,…

November 15, 2025