Archives AI News

Confidence Improves Self-Consistency in LLMs

arXiv:2502.06233v2 Announce Type: replace-cross Abstract: Self-consistency decoding enhances LLMs’ performance on reasoning tasks by sampling diverse reasoning paths and selecting the most frequent answer. However, it is computationally expensive, as sampling many of these (lengthy) paths is required to increase…

September 30, 2025
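
The voting step described in the abstract above (sample several reasoning paths, keep the most frequent final answer) can be sketched in a few lines. This is a minimal illustration of plain majority-vote self-consistency only, not the confidence-based method the paper proposes. The function name `self_consistent_answer` and the sample answers are hypothetical, and the sampling of reasoning paths from a model is assumed to have happened elsewhere.

```python
from collections import Counter


def self_consistent_answer(answers):
    """Return the most frequent final answer among sampled reasoning paths.

    `answers` is assumed to hold one extracted final answer per sampled path.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    # most_common(1) returns [(answer, count)]; ties go to the answer seen first.
    return Counter(answers).most_common(1)[0][0]


# Hypothetical final answers extracted from five sampled reasoning paths.
sampled = ["42", "41", "42", "42", "17"]
print(self_consistent_answer(sampled))
```

The cost the abstract mentions shows up in the length of `answers`: each element corresponds to one full sampled reasoning path.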

Multiplayer Nash Preference Optimization

arXiv:2509.23102v1 Announce Type: new Abstract: Reinforcement learning from human feedback (RLHF) has emerged as the standard paradigm for aligning large language models (LLMs) with human preferences. However, reward-based methods built on the Bradley-Terry assumption struggle to capture the non-transitive and…

September 30, 2025

InfoDet: A Dataset for Infographic Element Detection

arXiv:2505.17473v4 Announce Type: replace-cross Abstract: Given the central role of charts in scientific, business, and communication contexts, enhancing the chart understanding capabilities of vision-language models (VLMs) has become increasingly critical. A key limitation of existing VLMs lies in their inaccurate…

September 30, 2025

Artificial Phantasia: Evidence for Propositional Reasoning-Based Mental Imagery in Large Language Models

arXiv:2509.23108v1 Announce Type: new Abstract: This study offers a novel approach for benchmarking complex cognitive behavior in artificial systems. Almost universally, Large Language Models (LLMs) perform best on tasks which may be included in their training data and can be…

September 30, 2025

Mitigating Watermark Forgery in Generative Models via Randomized Key Selection

arXiv:2507.07871v3 Announce Type: replace-cross Abstract: Watermarking enables GenAI providers to verify whether content was generated by their models. A watermark is a hidden signal in the content, whose presence can be detected using a secret watermark key. A core security…

September 30, 2025

AttAnchor: Guiding Cross-Modal Token Alignment in VLMs with Attention Anchors

arXiv:2509.23109v1 Announce Type: new Abstract: A fundamental reason for the dominance of attention over RNNs and LSTMs in LLMs is its ability to capture long-range dependencies by modeling direct interactions between all tokens, overcoming the sequential limitations of recurrent architectures.…

September 30, 2025

StefaLand: An Efficient Geoscience Foundation Model That Improves Dynamic Land-Surface Predictions

arXiv:2509.17942v2 Announce Type: replace-cross Abstract: Stewarding natural resources, mitigating floods, droughts, wildfires, and landslides, and meeting growing demands require models that can predict climate-driven land-surface responses and human feedback with high accuracy. Traditional impact models, whether process-based, statistical, or machine…

September 30, 2025

Exploring LLM-based Frameworks for Fault Diagnosis

arXiv:2509.23113v1 Announce Type: new Abstract: Large Language Model (LLM)-based systems present new opportunities for autonomous health monitoring in sensor-rich industrial environments. This study explores the potential of LLMs to detect and classify faults directly from sensor data, while producing inherently…

September 30, 2025

Mitigating Visual Hallucinations via Semantic Curriculum Preference Optimization in MLLMs

arXiv:2509.24491v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have significantly improved the performance of various tasks, but continue to suffer from visual hallucinations, a critical issue where generated responses contradict visual evidence. While Direct Preference Optimization (DPO) is widely…

September 30, 2025

Transferring Vision-Language-Action Models to Industry Applications: Architectures, Performance, and Challenges

arXiv:2509.23121v1 Announce Type: new Abstract: The application of artificial intelligence (AI) in industry is accelerating the shift from traditional automation to intelligent systems with perception and cognition. Vision-language-action (VLA) models have been a key paradigm in AI to unify…

September 30, 2025