Archives AI News

Financial Instruction Following Evaluation (FIFE)

arXiv:2512.08965v1 Announce Type: new Abstract: Language Models (LMs) struggle with complex, interdependent instructions, particularly in high-stakes domains like finance where precision is critical. We introduce FIFE, a novel, high-difficulty benchmark designed to assess LM instruction-following capabilities for financial analysis tasks.…

December 11, 2025

Global Convergence for Average Reward Constrained MDPs with Primal-Dual Actor Critic Algorithm

arXiv:2505.15138v2 Announce Type: replace Abstract: This paper investigates infinite-horizon average reward Constrained Markov Decision Processes (CMDPs) with general parametrization. We propose a Primal-Dual Natural Actor-Critic algorithm that adeptly manages constraints while ensuring a high convergence rate. In particular, our algorithm…

December 11, 2025

CluCERT: Certifying LLM Robustness via Clustering-Guided Denoising Smoothing

arXiv:2512.08967v1 Announce Type: new Abstract: Recent advancements in Large Language Models (LLMs) have led to their widespread adoption in daily applications. Despite their impressive capabilities, they remain vulnerable to adversarial attacks, as even minor meaning-preserving changes such as synonym substitutions…

December 11, 2025

The Impossibility of Inverse Permutation Learning in Transformer Models

arXiv:2509.24125v3 Announce Type: replace Abstract: In this technical note, we study the problem of inverse permutation learning in decoder-only transformers. Given a permutation and a string to which that permutation has been applied, the model is tasked with producing the…

December 11, 2025

StructuredDNA: A Bio-Physical Framework for Energy-Aware Transformer Routing

arXiv:2512.08968v1 Announce Type: new Abstract: The rapid scaling of large computational models has led to a critical increase in energy and compute costs. Inspired by biological systems where structure and function emerge from low-energy configurations, we introduce StructuredDNA, a sparse…

December 11, 2025

Proportional integral derivative booster for neural networks-based time-series prediction: Case of water demand prediction

arXiv:2512.06357v2 Announce Type: replace Abstract: Multi-step time-series prediction is an essential supportive step for decision-makers in several industrial areas. Artificial intelligence techniques, which use a neural network component in various forms, have recently been used frequently to accomplish this step.…

December 11, 2025

Learning Robust Representations for Malicious Content Detection via Contrastive Sampling and Uncertainty Estimation

arXiv:2512.08969v1 Announce Type: new Abstract: We propose the Uncertainty Contrastive Framework (UCF), a Positive-Unlabeled (PU) representation learning framework that integrates uncertainty-aware contrastive loss, adaptive temperature scaling, and a self-attention-guided LSTM encoder to improve classification under noisy and imbalanced conditions. UCF…

December 11, 2025

Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses

arXiv:2106.05426v5 Announce Type: replace-cross Abstract: How related are the representations learned by neural language models, translation models, and language tagging tasks? We answer this question by adapting an encoder-decoder transfer learning method from computer vision to investigate the structure among…

December 11, 2025

Peek-a-Boo Reasoning: Contrastive Region Masking in MLLMs

arXiv:2512.08976v1 Announce Type: new Abstract: We introduce Contrastive Region Masking (CRM), a training-free diagnostic that reveals how multimodal large language models (MLLMs) depend on specific visual regions at each step of chain-of-thought (CoT) reasoning. Unlike prior approaches limited to…

December 11, 2025

Imitative Membership Inference Attack

arXiv:2509.06796v2 Announce Type: replace-cross Abstract: A Membership Inference Attack (MIA) assesses how much a target machine learning model reveals about its training data by determining whether specific query instances were part of the training set. State-of-the-art MIAs rely on training…

December 11, 2025