Archives: AI News

LLM Output Drift: Cross-Provider Validation & Mitigation for Financial Workflows

arXiv:2511.07585v1 Announce Type: new Abstract: Financial institutions deploy Large Language Models (LLMs) for reconciliations, regulatory reporting, and client communications, but nondeterministic outputs (output drift) undermine auditability and trust. We quantify drift across five model architectures (7B-120B parameters) on regulated financial…

November 12, 2025

S$^2$M-Former: Spiking Symmetric Mixing Branchformer for Brain Auditory Attention Detection

arXiv:2508.05164v2 Announce Type: replace Abstract: Auditory attention detection (AAD) aims to decode listeners’ focus in complex auditory environments from electroencephalography (EEG) recordings, which is crucial for developing neuro-steered hearing devices. Despite recent advancements, EEG-based AAD remains hindered by the absence…

November 12, 2025

One Router to Route Them All: Homogeneous Expert Routing for Heterogeneous Graph Transformers

arXiv:2511.07603v1 Announce Type: new Abstract: A common practice in heterogeneous graph neural networks (HGNNs) is to condition parameters on node/edge types, assuming types reflect semantic roles. However, this can cause overreliance on surface-level labels and impede cross-type knowledge transfer. We…

November 12, 2025

Evolutionary Profiles for Protein Fitness Prediction

arXiv:2510.07286v2 Announce Type: replace Abstract: Predicting the fitness impact of mutations is central to protein engineering but constrained by limited assays relative to the size of sequence space. Protein language models (pLMs) trained with masked language modeling (MLM) exhibit strong…

November 12, 2025

Partial Action Replacement: Tackling Distribution Shift in Offline MARL

arXiv:2511.07629v1 Announce Type: new Abstract: Offline multi-agent reinforcement learning (MARL) is severely hampered by the challenge of evaluating out-of-distribution (OOD) joint actions. Our core finding is that when the behavior policy is factorized – a common scenario where agents act…

November 12, 2025

Synergy over Discrepancy: A Partition-Based Approach to Multi-Domain LLM Fine-Tuning

arXiv:2511.07198v2 Announce Type: replace Abstract: Large language models (LLMs) demonstrate impressive generalization abilities, yet adapting them effectively across multiple heterogeneous domains remains challenging due to inter-domain interference. To overcome this challenge, we propose a partition-based multi-stage fine-tuning framework designed to…

November 12, 2025

FlowTIE: Flow-based Transport of Intensity Equation for Phase Gradient Estimation from 4D-STEM Data

arXiv:2511.07633v1 Announce Type: new Abstract: We introduce FlowTIE, a neural-network-based framework for phase reconstruction from 4D-Scanning Transmission Electron Microscopy (STEM) data, which integrates the Transport of Intensity Equation (TIE) with a flow-based representation of the phase gradient. This formulation allows…

November 12, 2025

On the generalization of language models from in-context learning and finetuning: a controlled study

arXiv:2505.00661v3 Announce Type: replace-cross Abstract: Large language models exhibit exciting capabilities, yet can show surprisingly narrow generalization from finetuning. E.g. they can fail to generalize to simple reversals of relations they are trained on, or fail to make simple logical…

November 12, 2025

Private-RAG: Answering Multiple Queries with LLMs while Keeping Your Data Private

arXiv:2511.07637v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving documents from an external corpus at inference time. When this corpus contains sensitive information, however, unprotected RAG systems are at risk of leaking private information.…

November 12, 2025

LLM-based Relevance Assessment for Web-Scale Search Evaluation at Pinterest

arXiv:2509.03764v2 Announce Type: replace-cross Abstract: Relevance evaluation plays a crucial role in personalized search systems to ensure that search results align with a user’s queries and intent. While human annotation is the traditional method for relevance evaluation, its high cost…

November 12, 2025