Archives AI News

Provably Efficient Sample Complexity for Robust CMDP

arXiv:2511.07486v1 Announce Type: new Abstract: We study the problem of learning policies that maximize cumulative reward while satisfying safety constraints, even when the real environment differs from a simulator or nominal model. We focus on robust constrained Markov decision processes…

Source-Optimal Training is Transfer-Suboptimal

arXiv:2511.08401v1 Announce Type: cross Abstract: We prove a fundamental misalignment in transfer learning: the source regularization that minimizes source risk almost never coincides with the regularization maximizing transfer benefit. Through sharp phase boundaries for L2-SP ridge regression, we characterize the…

N-ReLU: Zero-Mean Stochastic Extension of ReLU

arXiv:2511.07559v1 Announce Type: new Abstract: Activation functions are fundamental for enabling nonlinear representations in deep neural networks. However, the standard rectified linear unit (ReLU) often suffers from inactive or “dead” neurons caused by its hard zero cutoff. To address this…

Hierarchical Deep Counterfactual Regret Minimization

arXiv:2305.17327v3 Announce Type: replace Abstract: Imperfect Information Games (IIGs) offer robust models for scenarios where decision-makers face uncertainty or lack complete information. Counterfactual Regret Minimization (CFR) has been one of the most successful families of algorithms for tackling IIGs. The…

SCALAR: Benchmarking SAE Interaction Sparsity in Toy LLMs

arXiv:2511.07572v1 Announce Type: new Abstract: Mechanistic interpretability aims to decompose neural networks into interpretable features and map their connecting circuits. The standard approach trains sparse autoencoders (SAEs) on each layer’s activations. However, SAEs trained in isolation don’t encourage sparse cross-layer…

LLM Output Drift: Cross-Provider Validation & Mitigation for Financial Workflows

arXiv:2511.07585v1 Announce Type: new Abstract: Financial institutions deploy Large Language Models (LLMs) for reconciliations, regulatory reporting, and client communications, but nondeterministic outputs (output drift) undermine auditability and trust. We quantify drift across five model architectures (7B-120B parameters) on regulated financial…