Archives AI News

N-ReLU: Zero-Mean Stochastic Extension of ReLU

arXiv:2511.07559v1 Announce Type: new Abstract: Activation functions are fundamental for enabling nonlinear representations in deep neural networks. However, the standard rectified linear unit (ReLU) often suffers from inactive or “dead” neurons caused by its hard zero cutoff. To address this…
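The "dead neuron" failure mode the abstract mentions is standard: ReLU's gradient is exactly zero for negative pre-activations, so a unit that stays negative stops learning. A minimal NumPy sketch of plain ReLU and its gradient illustrates this (the paper's N-ReLU stochastic extension is not specified in the snippet, so only the baseline is shown):

```python
import numpy as np

def relu(x):
    # Standard ReLU: hard zero cutoff for negative inputs.
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient is exactly zero for x <= 0, so a neuron whose
    # pre-activations stay negative receives no gradient signal
    # and becomes a "dead" neuron.
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))       # negative inputs clipped to 0
print(relu_grad(x))  # zero gradient on the non-positive side
```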

Hierarchical Deep Counterfactual Regret Minimization

arXiv:2305.17327v3 Announce Type: replace Abstract: Imperfect Information Games (IIGs) offer robust models for scenarios where decision-makers face uncertainty or lack complete information. Counterfactual Regret Minimization (CFR) has been one of the most successful families of algorithms for tackling IIGs. The…
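At the core of the CFR family is regret matching: each information set plays actions with probability proportional to their positive cumulative counterfactual regret. A minimal sketch of that standard subroutine (not the paper's hierarchical variant):

```python
import numpy as np

def regret_matching(cumulative_regrets):
    # Play each action with probability proportional to its positive
    # cumulative regret; fall back to uniform when no regret is positive.
    positive = np.maximum(cumulative_regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.ones_like(positive) / len(positive)

print(regret_matching(np.array([3.0, 1.0, -2.0])))  # -> [0.75 0.25 0.  ]
```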

SCALAR: Benchmarking SAE Interaction Sparsity in Toy LLMs

arXiv:2511.07572v1 Announce Type: new Abstract: Mechanistic interpretability aims to decompose neural networks into interpretable features and map their connecting circuits. The standard approach trains sparse autoencoders (SAEs) on each layer’s activations. However, SAEs trained in isolation don’t encourage sparse cross-layer…
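The "standard approach" the abstract refers to trains, per layer, a sparse autoencoder that reconstructs activations through an overcomplete ReLU bottleneck with an L1 sparsity penalty. A minimal sketch of that per-layer objective, with untrained illustrative weights (SCALAR's cross-layer interaction-sparsity benchmark itself is not specified in the snippet):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_sae = 8, 32            # activation dim, overcomplete SAE width

# Randomly initialized SAE weights (illustrative, untrained).
W_enc = rng.normal(0, 0.1, (d_sae, d_model))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(0, 0.1, (d_model, d_sae))

def sae_loss(x, l1_coeff=1e-3):
    # Encode one layer's activation into sparse features, decode,
    # and score reconstruction error plus an L1 sparsity penalty.
    f = np.maximum(0.0, W_enc @ x + b_enc)   # sparse feature vector
    x_hat = W_dec @ f
    recon = np.sum((x - x_hat) ** 2)
    sparsity = l1_coeff * np.sum(np.abs(f))
    return recon + sparsity, f

x = rng.normal(size=d_model)      # stand-in for one layer's activation
loss, features = sae_loss(x)
```

Training this loss independently at every layer is exactly the "in isolation" setup the abstract says fails to encourage sparse cross-layer structure.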

LLM Output Drift: Cross-Provider Validation & Mitigation for Financial Workflows

arXiv:2511.07585v1 Announce Type: new Abstract: Financial institutions deploy Large Language Models (LLMs) for reconciliations, regulatory reporting, and client communications, but nondeterministic outputs (output drift) undermine auditability and trust. We quantify drift across five model architectures (7B-120B parameters) on regulated financial…
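Output drift as described here can be quantified very simply: run the same prompt repeatedly and measure how often responses deviate from the modal answer. A hedged sketch with a hypothetical set of repeated completions (the paper's actual metric and prompts are not given in the snippet):

```python
from collections import Counter

def drift_rate(outputs):
    # Share of responses that differ from the modal (most common)
    # response; 0.0 means fully deterministic across runs.
    if not outputs:
        return 0.0
    _, modal_count = Counter(outputs).most_common(1)[0]
    return 1.0 - modal_count / len(outputs)

# Hypothetical repeated completions for one reconciliation prompt.
runs = ["Balance matches.", "Balance matches.", "Balance matches.",
        "Balances match.", "Balance matches."]
print(drift_rate(runs))  # 0.2: one of five runs drifted
```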

Evolutionary Profiles for Protein Fitness Prediction

arXiv:2510.07286v2 Announce Type: replace Abstract: Predicting the fitness impact of mutations is central to protein engineering but constrained by limited assays relative to the size of sequence space. Protein language models (pLMs) trained with masked language modeling (MLM) exhibit strong…
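A common way MLM-trained pLMs score mutations is the masked-marginal log-likelihood ratio: mask the mutated position and compare the model's probability of the mutant residue to the wild type. A sketch with a toy stand-in distribution (in practice the probabilities come from the pLM's softmax over the masked position; whether this paper uses exactly this score is not stated in the snippet):

```python
import math

def masked_probs(sequence, pos):
    # Toy stand-in for a pLM's masked-position distribution; a real
    # model would mask `pos` in `sequence` and return its softmax.
    return {"A": 0.5, "G": 0.3, "V": 0.15, "L": 0.05}

def mutation_score(sequence, pos, wt, mut):
    # Masked-marginal score: log-likelihood ratio of mutant vs
    # wild-type residue at the masked position; higher suggests a
    # more tolerated substitution.
    p = masked_probs(sequence, pos)
    return math.log(p[mut]) - math.log(p[wt])

print(round(mutation_score("AGVA", 1, "G", "A"), 3))  # log(0.5/0.3) ≈ 0.511
```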

Partial Action Replacement: Tackling Distribution Shift in Offline MARL

arXiv:2511.07629v1 Announce Type: new Abstract: Offline multi-agent reinforcement learning (MARL) is severely hampered by the challenge of evaluating out-of-distribution (OOD) joint actions. Our core finding is that when the behavior policy is factorized – a common scenario where agents act…
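One way to picture the idea the title names, under the factorized-behavior-policy assumption from the abstract: swap the target policy's action in for only a subset of agents while the rest keep their logged dataset actions, so the evaluated joint action stays closer to the data distribution. This is an illustrative reading, not the paper's algorithm:

```python
def partially_replaced_action(behavior_joint_action, target_actions, replace_idx):
    # Replace only the actions of agents in `replace_idx` with the
    # target policy's actions; the remaining agents keep their logged
    # (in-distribution) actions from the dataset.
    joint = list(behavior_joint_action)
    for i in replace_idx:
        joint[i] = target_actions[i]
    return tuple(joint)

behavior = ("left", "stay", "right")   # logged joint action from the dataset
target = ("up", "down", "up")          # target policy's proposed actions
print(partially_replaced_action(behavior, target, [1]))  # ('left', 'down', 'right')
```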

Synergy over Discrepancy: A Partition-Based Approach to Multi-Domain LLM Fine-Tuning

arXiv:2511.07198v2 Announce Type: replace Abstract: Large language models (LLMs) demonstrate impressive generalization abilities, yet adapting them effectively across multiple heterogeneous domains remains challenging due to inter-domain interference. To overcome this challenge, we propose a partition-based multi-stage fine-tuning framework designed to…