Archives AI News

Bi-Level Contextual Bandits for Individualized Resource Allocation under Delayed Feedback

arXiv:2511.10572v2 Announce Type: replace-cross Abstract: Equitably allocating limited resources in high-stakes domains, such as education, employment, and healthcare, requires balancing short-term utility with long-term impact, while accounting for delayed outcomes, hidden heterogeneity, and ethical constraints. However, most learning-based allocation frameworks either assume…

November 17, 2025

The Computational Advantage of Depth: Learning High-Dimensional Hierarchical Functions with Gradient Descent

arXiv:2502.13961v4 Announce Type: replace-cross Abstract: Understanding the advantages of deep neural networks trained by gradient descent (GD) compared to shallow models remains an open theoretical challenge. In this paper, we introduce a class of target functions (single and multi-index Gaussian…

November 17, 2025

CAMA: Enhancing Mathematical Reasoning in Large Language Models with Causal Knowledge

arXiv:2508.02583v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have demonstrated strong performance across a wide range of tasks, yet they still struggle with complex mathematical reasoning, a challenge fundamentally rooted in deep structural dependencies. To address this challenge, we…

November 17, 2025

Potent but Stealthy: Rethink Profile Pollution against Sequential Recommendation via Bi-level Constrained Reinforcement Paradigm

arXiv:2511.09392v3 Announce Type: replace Abstract: Sequential Recommenders, which exploit dynamic user intents through interaction sequences, are vulnerable to adversarial attacks. While existing attacks primarily rely on data poisoning, they require large-scale user access or fake profiles, thus lacking practicality. In…

November 17, 2025

FedALT: Federated Fine-Tuning through Adaptive Local Training with Rest-of-World LoRA

arXiv:2503.11880v3 Announce Type: replace Abstract: Fine-tuning large language models (LLMs) in federated settings enables privacy-preserving adaptation but suffers from cross-client interference due to model aggregation. Existing federated LoRA fine-tuning methods, primarily based on FedAvg, struggle with data heterogeneity, leading to…

November 17, 2025

Convergence Bound and Critical Batch Size of Muon Optimizer

arXiv:2507.01598v3 Announce Type: replace Abstract: Muon, a recently proposed optimizer that leverages the inherent matrix structure of neural network parameters, has demonstrated strong empirical performance, indicating its potential as a successor to standard optimizers such as AdamW. This paper presents…

November 17, 2025

Near-optimal Linear Predictive Clustering in Non-separable Spaces via Mixed Integer Programming and Quadratic Pseudo-Boolean Reductions

arXiv:2511.10809v1 Announce Type: new Abstract: Linear Predictive Clustering (LPC) partitions samples based on shared linear relationships between feature and target variables, with numerous applications including marketing, medicine, and education. Greedy optimization methods, commonly used for LPC, alternate between clustering and…

November 17, 2025

Transformers know more than they can tell — Learning the Collatz sequence

arXiv:2511.10811v1 Announce Type: new Abstract: We investigate transformer prediction of long Collatz steps, a complex arithmetic function that maps odd integers to their distant successors in the Collatz sequence ($u_{n+1}=u_n/2$ if $u_n$ is even, $u_{n+1}=(3u_n+1)/2$ if $u_n$ is odd).…

November 17, 2025

Movement-Specific Analysis for FIM Score Classification Using Spatio-Temporal Deep Learning

arXiv:2511.10713v1 Announce Type: new Abstract: The functional independence measure (FIM) is widely used to evaluate patients’ physical independence in activities of daily living. However, traditional FIM assessment imposes a significant burden on both patients and healthcare professionals. To address this…

November 17, 2025

Fast Neural Tangent Kernel Alignment, Norm and Effective Rank via Trace Estimation

arXiv:2511.10796v1 Announce Type: new Abstract: The Neural Tangent Kernel (NTK) characterizes how a model’s state evolves over Gradient Descent. Computing the full NTK matrix is often infeasible, especially for recurrent architectures. Here, we introduce a matrix-free perspective, using trace estimation…

November 17, 2025