Archives AI News

FedALT: Federated Fine-Tuning through Adaptive Local Training with Rest-of-World LoRA

arXiv:2503.11880v3 Announce Type: replace Abstract: Fine-tuning large language models (LLMs) in federated settings enables privacy-preserving adaptation but suffers from cross-client interference due to model aggregation. Existing federated LoRA fine-tuning methods, primarily based on FedAvg, struggle with data heterogeneity, leading to…

November 17, 2025

Convergence Bound and Critical Batch Size of Muon Optimizer

arXiv:2507.01598v3 Announce Type: replace Abstract: Muon, a recently proposed optimizer that leverages the inherent matrix structure of neural network parameters, has demonstrated strong empirical performance, indicating its potential as a successor to standard optimizers such as AdamW. This paper presents…

November 17, 2025

Near-optimal Linear Predictive Clustering in Non-separable Spaces via Mixed Integer Programming and Quadratic Pseudo-Boolean Reductions

arXiv:2511.10809v1 Announce Type: new Abstract: Linear Predictive Clustering (LPC) partitions samples based on shared linear relationships between feature and target variables, with numerous applications including marketing, medicine, and education. Greedy optimization methods, commonly used for LPC, alternate between clustering and…

November 17, 2025

Transformers know more than they can tell — Learning the Collatz sequence

arXiv:2511.10811v1 Announce Type: new Abstract: We investigate transformer prediction of long Collatz steps, a complex arithmetic function that maps odd integers to their distant successors in the Collatz sequence ($u_{n+1}=u_n/2$ if $u_n$ is even, $u_{n+1}=(3u_n+1)/2$ if $u_n$ is odd).…
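For readers unfamiliar with the recurrence quoted in the abstract, here is a minimal sketch of the single-step rule only. The function names `collatz_step` and `collatz_orbit` and the `max_steps` cap are illustrative assumptions, not taken from the paper. The "long Collatz step" studied by the authors is not defined in this truncated excerpt, so it is not implemented here.

```python
def collatz_step(u: int) -> int:
    """Apply one step of the rule quoted in the abstract.

    Even u maps to u / 2; odd u maps to (3u + 1) / 2.
    (3u + 1) is always even for odd u, so integer division is exact.
    """
    if u <= 0:
        raise ValueError("u must be a positive integer")
    return u // 2 if u % 2 == 0 else (3 * u + 1) // 2


def collatz_orbit(u: int, max_steps: int = 1000) -> list[int]:
    """Iterate collatz_step from u until 1 is reached or max_steps steps are taken.

    max_steps is a hypothetical safety cap added for this sketch.
    """
    orbit = [u]
    while orbit[-1] != 1 and len(orbit) <= max_steps:
        orbit.append(collatz_step(orbit[-1]))
    return orbit


if __name__ == "__main__":
    print(collatz_orbit(7))
```

Starting from 7, the sketch visits 7, 11, 17, 26, 13, 20, 10, 5, 8, 4, 2, 1.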

November 17, 2025

Movement-Specific Analysis for FIM Score Classification Using Spatio-Temporal Deep Learning

arXiv:2511.10713v1 Announce Type: new Abstract: The functional independence measure (FIM) is widely used to evaluate patients’ physical independence in activities of daily living. However, traditional FIM assessment imposes a significant burden on both patients and healthcare professionals. To address this…

November 17, 2025

Fast Neural Tangent Kernel Alignment, Norm and Effective Rank via Trace Estimation

arXiv:2511.10796v1 Announce Type: new Abstract: The Neural Tangent Kernel (NTK) characterizes how a model’s state evolves over Gradient Descent. Computing the full NTK matrix is often infeasible, especially for recurrent architectures. Here, we introduce a matrix-free perspective, using trace estimation…

November 17, 2025

Towards Uncertainty Quantification in Generative Model Learning

arXiv:2511.10710v1 Announce Type: new Abstract: While generative models have become increasingly prevalent across various domains, fundamental concerns regarding their reliability persist. A crucial yet understudied aspect of these models is the uncertainty quantification surrounding their distribution approximation capabilities. Current evaluation…

November 17, 2025

Bias-Restrained Prefix Representation Finetuning for Mathematical Reasoning

arXiv:2511.10707v1 Announce Type: new Abstract: Parameter-efficient finetuning (PEFT) enhances model performance on downstream tasks by updating a minimal subset of parameters. Representation finetuning (ReFT) methods further improve efficiency by freezing model weights and optimizing internal representations with fewer parameters than…

November 17, 2025

Differentiable Sparse Identification of Lagrangian Dynamics

arXiv:2511.10706v1 Announce Type: new Abstract: Data-driven discovery of governing equations remains a fundamental challenge in nonlinear dynamics. Although sparse regression techniques have advanced system identification, they struggle with rational functions and noise sensitivity in complex mechanical systems. The…

November 17, 2025

Partial Information Decomposition for Data Interpretability and Feature Selection

arXiv:2405.19212v4 Announce Type: replace Abstract: In this paper, we introduce Partial Information Decomposition of Features (PIDF), a new paradigm for simultaneous data interpretability and feature selection. Contrary to traditional methods that assign a single importance value, our approach is based…

November 17, 2025