Archives AI News

On the Occurrence of Critical Learning Periods in Neural Networks

arXiv:2510.09687v1 Announce Type: new Abstract: This study delves into the plasticity of neural networks, offering empirical support for the notion that critical learning periods and warm-starting performance loss can be avoided through simple adjustments to learning hyperparameters. The critical learning…

Evaluation of Differential Privacy Mechanisms on Federated Learning

arXiv:2510.09691v1 Announce Type: new Abstract: Federated learning trains a model across several clients without disclosing raw data. Despite advances in data privacy, risks remain. Differential Privacy (DP) is a technique to protect sensitive data by adding noise to…
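The abstract describes DP as protecting sensitive data by adding noise. A minimal sketch of one common instantiation in federated settings, the clipped-gradient Gaussian mechanism used in DP-SGD; the function name and parameters are illustrative, not from the paper:

```python
import numpy as np

def gaussian_mechanism(grad, clip_norm, noise_multiplier, rng):
    """Clip the gradient to at most clip_norm in L2 norm, then add
    Gaussian noise with std noise_multiplier * clip_norm (DP-SGD style)."""
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise

rng = np.random.default_rng(0)
g = np.array([3.0, 4.0])  # L2 norm 5, so clipping rescales it
private_g = gaussian_mechanism(g, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

In a federated pipeline this step would typically run on each client update before aggregation; the privacy guarantee then depends on the clip norm, noise multiplier, and number of rounds.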

Noise Injection Systemically Degrades Large Language Model Safety Guardrails

arXiv:2505.13500v2 Announce Type: replace-cross Abstract: Safety guardrails in large language models (LLMs) are a critical component in preventing harmful outputs. Yet, their resilience under perturbation remains poorly understood. In this paper, we investigate the robustness of safety fine-tuning in LLMs…
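The abstract studies guardrail resilience under perturbation. A minimal sketch of the kind of weight-noise injection such robustness probes use, assuming Gaussian perturbation of each parameter tensor; the function and dictionary layout are illustrative, not the paper's method:

```python
import numpy as np

def perturb_weights(weights, sigma, rng):
    """Return a copy of each parameter tensor with i.i.d. Gaussian noise
    of std sigma added, probing how behavior degrades as sigma grows."""
    return {name: w + rng.normal(0.0, sigma, size=w.shape)
            for name, w in weights.items()}

rng = np.random.default_rng(0)
params = {"layer0.weight": np.ones((2, 2))}
perturbed = perturb_weights(params, sigma=0.1, rng=rng)
```

A robustness sweep would evaluate safety behavior at increasing sigma to see where refusals break down.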

FSA: An Alternative Efficient Implementation of Native Sparse Attention Kernel

arXiv:2508.18224v2 Announce Type: replace-cross Abstract: Recent advances in sparse attention mechanisms have demonstrated strong potential for reducing the computational cost of long-context training and inference in large language models (LLMs). Native Sparse Attention (NSA), one state-of-the-art approach, introduces natively trainable,…
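The abstract concerns sparse attention kernels. A minimal NumPy sketch of the general idea, each query attending only to its highest-scoring keys; this is a generic top-k illustration, not NSA's actual block-sparse selection or the FSA kernel:

```python
import numpy as np

def topk_sparse_attention(q, k, v, top_k):
    """Each query attends only to its top_k highest-scoring keys;
    remaining scores are masked to -inf before the softmax, so their
    attention weights become exactly zero."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    kth = np.sort(scores, axis=-1)[:, -top_k][:, None]  # top_k-th largest per row
    masked = np.where(scores >= kth, scores, -np.inf)
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Real kernels like NSA/FSA gain their speedup by skipping the masked computation entirely on hardware, rather than materializing dense scores as this sketch does.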

Discursive Circuits: How Do Language Models Understand Discourse Relations?

arXiv:2510.11210v1 Announce Type: cross Abstract: Which components in transformer language models are responsible for discourse understanding? We hypothesize that sparse computational graphs, termed discursive circuits, control how models process discourse relations. Unlike simpler tasks, discourse relations involve longer spans…
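The abstract hypothesizes that a sparse subgraph of components controls discourse processing. A toy sketch of the ablation logic typically used to test such circuit hypotheses: keep the outputs of components inside the candidate circuit and replace everything else with baseline values. Names and structure are illustrative, not the paper's implementation:

```python
def run_with_circuit(component_outputs, circuit, baseline_outputs):
    """Keep each component's output if it belongs to the hypothesized
    circuit; otherwise substitute a baseline (e.g. mean-ablated) value.
    If task performance survives, the circuit is deemed sufficient."""
    return {name: out if name in circuit else baseline_outputs[name]
            for name, out in component_outputs.items()}

outputs = {"head_a": 1.0, "head_b": 2.0}
baseline = {"head_a": 0.0, "head_b": 0.0}
ablated = run_with_circuit(outputs, circuit={"head_a"}, baseline_outputs=baseline)
# ablated == {"head_a": 1.0, "head_b": 0.0}
```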