Archives AI News

Adaptive Physics-Informed Neural Networks with Multi-Category Feature Engineering for Hydrogen Sorption Prediction in Clays, Shales, and Coals

arXiv:2509.00049v1 Announce Type: new Abstract: Accurate prediction of hydrogen sorption in clays, shales, and coals is vital for advancing underground hydrogen storage, natural hydrogen exploration, and radioactive waste containment. Traditional experimental methods, while foundational, are time-consuming, error-prone, and limited in capturing geological heterogeneity. This study introduces an adaptive physics-informed neural network (PINN) framework with multi-category feature engineering to enhance hydrogen sorption prediction. The framework integrates classical isotherm models with thermodynamic constraints to ensure physical consistency while leveraging deep learning flexibility. A comprehensive dataset consisting of 155 samples, which includes 50 clays, 60 shales, and 45 coals, was employed, incorporating diverse compositional properties and experimental conditions. Multi-category feature engineering across seven categories captured complex sorption dynamics. The PINN employs deep residual networks with multi-head attention, optimized via adaptive loss functions and Monte Carlo dropout for uncertainty quantification. K-fold cross-validation and hyperparameter optimization achieve significant accuracy (R2 = 0.979, RMSE = 0.045 mol per kg) with 67% faster convergence despite 15-fold increased complexity. The framework demonstrates robust lithology-specific performance across clay minerals (R2 = 0.981), shales (R2 = 0.971), and coals (R2 = 0.978), maintaining 85-91% reliability scores. Interpretability analysis via SHAP, accumulated local effects, and Friedman's H-statistics reveal that hydrogen adsorption capacity dominates predictions, while 86.7% of feature pairs exhibit strong interactions, validating the necessity of non-linear modeling approaches. This adaptive physics-informed framework accelerates site screening and enables risk-informed decision-making through robust uncertainty quantification.

Teaching AI to Remember: Insights from Brain-Inspired Replay in Continual Learning

arXiv:2509.00047v1 Announce Type: new Abstract: Artificial neural networks (ANNs) continue to face challenges in continual learning, particularly due to catastrophic forgetting, the loss of previously learned knowledge when acquiring new tasks. Inspired by memory consolidation in the human brain, we investigate the internal replay mechanism proposed by~citep{brain_inspired_replay1}, which reactivates latent representations of prior experiences during learning. As internal replay was identified as the most influential component among the brain-inspired mechanisms in their framework, it serves as the central focus of our in-depth investigation. Using the CIFAR-100 dataset in a class-incremental setting, we evaluate the effectiveness of internal replay, both in isolation and in combination with Synaptic Intelligence (SI). Our experiments show that internal replay significantly mitigates forgetting, especially when paired with SI, but at the cost of reduced initial task accuracy, highlighting a trade-off between memory stability and learning plasticity. Further analyses using log-likelihood distributions, reconstruction errors, silhouette scores, and UMAP projections reveal that internal replay increases representational overlap in latent space, potentially limiting task-specific differentiation. These results underscore the limitations of current brain-inspired methods and suggest future directions for balancing retention and adaptability in continual learning systems.

Non-Identical Diffusion Models in MIMO-OFDM Channel Generation

arXiv:2509.01641v1 Announce Type: cross Abstract: We propose a novel diffusion model, termed the non-identical diffusion model, and investigate its application to wireless orthogonal frequency division multiplexing (OFDM) channel generation. Unlike the standard diffusion model that uses a scalar-valued time index to represent the global noise level, we extend this notion to an element-wise time indicator to capture local error variations more accurately. Non-identical diffusion enables us to characterize the reliability of each element (e.g., subcarriers in OFDM) within the noisy input, leading to improved generation results when the initialization is biased. Specifically, we focus on the recovery of wireless multi-input multi-output (MIMO) OFDM channel matrices, where the initial channel estimates exhibit highly uneven reliability across elements due to the pilot scheme. Conventional time embeddings, which assume uniform noise progression, fail to capture such variability across pilot schemes and noise levels. We introduce a matrix that matches the input size to control element-wise noise progression. Following a similar diffusion procedure to existing methods, we show the correctness and effectiveness of the proposed non-identical diffusion scheme both theoretically and numerically. For MIMO-OFDM channel generation, we propose a dimension-wise time embedding strategy. We also develop and evaluate multiple training and generation methods and compare them through numerical experiments.

Text Meets Topology: Rethinking Out-of-distribution Detection in Text-Rich Networks

arXiv:2508.17690v2 Announce Type: replace-cross Abstract: Out-of-distribution (OOD) detection remains challenging in text-rich networks, where textual features intertwine with topological structures. Existing methods primarily address label shifts or rudimentary domain-based splits, overlooking the intricate textual-structural diversity. For example, in social networks, where users represent nodes with textual features (name, bio) while edges indicate friendship status, OOD may stem from the distinct language patterns between bot and normal users. To address this gap, we introduce the TextTopoOOD framework for evaluating detection across diverse OOD scenarios: (1) attribute-level shifts via text augmentations and embedding perturbations; (2) structural shifts through edge rewiring and semantic connections; (3) thematically-guided label shifts; and (4) domain-based divisions. Furthermore, we propose TNT-OOD to model the complex interplay between Text aNd Topology using: 1) a novel cross-attention module to fuse local structure into node-level text representations, and 2) a HyperNetwork to generate node-specific transformation parameters. This aligns topological and semantic features of ID nodes, enhancing ID/OOD distinction across structural and textual shifts. Experiments on 11 datasets across four OOD scenarios demonstrate the nuanced challenge of TextTopoOOD for evaluating OOD detection in text-rich networks.

K-Tensors: Clustering Positive Semi-Definite Matrices

arXiv:2306.06534v5 Announce Type: replace Abstract: This paper presents a new clustering algorithm for symmetric positive semi-definite (SPSD) matrices, called K-Tensors. The method identifies structured subsets of the SPSD cone characterized by common principal component (CPC) representations, where each subset corresponds to matrices sharing a common eigenstructure. Unlike conventional clustering approaches that rely on vectorization or transformations of SPSD matrices, thereby losing critical geometric and spectral information, K-Tensors introduces a divergence that respects the intrinsic geometry of SPSD matrices. This divergence preserves the shape and eigenstructure information and yields principal SPSD tensors, defined as a set of representative matrices that summarize the distribution of SPSD matrices. By exploring its theoretical properties, we show that the proposed clustering algorithm is self-consistent under mild distribution assumptions and converges to a local optimum. We demonstrate the use of the algorithm through an application to resting-state functional magnetic resonance imaging (rs-fMRI) data from the Human Connectome Project, where we cluster brain connectivity matrices to discover groups of subjects with shared connectivity structures.

Learning Social Heuristics for Human-Aware Path Planning

arXiv:2509.02134v1 Announce Type: cross Abstract: Social robotic navigation has been at the center of numerous studies in recent years. Most of the research has focused on driving the robotic agent along obstacle-free trajectories, respecting social distances from humans, and predicting their movements to optimize navigation. However, in order to really be socially accepted, the robots must be able to attain certain social norms that cannot arise from conventional navigation, but require a dedicated learning process. We propose Heuristic Planning with Learned Social Value (HPLSV), a method to learn a value function encapsulating the cost of social navigation, and use it as an additional heuristic in heuristic-search path planning. In this preliminary work, we apply the methodology to the common social scenario of joining a queue of people, with the intention of generalizing to further human activities.

Mask-PINNs: Mitigating Internal Covariate Shift in Physics-Informed Neural Networks

arXiv:2505.06331v3 Announce Type: replace Abstract: Physics-Informed Neural Networks (PINNs) have emerged as a powerful framework for solving partial differential equations (PDEs) by embedding physical laws directly into the loss function. However, as a fundamental optimization issue, internal covariate shift (ICS) hinders the stable and effective training of PINNs by disrupting feature distributions and limiting model expressiveness. Unlike standard deep learning tasks, conventional remedies for ICS -- such as Batch Normalization and Layer Normalization -- are not directly applicable to PINNs, as they distort the physical consistency required for reliable PDE solutions. To address this issue, we propose Mask-PINNs, a novel architecture that introduces a learnable mask function to regulate feature distributions while preserving the underlying physical constraints of PINNs. We provide a theoretical analysis showing that the mask suppresses the expansion of feature representations through a carefully designed modulation mechanism. Empirically, we validate the method on multiple PDE benchmarks -- including convection, wave propagation, and Helmholtz equations -- across diverse activation functions. Our results show consistent improvements in prediction accuracy, convergence stability, and robustness. Furthermore, we demonstrate that Mask-PINNs enable the effective use of wider networks, overcoming a key limitation in existing PINN frameworks.

Goal-Conditioned Data Augmentation for Offline Reinforcement Learning

arXiv:2412.20519v2 Announce Type: replace Abstract: Offline reinforcement learning (RL) enables policy learning from pre-collected offline datasets, relaxing the need to interact directly with the environment. However, limited by the quality of offline datasets, it generally fails to learn well-qualified policies in suboptimal datasets. To address datasets with insufficient optimal demonstrations, we introduce Goal-cOnditioned Data Augmentation (GODA), a novel goal-conditioned diffusion-based method for augmenting samples with higher quality. Leveraging recent advancements in generative modelling, GODA incorporates a novel return-oriented goal condition with various selection mechanisms. Specifically, we introduce a controllable scaling technique to provide enhanced return-based guidance during data sampling. GODA learns a comprehensive distribution representation of the original offline datasets while generating new data with selectively higher-return goals, thereby maximizing the utility of limited optimal demonstrations. Furthermore, we propose a novel adaptive gated conditioning method for processing noisy inputs and conditions, enhancing the capture of goal-oriented guidance. We conduct experiments on the D4RL benchmark and real-world challenges, specifically traffic signal control (TSC) tasks, to demonstrate GODA's effectiveness in enhancing data quality and superior performance compared to state-of-the-art data augmentation methods across various offline RL algorithms.

Parameter-Aware Ensemble SINDy for Interpretable Symbolic SGS Closure

arXiv:2508.14085v2 Announce Type: replace Abstract: This work designs a scalable, parameter-aware sparse regression framework for discovering interpretable partial differential equations and subgrid-scale closures from multi-parameter simulation data. Building on SINDy (Sparse Identification of Nonlinear Dynamics), the approach addresses key limitations through four enhancements. First, symbolic parameterisation enables physical parameters to vary within unified regression. Second, the Dimensional Similarity Filter enforces unit consistency while reducing candidate libraries. Third, memory-efficient Gram-matrix accumulation enables batch processing of large datasets. Fourth, ensemble consensus with coefficient stability analysis ensures robust model identification. Validation on canonical one-dimensional benchmarks demonstrates consistent discovery of governing equations across parameter ranges. Applied to filtered Burgers datasets, the framework autonomously discovers the SGS closure $tau_{mathrm{SGS}} = 0.1604cdotDelta^2left(frac{partial bar{u}}{partial x}right)^2$ with the SINDy-discovered Smagorinsky constant $C_s^{text{SINDy}} approx 0.4005$ without predefined closure assumptions, recovering Smagorinsky-type structure directly from data. The discovered model achieves $R^2 = 0.885$ across filter scales and demonstrates improved prediction accuracy compared to classical SGS closures. The ability of the framework to identify physically meaningful SGS forms and calibrate coefficients offers a complementary approach to existing turbulence modelling methods, contributing to the broader field of data-driven turbulence closure discovery.