Archives AI News

Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry

arXiv:2503.01822v2 Announce Type: replace Abstract: Sparse Autoencoders (SAEs) are widely used to interpret neural networks by identifying meaningful concepts from their representations. However, do SAEs truly uncover all concepts a model relies on, or are they inherently biased toward certain…

December 3, 2025

HTG-GCL: Leveraging Hierarchical Topological Granularity from Cellular Complexes for Graph Contrastive Learning

arXiv:2512.02073v1 Announce Type: new Abstract: Graph contrastive learning (GCL) aims to learn discriminative semantic invariance by contrasting different views of the same graph that share critical topological patterns. However, existing GCL approaches with structural augmentations often struggle to identify task-relevant…

December 3, 2025

FDRMFL: Multi-modal Federated Feature Extraction Model Based on Information Maximization and Contrastive Learning

arXiv:2512.02076v1 Announce Type: new Abstract: This study focuses on the feature extraction problem in multi-modal data regression. To address three core challenges in real-world scenarios: limited and non-IID data, effective extraction and fusion of multi-modal information, and susceptibility to catastrophic…

December 3, 2025

Ada-MoGE: Adaptive Mixture of Gaussian Expert Model for Time Series Forecasting

arXiv:2512.02061v1 Announce Type: new Abstract: Multivariate time series forecasting is widely used in domains such as industry, transportation, and finance. However, the dominant frequencies in time series may shift with the evolving spectral distribution of the data. Traditional Mixture of Experts…

December 3, 2025

DPWMixer: Dual-Path Wavelet Mixer for Long-Term Time Series Forecasting

arXiv:2512.02070v1 Announce Type: new Abstract: Long-term time series forecasting (LTSF) is a critical task in computational intelligence. While Transformer-based models effectively capture long-range dependencies, they often suffer from quadratic complexity and overfitting due to data sparsity. Conversely, efficient linear models…

December 3, 2025

Opening the Black Box: An Explainable, Few-shot AI4E Framework Informed by Physics and Expert Knowledge for Materials Engineering

arXiv:2512.02057v1 Announce Type: new Abstract: The industrial adoption of Artificial Intelligence for Engineering (AI4E) faces two fundamental bottlenecks: scarce high-quality data and the lack of interpretability in black-box models (particularly critical in safety-sensitive sectors like aerospace). We present an explainable, few-shot…

December 3, 2025

Contextual Gating within the Transformer Stack: Synergistic Feature Modulation for Enhanced Lyrical Classification and Calibration

arXiv:2512.02053v1 Announce Type: new Abstract: This study introduces a significant architectural advancement in feature fusion for lyrical content classification by integrating auxiliary structural features directly into the self-attention mechanism of a pre-trained Transformer. I propose the SFL Transformer, a novel…

December 3, 2025

PIBNet: a Physics-Inspired Boundary Network for Multiple Scattering Simulations

arXiv:2512.02049v1 Announce Type: new Abstract: The boundary element method (BEM) provides an efficient numerical framework for solving multiple scattering problems in unbounded homogeneous domains, since it reduces the discretization to the domain boundaries, thereby condensing the computational complexity. The procedure…

December 3, 2025

Cross-View Topology-Aware Graph Representation Learning

arXiv:2512.02130v1 Announce Type: new Abstract: Graph classification has gained significant attention due to its applications in chemistry, social networks, and bioinformatics. While Graph Neural Networks (GNNs) effectively capture local structural patterns, they often overlook global topological features that are critical…

December 3, 2025

Efficient Turing Machine Simulation with Transformers

arXiv:2512.00003v2 Announce Type: replace-cross Abstract: Constant bit-size Transformers are known to be Turing complete, but existing constructions require $\Omega(s(n))$ chain-of-thought (CoT) steps per simulated Turing machine (TM) step, leading to impractical reasoning lengths. In this paper, we significantly reduce this…

December 3, 2025