Archives AI News

Toward Reproducible Cross-Backend Compatibility for Deep Learning: A Configuration-First Framework with Three-Tier Verification

arXiv:2509.06977v1 Announce Type: new Abstract: This paper presents a configuration-first framework for evaluating cross-backend compatibility in deep learning systems deployed on CPU, GPU, and compiled runtimes. The framework decouples experiments from code using YAML, supports both library and repository models, and employs a three-tier verification protocol covering tensor-level closeness, activation alignment, and task-level metrics. Through 672 checks across multiple models and tolerance settings, we observe that 72.0% of runs pass, with most discrepancies occurring under stricter thresholds. Our results show that detection models and compiled backends are particularly prone to drift, often due to nondeterministic post-processing. We further demonstrate that deterministic adapters and selective fallbacks can substantially improve agreement without significant performance loss. To our knowledge, this is the first unified framework that systematically quantifies and mitigates cross-backend drift in deep learning, providing a reproducible methodology for dependable deployment across heterogeneous runtimes.

September 10, 2025

A Knowledge-Guided Cross-Modal Feature Fusion Model for Local Traffic Demand Prediction

arXiv:2509.06976v1 Announce Type: new Abstract: Traffic demand prediction plays a critical role in intelligent transportation systems. Existing traffic prediction models primarily rely on temporal traffic data, with limited efforts incorporating human knowledge and experience for urban traffic demand forecasting. However, in real-world scenarios, traffic knowledge and experience derived from human daily life significantly influence precise traffic prediction. Such knowledge and experiences can guide the model in uncovering latent patterns within traffic data, thereby enhancing the accuracy and robustness of predictions. To this end, this paper proposes integrating structured temporal traffic data with textual data representing human knowledge and experience, resulting in a novel knowledge-guided cross-modal feature representation learning (KGCM) model for traffic demand prediction. Based on regional transportation characteristics, we construct a prior knowledge dataset using a large language model combined with manual authoring and revision, covering both regional and global knowledge and experiences. The KGCM model then learns multimodal data features through designed local and global adaptive graph networks, as well as a cross-modal feature fusion mechanism. A proposed reasoning-based dynamic update strategy enables dynamic optimization of the graph model's parameters, achieving optimal performance. Experiments on multiple traffic datasets demonstrate that our model accurately predicts future traffic demand and outperforms existing state-of-the-art (SOTA) models.

September 10, 2025

Active Learning of Piecewise Gaussian Process Surrogates

arXiv:2301.08789v4 Announce Type: replace Abstract: Active learning of Gaussian process (GP) surrogates has been useful for optimizing experimental designs for physical/computer simulation experiments, and for steering data acquisition schemes in machine learning. In this paper, we develop a method for active learning of piecewise, Jump GP surrogates. Jump GPs are continuous within, but discontinuous across, regions of a design space, as required for applications spanning autonomous materials design, configuration of smart factory systems, and many others. Although our active learning heuristics are appropriated from strategies originally designed for ordinary GPs, we demonstrate that additionally accounting for model bias, as opposed to the usual model uncertainty, is essential in the Jump GP context. Toward that end, we develop an estimator for bias and variance of Jump GP models. Illustrations, and evidence of the advantage of our proposed methods, are provided on a suite of synthetic benchmarks, and real-simulation experiments of varying complexity.

September 10, 2025

FediLoRA: Heterogeneous LoRA for Federated Multimodal Fine-tuning under Missing Modalities

arXiv:2509.06984v1 Announce Type: new Abstract: Foundation models have demonstrated remarkable performance across a wide range of tasks, yet their large parameter sizes pose challenges for practical deployment, especially in decentralized environments. Parameter-efficient fine-tuning (PEFT), such as Low-Rank Adaptation (LoRA), reduces local computing and memory overhead, making it attractive for federated learning. However, existing federated LoRA methods typically assume uniform rank configurations and unimodal inputs, overlooking two key real-world challenges: (1) heterogeneous client resources have different LoRA ranks, and (2) multimodal data settings with potentially missing modalities. In this work, we propose FediLoRA, a simple yet effective framework for federated multimodal fine-tuning under heterogeneous LoRA ranks and missing modalities. FediLoRA introduces a dimension-wise aggregation strategy that reweights LoRA updates without information dilution during aggregation. It also includes a lightweight layer-wise model editing method that selectively incorporates global parameters to repair local components which improves both client and global model performances. Experimental results on three multimodal benchmark datasets demonstrate that FediLoRA achieves superior performance over competitive baselines in both global and personalized settings, particularly in the presence of modality incompleteness.

September 10, 2025

When Do Neural Networks Learn World Models?

arXiv:2502.09297v5 Announce Type: replace Abstract: Humans develop world models that capture the underlying generation process of data. Whether neural networks can learn similar world models remains an open problem. In this work, we present the first theoretical results for this problem, showing that in a multi-task setting, models with a low-degree bias provably recover latent data-generating variables under mild assumptions--even if proxy tasks involve complex, non-linear functions of the latents. However, such recovery is sensitive to model architecture. Our analysis leverages Boolean models of task solutions via the Fourier-Walsh transform and introduces new techniques for analyzing invertible Boolean transforms, which may be of independent interest. We illustrate the algorithmic implications of our results and connect them to related research areas, including self-supervised learning, out-of-distribution generalization, and the linear representation hypothesis in large language models.

September 10, 2025

Machine Generalize Learning in Agent-Based Models: Going Beyond Surrogate Models for Calibration in ABMs

arXiv:2509.07013v1 Announce Type: new Abstract: Calibrating agent-based epidemic models is computationally demanding. We present a supervised machine learning calibrator that learns the inverse mapping from epidemic time series to SIR parameters. A three-layer bidirectional LSTM ingests 60-day incidence together with population size and recovery rate, and outputs transmission probability, contact rate, and R0. Training uses a composite loss with an epidemiology-motivated consistency penalty that encourages R0 * recovery rate to equal transmission probability * contact rate. In a 1000-scenario simulation study, we compare the calibrator with Approximate Bayesian Computation (likelihood-free MCMC). The method achieves lower error across all targets (MAE: R0 0.0616 vs 0.275; transmission 0.0715 vs 0.128; contact 1.02 vs 4.24), produces tighter predictive intervals with near nominal coverage, and reduces wall clock time from 77.4 s to 2.35 s per calibration. Although contact rate and transmission probability are partially nonidentifiable, the approach reproduces epidemic curves more faithfully than ABC, enabling fast and practical calibration. We evaluate it on SIR agent based epidemics generated with epiworldR and provide an implementation in R.

September 10, 2025

FNODE: Flow-Matching for data-driven simulation of constrained multibody systems

arXiv:2509.00183v2 Announce Type: replace Abstract: Data-driven modeling of constrained multibody systems faces two persistent challenges: high computational cost and limited long-term prediction accuracy. To address these issues, we introduce the Flow-Matching Neural Ordinary Differential Equation (FNODE), a framework that learns acceleration vector fields directly from trajectory data. By reformulating the training objective to supervise accelerations rather than integrated states, FNODE eliminates the need for backpropagation through an ODE solver, which represents a bottleneck in traditional Neural ODEs. Acceleration targets are computed efficiently using numerical differentiation techniques, including a hybrid Fast Fourier Transform (FFT) and Finite Difference (FD) scheme. We evaluate FNODE on a diverse set of benchmarks, including the single and triple mass-spring-damper systems, double pendulum, slider-crank, and cart-pole. Across all cases, FNODE consistently outperforms existing approaches such as Multi-Body Dynamic Neural ODE (MBD-NODE), Long Short-Term Memory (LSTM) networks, and Fully Connected Neural Networks (FCNN), demonstrating good accuracy, generalization, and computational efficiency.

September 10, 2025

An efficient deep reinforcement learning environment for flexible job-shop scheduling

arXiv:2509.07019v1 Announce Type: new Abstract: The Flexible Job-shop Scheduling Problem (FJSP) is a classical combinatorial optimization problem that has a wide-range of applications in the real world. In order to generate fast and accurate scheduling solutions for FJSP, various deep reinforcement learning (DRL) scheduling methods have been developed. However, these methods are mainly focused on the design of DRL scheduling Agent, overlooking the modeling of DRL environment. This paper presents a simple chronological DRL environment for FJSP based on discrete event simulation and an end-to-end DRL scheduling model is proposed based on the proximal policy optimization (PPO). Furthermore, a short novel state representation of FJSP is proposed based on two state variables in the scheduling environment and a novel comprehensible reward function is designed based on the scheduling area of machines. Experimental results on public benchmark instances show that the performance of simple priority dispatching rules (PDR) is improved in our scheduling environment and our DRL scheduling model obtains competing performance compared with OR-Tools, meta-heuristic, DRL and PDR scheduling methods.

September 10, 2025

Explainable Metrics for the Assessment of Neurodegenerative Diseases through Handwriting Analysis

arXiv:2409.08303v3 Announce Type: replace-cross Abstract: Motor dysfunction is a common sign of neurodegenerative diseases (NDs) such as Parkinson's disease (PD) and Alzheimer's disease (AD), but may be difficult to detect, especially in the early stages. In this work, we examine the behavior of a wide array of explainable metrics extracted from the handwriting signals of 113 subjects performing multiple tasks on a digital tablet, as part of the Neurological Signals dataset. The aim is to measure their effectiveness in characterizing NDs, including AD and PD. To this end, task-agnostic and task-specific metrics are extracted from 14 distinct tasks. Subsequently, through statistical analysis and a series of classification experiments, we investigate which metrics provide greater discriminative power between NDs and healthy controls and amongst different NDs. Preliminary results indicate that the tasks at hand can all be effectively leveraged to distinguish between the considered set of NDs, specifically by measuring the stability, the speed of writing, the time spent not writing, and the pressure variations between groups from our handcrafted explainable metrics, which shows p-values lower than 0.0001 for multiple tasks. Using various binary classification algorithms on the computed metrics, we obtain up to 87 % accuracy for the discrimination between AD and healthy controls (CTL), and up to 69 % for the discrimination between PD and CTL.

September 10, 2025

1 bit is all we need: binary normalized neural networks

arXiv:2509.07025v1 Announce Type: new Abstract: The increasing size of large neural network models, specifically language models and foundational image models, poses deployment challenges, prompting efforts to reduce memory requirements and enhance computational efficiency. These efforts are critical to ensure practical deployment and effective utilization of these models across various applications. In this work, a novel type of neural network layers and models is developed that uses only single-bit parameters. In this novel type of models all parameters of all layers, including kernel weights and biases, only have values equal to zero or one. This novel type of models uses layers named as binary normalized layer. These binary normalized layers can be of any type, such as fully connected, convolutional, attention, etc., and they consist of slight variations of the corresponding conventional layers. To show the effectiveness of the binary normalized layers, two different models are configured to solve a multiclass image classification problem and a language decoder to predict the next token of a sequence. The model to solve the image classification has convolutional and fully connected layers, and the language model is composed of transformer blocks with multi-head attention. The results show that models with binary normalized layers present almost the same results obtained by equivalent models with real 32-bit parameters. The binary normalized layers allow to develop models that use 32 times less memory than current models and have equivalent performance. Besides, the binary normalized layers can be easily implemented on current computers using 1-bit arrays, and do not require the development of dedicated electronic hardware. This novel type of layers opens a new era for large neural network models with reduced memory requirements that can be deployed using simple and cheap hardware, such as mobile devices or only cpus.

September 10, 2025