Archives AI News

Continuous Fairness On Data Streams

arXiv:2601.08976v1 Announce Type: new Abstract: We study the problem of enforcing continuous group fairness over windows in data streams. We propose a novel fairness model that ensures group fairness at a finer granularity level (referred to as block) within each…

Optimising for Energy Efficiency and Performance in Machine Learning

arXiv:2601.08991v1 Announce Type: new Abstract: The ubiquity of machine learning (ML) and the demand for ever-larger models bring an increase in energy consumption and environmental impact. However, little is known about the energy scaling laws in ML, and existing research…

Breaking the Bottlenecks: Scalable Diffusion Models for 3D Molecular Generation

arXiv:2601.08963v1 Announce Type: new Abstract: Diffusion models have emerged as a powerful class of generative models for molecular design, capable of capturing complex structural distributions and achieving high fidelity in 3D molecule generation. However, their widespread use remains constrained by…

XGBoost Forecasting of NEPSE Index Log Returns with Walk Forward Validation

arXiv:2601.08896v1 Announce Type: new Abstract: This study develops a robust machine learning framework for one-step-ahead forecasting of daily log-returns in the Nepal Stock Exchange (NEPSE) Index using the XGBoost regressor. A comprehensive feature set is engineered, including lagged log-returns (up…

Meta-learning to Address Data Shift in Time Series Classification

arXiv:2601.09018v1 Announce Type: new Abstract: Across engineering and scientific domains, traditional deep learning (TDL) models perform well when training and test data share the same distribution. However, the dynamic nature of real-world data, broadly termed data shift, renders TDL models…

Dynamics-Aligned Latent Imagination in Contextual World Models for Zero-Shot Generalization

arXiv:2508.20294v2 Announce Type: replace Abstract: Real-world reinforcement learning demands adaptation to unseen environmental conditions without costly retraining. Contextual Markov Decision Processes (cMDP) model this challenge, but existing methods often require explicit context variables (e.g., friction, gravity), limiting their use when…

Layer-Parallel Training for Transformers

arXiv:2601.09026v1 Announce Type: new Abstract: We present a new training methodology for transformers using a multilevel, layer-parallel approach. Through a neural ODE formulation of transformers, our application of a multilevel parallel-in-time algorithm for the forward and backpropagation phases of training…

When do spectral gradient updates help in deep learning?

arXiv:2512.04299v2 Announce Type: replace Abstract: Spectral gradient methods, such as the recently popularized Muon optimizer, are a promising alternative to standard Euclidean gradient descent for training deep neural networks and transformers, but it is still unclear in which regimes they…