Archives AI News

Binned semiparametric Bayesian networks for efficient kernel density estimation

arXiv:2506.21997v3 Announce Type: replace Abstract: This paper introduces a new type of probabilistic semiparametric model that takes advantage of data binning to reduce the computational cost of kernel density estimation in nonparametric distributions. Two new conditional probability distributions are developed…
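
The binning idea can be illustrated with a generic one-dimensional sketch. This is not the paper's model (the truncated abstract does not describe its conditional distributions); it only shows the general principle. The data are first counted into equal-width bins, and the Gaussian kernel sum is then taken over bin centers weighted by their counts, so each density evaluation costs on the order of the number of bins rather than the number of samples. The function names, the bandwidth, and the bin count are illustrative assumptions.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def exact_kde(x, data, h):
    """Plain KDE: one kernel evaluation per sample."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)


def bin_data(data, lo, hi, n_bins):
    """Count samples into equal-width bins; return (centers, counts)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for xi in data:
        # Clamp so that a sample equal to `hi` falls into the last bin.
        idx = min(int((xi - lo) / width), n_bins - 1)
        counts[idx] += 1
    centers = [lo + (j + 0.5) * width for j in range(n_bins)]
    return centers, counts


def binned_kde(x, centers, counts, h):
    """Approximate KDE: one kernel evaluation per non-empty bin."""
    n = sum(counts)
    total = sum(
        c * gaussian_kernel((x - g) / h) for g, c in zip(centers, counts) if c
    )
    return total / (n * h)


random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(5000)]
h = 0.3  # hypothetical bandwidth, chosen only for illustration
centers, counts = bin_data(data, min(data), max(data), n_bins=200)

exact = exact_kde(0.5, data, h)              # 5000 kernel evaluations
approx = binned_kde(0.5, centers, counts, h)  # at most 200 kernel evaluations
```

With bins much narrower than the bandwidth, `approx` stays close to `exact` while the per-query cost no longer grows with the sample size.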

Informed Machine Learning with Knowledge Landmarks

arXiv:2604.00256v1 Announce Type: new Abstract: Informed Machine Learning has emerged as a viable generalization of Machine Learning (ML) by building a unified conceptual and algorithmic setting for constructing models on a unified basis of knowledge and data. Physics-informed ML involving…

Regularizing Extrapolation in Causal Inference

arXiv:2509.17180v3 Announce Type: replace Abstract: Many common estimators in machine learning and causal inference are linear smoothers, where the prediction is a weighted average of the training outcomes. Some estimators, such as ordinary least squares and kernel ridge regression, allow…
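
To make "a weighted average of the training outcomes" concrete, here is a minimal sketch for one-dimensional ordinary least squares with an intercept. In that case the weights have the standard closed form w_i(x) = 1/n + (x - x_bar)(x_i - x_bar)/S_xx, and the prediction is the sum of w_i(x) * y_i. The toy data and function names are made up for illustration and do not come from the paper.

```python
def ols_weights(x_train, x_query):
    """Smoother weights w_i(x_query) for simple OLS with an intercept."""
    n = len(x_train)
    x_bar = sum(x_train) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x_train)
    return [1.0 / n + (x_query - x_bar) * (xi - x_bar) / s_xx for xi in x_train]


def ols_fit(x_train, y_train):
    """Usual closed-form intercept and slope, used here as a cross-check."""
    n = len(x_train)
    x_bar = sum(x_train) / n
    y_bar = sum(y_train) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x_train)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x_train, y_train))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical toy data.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [1.1, 2.9, 5.2, 6.8, 9.1]

x_query = 6.0  # lies outside the training range [0, 4]
weights = ols_weights(x_train, x_query)
smoother_pred = sum(w * y for w, y in zip(weights, y_train))

intercept, slope = ols_fit(x_train, y_train)
direct_pred = intercept + slope * x_query
```

Both routes give the same prediction. In this toy example the weights sum to one, and because the query point lies outside the training range, the weights on the leftmost training points are negative.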

Hierarchical Apprenticeship Learning from Imperfect Demonstrations with Evolving Rewards

arXiv:2604.00258v1 Announce Type: new Abstract: While apprenticeship learning has shown promise for inducing effective pedagogical policies directly from student interactions in e-learning environments, most existing approaches rely on optimal or near-optimal expert demonstrations under a fixed reward. Real-world student interactions,…

CRoPE: Efficient Parametrization of Rotary Positional Embedding

arXiv:2601.02728v2 Announce Type: replace Abstract: Rotary positional embedding has become the state-of-the-art approach to encode position information in transformer-based models. While it is often succinctly expressed in complex linear algebra, we note that the actual implementation of $Q/K/V$-projections is not…
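
The complex-number formulation mentioned here can be sketched in a few lines: consecutive pairs of real features are treated as one complex number and multiplied by a unit-modulus phase that depends on the token position. The sketch shows the standard rotary embedding applied to a query and a key, not the CRoPE parametrization, which the truncated abstract does not specify. The frequency base of 10000 and the helper names are illustrative assumptions.

```python
import cmath


def rope(vec, position, base=10000.0):
    """Rotate each complex feature j by the phase position * base**(-j/d)."""
    d = len(vec)
    return [
        z * cmath.exp(1j * position * base ** (-j / d)) for j, z in enumerate(vec)
    ]


def score(q, k):
    """Real dot product of the underlying real vectors: Re(sum q_j * conj(k_j))."""
    return sum((qj * kj.conjugate()).real for qj, kj in zip(q, k))


# Two complex features each, i.e. four real dimensions (hypothetical values).
q = [1.0 + 2.0j, -0.5 + 0.3j]
k = [0.7 - 1.1j, 0.2 + 0.9j]

s_a = score(rope(q, 3), rope(k, 1))    # positions 3 and 1, offset 2
s_b = score(rope(q, 10), rope(k, 8))   # positions 10 and 8, offset 2
```

Because each phase has modulus one, the rotation preserves vector norms, and the two scores agree up to floating-point error: the attention score depends only on the relative offset between the two positions.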

Mousse: Rectifying the Geometry of Muon with Curvature-Aware Preconditioning

arXiv:2603.09697v2 Announce Type: replace Abstract: Recent advances in spectral optimization, notably Muon, have demonstrated that constraining update steps to the Stiefel manifold can significantly accelerate training and improve generalization. However, Muon implicitly assumes an isotropic optimization landscape, enforcing a uniform…