Archives AI News

MCD: Marginal Contrastive Discrimination for conditional density estimation

arXiv:2206.01592v2 Announce Type: replace-cross Abstract: We consider the problem of conditional density estimation, which is a major topic of interest in the fields of statistical and machine learning. Our method, called Marginal Contrastive Discrimination (MCD), reformulates the conditional density function…

Can Optimal Transport Improve Federated Inverse Reinforcement Learning?

arXiv:2601.00309v1 Announce Type: new Abstract: In robotics and multi-agent systems, fleets of autonomous agents often operate in subtly different environments while pursuing a common high-level objective. Directly pooling their data to learn a shared reward function is typically impractical due…

Mitigating optimistic bias in entropic risk estimation and optimization

arXiv:2409.19926v4 Announce Type: replace-cross Abstract: The entropic risk measure is widely used in high-stakes decision-making across economics, management science, finance, and safety-critical control systems because it captures tail risks associated with uncertain losses. However, when data are limited, the empirical…

Quantum King-Ring Domination in Chess: A QAOA Approach

arXiv:2601.00318v1 Announce Type: new Abstract: The Quantum Approximate Optimization Algorithm (QAOA) is extensively benchmarked on synthetic random instances such as MaxCut, TSP, and SAT problems, but these lack semantic structure and human interpretability, offering limited insight into performance on real-world…

Smart Fault Detection in Nanosatellite Electrical Power System

arXiv:2601.00335v1 Announce Type: new Abstract: This paper presents a new method for detecting faults in the electrical power system of nanosatellites without an Attitude Determination Control Subsystem (ADCS) in low Earth orbit (LEO). Each part of this system is at risk of fault due…

The Curse of Depth in Large Language Models

arXiv:2502.05795v3 Announce Type: replace Abstract: In this paper, we introduce the Curse of Depth, a concept that highlights, explains, and addresses the recent observation in modern Large Language Models (LLMs) where nearly half of the layers are less effective than…

Flattening Hierarchies with Policy Bootstrapping

arXiv:2505.14975v3 Announce Type: replace Abstract: Offline goal-conditioned reinforcement learning (GCRL) is a promising approach for pretraining generalist policies on large datasets of reward-free trajectories, akin to the self-supervised objectives used to train foundation models for computer vision and natural language…