Archives AI News

Murphys Laws of AI Alignment: Why the Gap Always Wins

arXiv:2509.05381v3 Announce Type: replace-cross Abstract: We study reinforcement learning from human feedback under misspecification. Sometimes human feedback is systematically wrong on certain types of inputs, like a broken compass that points the wrong way in specific regions. We prove that…

HoloGarment: 360° Novel View Synthesis of In-the-Wild Garments

arXiv:2509.12187v1 Announce Type: cross Abstract: Novel view synthesis (NVS) of in-the-wild garments is a challenging task due to significant occlusions, complex human poses, and cloth deformations. Prior methods rely on synthetic 3D training data consisting of mostly unoccluded and static objects,…

Offline Contextual Bandit with Counterfactual Sample Identification

arXiv:2509.10520v1 Announce Type: new Abstract: In production systems, contextual bandit approaches often rely on direct reward models that take both action and context as input. However, these models can suffer from confounding, making it difficult to isolate the effect of…