Archives AI News

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

arXiv:2509.22638v1 Announce Type: cross Abstract: LLMs are often trained with RL from human or AI feedback, yet such methods typically compress nuanced feedback into scalar rewards, discarding much of their richness and inducing scale imbalance. We propose treating verbal feedback…

Uncertainty-Aware Knowledge Tracing Models

arXiv:2509.21514v1 Announce Type: new Abstract: The main focus of research on Knowledge Tracing (KT) models is on model developments with the aim of improving predictive accuracy. Most of these models make the most incorrect predictions when students choose a distractor,…

Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models

arXiv:2410.08074v3 Announce Type: replace Abstract: Text-to-image diffusion models rely on massive, web-scale datasets. Training them from scratch is computationally expensive, and as a result, developers often prefer to make incremental updates to existing models. These updates often compose fine-tuning steps…

TRiCo: Triadic Game-Theoretic Co-Training for Robust Semi-Supervised Learning

arXiv:2509.21526v1 Announce Type: new Abstract: We introduce TRiCo, a novel triadic game-theoretic co-training framework that rethinks the structure of semi-supervised learning by incorporating a teacher, two students, and an adversarial generator into a unified training paradigm. Unlike existing co-training or…

Preemptive Detection and Steering of LLM Misalignment via Latent Reachability

arXiv:2509.21528v1 Announce Type: new Abstract: Large language models (LLMs) are now ubiquitous in everyday tools, raising urgent safety concerns about their tendency to generate harmful content. The dominant safety approach — reinforcement learning from human feedback (RLHF) — effectively shapes…

Learnable Kernel Density Estimation for Graphs

arXiv:2505.21285v3 Announce Type: replace Abstract: This work proposes a framework LGKDE that learns kernel density estimation for graphs. The key challenge in graph density estimation lies in effectively capturing both structural patterns and semantic variations while maintaining theoretical guarantees. Combining…

Expert-guided Clinical Text Augmentation via Query-Based Model Collaboration

arXiv:2509.21530v1 Announce Type: new Abstract: Data augmentation is a widely used strategy to improve model robustness and generalization by enriching training datasets with synthetic examples. While large language models (LLMs) have demonstrated strong generative capabilities for this purpose, their applications…

RL-Obfuscation: Can Language Models Learn to Evade Latent-Space Monitors?

arXiv:2506.14261v3 Announce Type: replace Abstract: Latent-space monitors aim to detect undesirable behaviours in Large Language Models by leveraging their internal representations rather than relying solely on black-box outputs. These methods have shown promise in identifying behaviours such as deception and…