Archives AI News

Do LLMs Follow Their Own Rules? A Reflexive Audit of Self-Stated Safety Policies

arXiv:2604.09189v1 Announce Type: cross Abstract: LLMs internalize safety policies through RLHF, yet these policies are never formally specified and remain difficult to inspect. Existing benchmarks evaluate models against external standards but do not measure whether models understand and enforce their…

April 13, 2026

Automatic Self-supervised Learning for Social Recommendations

arXiv:2412.18735v3 Announce Type: replace-cross Abstract: In recent years, researchers have leveraged social relations to enhance recommendation performance. However, most existing social recommendation methods require carefully designed auxiliary social tasks tailored to specific scenarios, which depend heavily on domain knowledge and…

April 13, 2026

Bandwidth-constrained Variational Message Encoding for Cooperative Multi-agent Reinforcement Learning

arXiv:2512.11179v3 Announce Type: replace Abstract: Graph-based multi-agent reinforcement learning (MARL) enables coordinated behavior under partial observability by modeling agents as nodes and communication links as edges. While recent methods excel at learning sparse coordination graphs (determining who communicates with whom), they do…

April 13, 2026

ALMAB-DC: Active Learning, Multi-Armed Bandits, and Distributed Computing for Sequential Experimental Design and Black-Box Optimization

arXiv:2603.21180v3 Announce Type: replace Abstract: Sequential experimental design under expensive, gradient-free objectives is a central challenge in computational statistics: evaluation budgets are tightly constrained and information must be extracted efficiently from each observation. We propose ALMAB-DC, a GP-based sequential design…

April 13, 2026

Sim-to-Real Transfer for Muscle-Actuated Robots via Generalized Actuator Networks

arXiv:2604.09487v1 Announce Type: cross Abstract: Tendon drives paired with soft muscle actuation enable faster and safer robots while potentially accelerating skill acquisition. Still, these systems are rarely used in practice due to inherent nonlinearities, friction, and hysteresis, which complicate modeling…

April 13, 2026

Task-agnostic Low-rank Residual Adaptation for Efficient Federated Continual Fine-Tuning

arXiv:2505.12318v2 Announce Type: replace Abstract: Federated Parameter-Efficient Fine-Tuning (Fed-PEFT) enables lightweight adaptation of large pre-trained models in federated learning settings by updating only a small subset of parameters. However, Fed-PEFT methods typically assume a fixed label space and static downstream…

April 13, 2026

Online Quantile Regression for Nonparametric Additive Models

arXiv:2604.08969v1 Announce Type: cross Abstract: This paper introduces a projected functional gradient descent algorithm (P-FGD) for training nonparametric additive quantile regression models in online settings. This algorithm extends the functional stochastic gradient descent framework to the pinball loss. An advantage…

April 13, 2026
