Robust Multi-Objective Controlled Decoding of Large Language Models
arXiv:2503.08796v2 (announce type: replace)

Abstract: We introduce Robust Multi-Objective Decoding (RMOD), a novel inference-time algorithm that robustly aligns Large Language Models (LLMs) to multiple human objectives (e.g., instruction-following, helpfulness, safety) by maximizing the worst-case rewards. RMOD formulates the robust decoding…
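
To make "maximizing the worst-case rewards" concrete, here is a minimal sketch of max-min selection over candidate responses. It is not the RMOD algorithm itself, whose formulation is cut off in the abstract above. The reward values, the objective keys, and the helper names `worst_case_reward`, `pick_max_min`, and `pick_max_mean` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical per-objective reward scores for three candidate responses.
# The numbers are made up for illustration; they are not results from the paper.
candidates: List[Dict[str, float]] = [
    {"instruction_following": 0.9, "helpfulness": 0.8, "safety": 0.1},
    {"instruction_following": 0.6, "helpfulness": 0.5, "safety": 0.6},
    {"instruction_following": 0.7, "helpfulness": 0.9, "safety": 0.4},
]


def worst_case_reward(rewards: Dict[str, float]) -> float:
    """Return the lowest reward across all objectives for one candidate."""
    return min(rewards.values())


def pick_max_min(cands: List[Dict[str, float]]) -> int:
    """Index of the candidate whose worst objective score is highest."""
    return max(range(len(cands)), key=lambda i: worst_case_reward(cands[i]))


def pick_max_mean(cands: List[Dict[str, float]]) -> int:
    """Index of the candidate with the highest uniform average reward."""
    return max(
        range(len(cands)),
        key=lambda i: sum(cands[i].values()) / len(cands[i]),
    )


robust_choice = pick_max_min(candidates)    # balances all objectives
average_choice = pick_max_mean(candidates)  # may sacrifice one objective

print("max-min choice:", robust_choice)
print("max-mean choice:", average_choice)
```

In this toy data the two rules disagree: averaging prefers a candidate with a comparatively weak safety score, while the max-min rule prefers the candidate whose weakest objective is strongest. That is the kind of trade-off a worst-case objective is meant to guard against.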
