Steer Like the LLM: Activation Steering that Mimics Prompting

arXiv:2605.03907v1 Announce Type: cross Abstract: Large language models can be steered at inference time through prompting or activation interventions, but activation steering methods often underperform compared to prompt-based approaches. We propose a framework that formulates prompt steering as a form…

ZeRO-Prefill: Zero Redundancy Overheads in MoE Prefill Serving

arXiv:2605.02960v1 Announce Type: new Abstract: Production LLM workloads increasingly serve discriminative tasks, such as classification, recommendation, and verification, whose answers are read from the logits of a single prefill pass with no autoregressive decoding. Serving these prefill-only workloads on mixture-of-experts…

Fisher Decorator: Refining Flow Policy via a Local Transport Map

arXiv:2604.17919v2 Announce Type: replace Abstract: Recent advances in flow-based offline reinforcement learning (RL) have achieved strong performance by parameterizing policies via flow matching. However, they still face critical trade-offs among expressiveness, optimality, and efficiency. In particular, existing flow policies interpret…

InvisibleInk: High-Utility and Low-Cost Text Generation with Differential Privacy

arXiv:2507.02974v3 Announce Type: replace Abstract: As major progress in LLM-based long-form text generation enables paradigms such as retrieval-augmented generation (RAG) and inference-time scaling, safely incorporating private information into the generation remains a critical open question. We present InvisibleInk, a highly…