Archives AI News

DualDiffusion: A Speculative Decoding Strategy for Masked Diffusion Models

arXiv:2604.05250v1 Announce Type: new Abstract: Masked Diffusion Models (MDMs) offer a promising alternative to autoregressive language models by enabling parallel token generation and bidirectional context modeling. However, their inference speed is significantly limited by the inability to cache key-value pairs…

Automatic Replication of LLM Mistakes in Medical Conversations

arXiv:2512.20983v2 Announce Type: replace-cross Abstract: Large language models (LLMs) are increasingly evaluated in clinical settings using multi-dimensional rubrics that quantify reasoning quality, safety, and patient-centeredness. Yet replicating specific mistakes in other LLMs is not straightforward and often requires manual…

Quantization-Robust LLM Unlearning via Low-Rank Adaptation

arXiv:2602.13151v3 Announce Type: replace Abstract: Large Language Model (LLM) unlearning aims to remove targeted knowledge from a trained model, but practical deployments often require post-training quantization (PTQ) for efficient inference. However, aggressive low-bit PTQ can mask unlearning updates, causing quantized…

Retrieval Augmented Time Series Forecasting

arXiv:2411.08249v2 Announce Type: replace Abstract: Retrieval-augmented generation (RAG) is a central component of modern LLM systems, particularly in scenarios where up-to-date information is crucial for accurately responding to user queries or when queries exceed the scope of the training data.…