Archives AI News

Generative diffusion models for spatiotemporal influenza forecasting

arXiv:2604.24913v1 Announce Type: new Abstract: Forecasting infectious disease incidence can provide important information to guide public health planning, yet is difficult because epidemic dynamics are complex. Current mechanistic and statistical approaches often struggle to capture multimodal uncertainty or emergent trends.…

Revisiting the Past: Data Unlearning with Model State History

arXiv:2506.20941v3 Announce Type: replace Abstract: Large language models are trained on massive corpora of web data, which may include private data, copyrighted material, factually inaccurate data, or data that degrades model performance. Eliminating the influence of such problematic datapoints on…

A Unifying Framework for Unsupervised Concept Extraction

arXiv:2604.24936v1 Announce Type: new Abstract: Techniques for concept extraction, such as sparse autoencoders and transcoders, aim to extract high-level symbolic concepts from low-level nonsymbolic representations. When these extracted concepts are used for downstream tasks such as model steering and unlearning,…

Rethinking Entropy Interventions in RLVR: An Entropy Change Perspective

arXiv:2510.10150v3 Announce Type: replace Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) serves as a cornerstone technique for enhancing the reasoning capabilities of Large Language Models (LLMs). However, its training is often plagued by entropy collapse, a rapid decline in policy…

Evaluating LLM Safety Under Repeated Inference via Accelerated Prompt Stress Testing

arXiv:2602.11786v2 Announce Type: replace Abstract: Traditional benchmarks for large language models (LLMs), such as HELM and AIR-BENCH, primarily assess safety through breadth-oriented evaluation across diverse tasks and risk categories. However, real-world deployment often exposes a different class of risk: operational…

Improving Diversity in Black-box Few-shot Knowledge Distillation

arXiv:2604.25795v1 Announce Type: cross Abstract: Knowledge distillation (KD) is a well-known technique to effectively compress a large network (teacher) to a smaller network (student) with little sacrifice in performance. However, most KD methods require a large training set and internal…

Is the Modality Gap a Bug or a Feature? A Robustness Perspective

arXiv:2603.29080v2 Announce Type: replace-cross Abstract: Many modern multi-modal models (e.g. CLIP) seek an embedding space in which the two modalities are aligned. Somewhat surprisingly, almost all existing models show a strong modality gap: the distribution of images is well-separated from…