Archives AI News

Generative diffusion models for spatiotemporal influenza forecasting

arXiv:2604.24913v1 Announce Type: new Abstract: Forecasting infectious disease incidence can provide important information to guide public health planning, yet is difficult because epidemic dynamics are complex. Current mechanistic and statistical approaches often struggle to capture multimodal uncertainty or emergent trends.…

April 29, 2026

Revisiting the Past: Data Unlearning with Model State History

arXiv:2506.20941v3 Announce Type: replace Abstract: Large language models are trained on massive corpora of web data, which may include private data, copyrighted material, factually inaccurate data, or data that degrades model performance. Eliminating the influence of such problematic datapoints on…

April 29, 2026

A Unifying Framework for Unsupervised Concept Extraction

arXiv:2604.24936v1 Announce Type: new Abstract: Techniques for concept extraction, such as sparse autoencoders and transcoders, aim to extract high-level symbolic concepts from low-level nonsymbolic representations. When these extracted concepts are used for downstream tasks such as model steering and unlearning,…

April 29, 2026

Rethinking Entropy Interventions in RLVR: An Entropy Change Perspective

arXiv:2510.10150v3 Announce Type: replace Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) serves as a cornerstone technique for enhancing the reasoning capabilities of Large Language Models (LLMs). However, its training is often plagued by entropy collapse, a rapid decline in policy…

April 29, 2026

Rethinking Layer Redundancy in Large Language Models: Calibration Objectives and Search for Depth Pruning

arXiv:2604.24938v1 Announce Type: new Abstract: Depth pruning improves the inference efficiency of large language models by removing Transformer blocks. Prior work has focused on importance criteria and search algorithms, often treating layer redundancy as an inherent structural property of pretrained…

April 29, 2026

Evaluating LLM Safety Under Repeated Inference via Accelerated Prompt Stress Testing

arXiv:2602.11786v2 Announce Type: replace Abstract: Traditional benchmarks for large language models (LLMs), such as HELM and AIR-BENCH, primarily assess safety through breadth-oriented evaluation across diverse tasks and risk categories. However, real-world deployment often exposes a different class of risk: operational…

April 29, 2026

Enabling privacy-preserving AI training on everyday devices

A new method could bring more accurate and efficient AI models to high-stakes applications like health care and finance, even in under-resourced settings.

April 29, 2026

Improving Diversity in Black-box Few-shot Knowledge Distillation

arXiv:2604.25795v1 Announce Type: cross Abstract: Knowledge distillation (KD) is a well-known technique to effectively compress a large network (teacher) to a smaller network (student) with little sacrifice in performance. However, most KD methods require a large training set and internal…

April 29, 2026

Is the Modality Gap a Bug or a Feature? A Robustness Perspective

arXiv:2603.29080v2 Announce Type: replace-cross Abstract: Many modern multi-modal models (e.g. CLIP) seek an embedding space in which the two modalities are aligned. Somewhat surprisingly, almost all existing models show a strong modality gap: the distribution of images is well-separated from…

April 29, 2026

Egocentric Tactile and Proximity Sensors as Observation Priors for Humanoid Collision Avoidance

arXiv:2604.25554v1 Announce Type: cross Abstract: Collision-free motion is often aided by tactile and proximity sensors distributed on the body of the robot due to their resistance to occlusion as opposed to external cameras. However, how to shape the sensor’s properties,…

April 29, 2026