Archives AI News

BARD: Bridging AutoRegressive and Diffusion Vision-Language Models Via Highly Efficient Progressive Block Merging and Stage-Wise Distillation

arXiv:2604.16514v2 Announce Type: replace-cross Abstract: Autoregressive vision-language models (VLMs) deliver strong multimodal capability, but their token-by-token decoding imposes a fundamental inference bottleneck. Diffusion VLMs offer a more parallel decoding paradigm, yet directly converting a pretrained autoregressive VLM into a large-block…

April 22, 2026

SAGE-32B: Agentic Reasoning via Iterative Distillation

arXiv:2601.04237v2 Announce Type: replace-cross Abstract: We demonstrate SAGE-32B, a 32 billion parameter language model that focuses on agentic reasoning and long range planning tasks. Unlike chat models that aim for general conversation fluency, SAGE-32B is designed to operate in an…

April 22, 2026

COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition

arXiv:2503.07259v2 Announce Type: replace-cross Abstract: The goal of creating intelligent, human-centered wearable systems for continuous activity understanding faces a fundamental trade-off: Egocentric video-based models capture rich semantic information and have demonstrated strong performance in human activity recognition (HAR), but their…

April 22, 2026

How to Teach Large Multimodal Models New Skills

arXiv:2510.08564v2 Announce Type: replace-cross Abstract: How can we teach large multimodal models (LMMs) new skills without erasing prior abilities? We study sequential fine-tuning on five target skills while monitoring general ability on eight held-out benchmarks across three model families. Surprisingly,…

April 22, 2026

TreeGrad-Ranker: Feature Ranking via $O(L)$-Time Gradients for Decision Trees

arXiv:2602.11623v3 Announce Type: replace Abstract: We revisit the use of probabilistic values, which include the well-known Shapley and Banzhaf values, to rank features for explaining the local predicted values of decision trees. The quality of feature rankings is typically assessed…

April 22, 2026

LEPO: Latent Reasoning Policy Optimization for Large Language Models

arXiv:2604.17892v2 Announce Type: replace Abstract: Recently, latent reasoning has been introduced into large language models (LLMs) to leverage rich information within a continuous space. However, without stochastic sampling, these methods inevitably collapse to deterministic inference, failing to discover diverse reasoning…

April 22, 2026

RESFL: An Uncertainty-Aware Framework for Responsible Federated Learning by Balancing Privacy, Fairness and Utility

arXiv:2503.16251v2 Announce Type: replace Abstract: Federated Learning (FL) has gained prominence in machine learning applications across critical domains by enabling collaborative model training without centralized data aggregation. However, FL frameworks that protect privacy often sacrifice fairness and reliability. Differential privacy…

April 22, 2026

Enabling Vibration-Based Gesture Recognition on Everyday Furniture via Energy-Efficient FPGA Implementation of 1D Convolutional Networks

arXiv:2510.23156v2 Announce Type: replace Abstract: The growing demand for smart home interfaces has increased interest in non-intrusive sensing methods like vibration-based gesture recognition. While prior studies demonstrated feasibility, they often rely on complex preprocessing and large Neural Networks (NNs) requiring…

April 22, 2026

Handling and Interpreting Missing Modalities in Patient Clinical Trajectories via Autoregressive Sequence Modeling

arXiv:2604.18753v1 Announce Type: new Abstract: An active challenge in developing multimodal machine learning (ML) models for healthcare is handling missing modalities during training and deployment. As clinical datasets are inherently temporal and sparse in terms of modality presence, capturing the…

April 22, 2026

Towards Understanding the Robustness of Sparse Autoencoders

arXiv:2604.18756v1 Announce Type: new Abstract: Large Language Models (LLMs) remain vulnerable to optimization-based jailbreak attacks that exploit internal gradient structure. While Sparse Autoencoders (SAEs) are widely used for interpretability, their robustness implications remain underexplored. We present a study of integrating…

April 22, 2026