Archives AI News

BARD: Bridging AutoRegressive and Diffusion Vision-Language Models Via Highly Efficient Progressive Block Merging and Stage-Wise Distillation

arXiv:2604.16514v2 Announce Type: replace-cross Abstract: Autoregressive vision-language models (VLMs) deliver strong multimodal capability, but their token-by-token decoding imposes a fundamental inference bottleneck. Diffusion VLMs offer a more parallel decoding paradigm, yet directly converting a pretrained autoregressive VLM into a large-block…
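The "token-by-token" bottleneck the abstract refers to can be made concrete with a minimal sketch of greedy autoregressive decoding (generic illustration only, not the paper's method; `next_token` is a hypothetical stand-in for a model's next-token prediction):

```python
def greedy_decode(next_token, prompt, max_new=8, eos=-1):
    # Autoregressive decoding: each new token depends on ALL tokens
    # generated so far, so the steps are inherently sequential and
    # cannot be run in parallel -- the bottleneck diffusion decoders
    # aim to relax by predicting blocks of tokens at once.
    tokens = list(prompt)
    for _ in range(max_new):
        t = next_token(tokens)  # one full model call per token
        if t == eos:
            break
        tokens.append(t)
    return tokens

# Toy "model" that just counts upward from the last token.
out = greedy_decode(lambda toks: toks[-1] + 1, [0], max_new=3)
```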

Task Switching Without Forgetting via Proximal Decoupling

arXiv:2604.18857v1 Announce Type: new Abstract: In continual learning, the primary challenge is to learn new information without forgetting old knowledge. A common solution addresses this trade-off through regularization, penalizing changes to parameters critical for previous tasks. In most cases, this…
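The regularization approach the abstract describes is typified by EWC-style penalties: a quadratic term anchoring parameters that were important for earlier tasks. A minimal NumPy sketch of that generic idea (the toy loss and importance weights below are illustrative, not from the paper):

```python
import numpy as np

def penalized_loss(params, task_loss, old_params, importance, lam=1.0):
    # New-task loss plus a quadratic penalty that discourages moving
    # parameters deemed important for previous tasks (EWC-style).
    penalty = np.sum(importance * (params - old_params) ** 2)
    return task_loss(params) + lam * penalty

# Toy example: the new task pulls both parameters toward 1, but the
# first parameter (high importance) is anchored near its old value 0.
old = np.zeros(2)
imp = np.array([10.0, 0.0])            # only parameter 0 mattered before
loss = lambda p: np.sum((p - 1.0) ** 2)

moved = penalized_loss(np.ones(2), loss, old, imp)   # large penalty
stayed = penalized_loss(old, loss, old, imp)         # no penalty, task loss only
```

Moving the important parameter is heavily penalized (10.0) while staying put only costs the unregularized task loss (2.0), illustrating the trade-off the abstract calls out.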

Best Agent Identification for General Game Playing

arXiv:2507.00451v2 Announce Type: replace Abstract: We present an efficient and generalised procedure to accurately identify the best (or near best) performing algorithm for each sub-task in a multi-problem domain. Our approach treats this as a set of best arm identification…
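As a point of reference, the simplest best-arm identification baseline allocates a fixed pull budget uniformly across arms and returns the best empirical mean. This is a generic sketch of that baseline, not the paper's procedure; the three "agents" and their win rates are hypothetical:

```python
import random

def identify_best_arm(pull, n_arms, budget):
    # Uniform-allocation best-arm identification: spend the budget
    # evenly, then pick the arm with the highest empirical mean reward.
    per_arm = budget // n_arms
    means = []
    for a in range(n_arms):
        rewards = [pull(a) for _ in range(per_arm)]
        means.append(sum(rewards) / per_arm)
    return max(range(n_arms), key=lambda a: means[a])

# Toy setting: three agents with Bernoulli win rates; agent 1 is best.
random.seed(0)
true_means = [0.2, 0.8, 0.5]
best = identify_best_arm(lambda a: random.random() < true_means[a], 3, 3000)
```

Adaptive strategies (e.g. successive elimination) improve on this by dropping clearly inferior arms early and reallocating their budget.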

ParamBoost: Gradient Boosted Piecewise Cubic Polynomials

arXiv:2604.18864v1 Announce Type: new Abstract: Generalized Additive Models (GAMs) can be used to create non-linear glass-box (i.e. explicitly interpretable) models, where the predictive function is fully observable over the complete input space. However, glass-box interpretability itself does not allow for…
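For intuition about the boosting side, here is a minimal gradient-boosting loop that fits piecewise-constant "stumps" to residuals; the paper's method uses piecewise cubic polynomials instead, so this is only a generic sketch of the boosting mechanism, not ParamBoost itself:

```python
import numpy as np

def boost_stumps(x, y, n_rounds=50, lr=0.5):
    # Gradient boosting for squared loss on 1-D input: each round fits
    # the best single-threshold step function to the current residuals
    # and adds a damped copy of it to the running prediction.
    pred = np.zeros_like(y)
    model = []
    for _ in range(n_rounds):
        resid = y - pred
        best = None
        for t in np.unique(x):
            left, right = resid[x <= t], resid[x > t]
            lv = left.mean() if left.size else 0.0
            rv = right.mean() if right.size else 0.0
            sse = ((left - lv) ** 2).sum() + ((right - rv) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, t, lv, rv)
        _, t, lv, rv = best
        pred += lr * np.where(x <= t, lv, rv)
        model.append((t, lv, rv))
    return pred, model

# Toy fit: a step function is recovered almost exactly.
x = np.arange(10.0)
y = (x >= 5).astype(float)
pred, model = boost_stumps(x, y)
```

The additive structure of the fitted model (a sum of simple shape functions per feature) is what makes GAM-style boosted models glass-box: each component can be plotted and inspected on its own.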

How does the optimizer implicitly bias the model merging loss landscape?

arXiv:2510.04686v2 Announce Type: replace Abstract: Model merging combines independent solutions with different capabilities into a single one while maintaining the same inference cost. Two popular approaches are linear interpolation, which simply averages multiple model weights, and task arithmetic, which combines…
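The two merging schemes named in the abstract are simple enough to state directly. A minimal NumPy sketch of both (weights are flattened into vectors for illustration; the toy models below are hypothetical):

```python
import numpy as np

def linear_interpolate(weights_list, coeffs=None):
    # Linear interpolation: a (possibly weighted) average of the
    # weight vectors of independently trained models.
    if coeffs is None:
        coeffs = [1.0 / len(weights_list)] * len(weights_list)
    return sum(c * w for c, w in zip(coeffs, weights_list))

def task_arithmetic(base, finetuned_list, alpha=1.0):
    # Task arithmetic: form task vectors (finetuned - base) and add
    # their scaled sum back onto the shared base model's weights.
    task_vectors = [w - base for w in finetuned_list]
    return base + alpha * sum(task_vectors)

# Toy example: two fine-tunes that each move one coordinate of the base.
base = np.zeros(3)
ft_a = np.array([1.0, 0.0, 0.0])
ft_b = np.array([0.0, 1.0, 0.0])
avg = linear_interpolate([ft_a, ft_b])       # halves each change
merged = task_arithmetic(base, [ft_a, ft_b]) # keeps both changes in full
```

Either way, the merged model is a single set of weights, so inference cost matches a single model, as the abstract notes.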

Subgraph Concept Networks: Concept Levels in Graph Classification

arXiv:2604.18868v1 Announce Type: new Abstract: The reasoning process of Graph Neural Networks is complex and considered opaque, limiting trust in their predictions. To alleviate this issue, prior work has proposed concept-based explanations, extracted from clusters in the model’s node embeddings.…
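The prior work mentioned (concepts from clusters in node embeddings) typically amounts to running a clustering algorithm such as k-means over the embedding matrix and treating each cluster as a candidate concept. A minimal k-means sketch of that generic pipeline (not the paper's subgraph-level method; the 2-D embeddings below are synthetic):

```python
import numpy as np

def extract_concepts(embeddings, k, n_iter=20, seed=0):
    # Lloyd's k-means over node embeddings: each cluster centre is a
    # candidate "concept" and each node gets a concept assignment.
    rng = np.random.default_rng(seed)
    centers = embeddings[rng.choice(len(embeddings), k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(embeddings[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # skip empty clusters
                centers[j] = embeddings[labels == j].mean(axis=0)
    return labels, centers

# Synthetic embeddings: two well-separated groups of nodes.
emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, centers = extract_concepts(emb, k=2)
```

Inspecting the nodes (or subgraphs) assigned to each cluster is then what turns these clusters into human-readable explanations.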