Archives AI News

Weight Weaving: Parameter Pooling for Data-Free Model Merging

arXiv:2510.13921v1 Announce Type: new Abstract: Model merging provides a cost-effective and data-efficient combination of specialized deep neural networks through parameter integration. This technique leverages expert models across downstream tasks without requiring retraining. Most model merging approaches critically depend on scaling…

Symmetry-Aware GFlowNets

arXiv:2506.02685v3 Announce Type: replace-cross Abstract: Generative Flow Networks (GFlowNets) offer a powerful framework for sampling graphs in proportion to their rewards. However, existing approaches suffer from systematic biases due to inaccuracies in state transition probability computations. These biases, rooted in…

K-frames: Scene-Driven Any-k Keyframe Selection for long video understanding

arXiv:2510.13891v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have demonstrated significant capabilities in image understanding, but long-video understanding is constrained by context windows and computational cost. Uniform frame sampling often leads to substantial information loss. Meanwhile, existing keyframe selection…

Joint Discriminative-Generative Modeling via Dual Adversarial Training

arXiv:2510.13872v1 Announce Type: new Abstract: Simultaneously achieving robust classification and high-fidelity generative modeling within a single framework presents a significant challenge. Hybrid approaches, such as Joint Energy-Based Models (JEM), interpret classifiers as EBMs but are often limited by the instability…

Deep Edge Filter: Return of the Human-Crafted Layer in Deep Learning

arXiv:2510.13865v1 Announce Type: new Abstract: We introduce the Deep Edge Filter, a novel approach that applies high-pass filtering to deep neural network features to improve model generalizability. Our method is motivated by our hypothesis that neural networks encode task-relevant semantic…

Thompson Sampling via Fine-Tuning of LLMs

arXiv:2510.13328v2 Announce Type: replace Abstract: Bayesian optimization in large unstructured discrete spaces is often hindered by the computational cost of maximizing acquisition functions due to the absence of gradients. We propose a scalable alternative based on Thompson sampling that eliminates…

LTR-ICD: A Learning-to-Rank Approach for Automatic ICD Coding

arXiv:2510.13922v1 Announce Type: new Abstract: Clinical notes contain unstructured text provided by clinicians during patient encounters. These notes are usually accompanied by a sequence of diagnostic codes following the International Classification of Diseases (ICD). Correctly assigning and ordering ICD codes…