Archives AI News

Demystifying Domain-adaptive Post-training for Financial LLMs

arXiv:2501.04961v3 Announce Type: replace-cross Abstract: Domain-adaptive post-training of large language models (LLMs) has emerged as a promising approach for specialized domains such as medicine and finance. However, significant challenges remain in identifying optimal adaptation criteria and training strategies across varying…

September 29, 2025

GraphPFN: A Prior-Data Fitted Graph Foundation Model

arXiv:2509.21489v1 Announce Type: new Abstract: Foundation models pretrained on large-scale datasets have transformed such fields as natural language processing and computer vision, but their application to graph data remains limited. Recently emerged graph foundation models, such as G2T-FM, utilize tabular…

September 29, 2025

MUCAR: Benchmarking Multilingual Cross-Modal Ambiguity Resolution for Multimodal Large Language Models

arXiv:2506.17046v2 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) have demonstrated significant advances across numerous vision-language tasks. MLLMs have shown promising capability in aligning visual and textual modalities, allowing them to process image-text pairs with clear and explicit meanings.…

September 29, 2025

SlimDiff: Training-Free, Activation-Guided Hands-free Slimming of Diffusion Models

arXiv:2509.21498v1 Announce Type: new Abstract: Diffusion models (DMs), lauded for their generative performance, are computationally prohibitive due to their billion-scale parameters and iterative denoising dynamics. Existing efficiency techniques, such as quantization, timestep reduction, or pruning, offer savings in compute, memory,…

September 29, 2025

Learnable Conformal Prediction with Context-Aware Nonconformity Functions for Robotic Planning and Perception

arXiv:2509.21955v1 Announce Type: cross Abstract: Deep learning models in robotics often output point estimates with poorly calibrated confidences, offering no native mechanism to quantify predictive reliability under novel, noisy, or out-of-distribution inputs. Conformal prediction (CP) addresses this gap by providing…

September 29, 2025

Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training

arXiv:2509.21500v1 Announce Type: new Abstract: Reinforcement fine-tuning (RFT) often suffers from reward over-optimization, where a policy model hacks the reward signals to achieve high scores while producing low-quality outputs. Our theoretical analysis shows that the key lies in reward misspecification…

September 29, 2025

HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models

arXiv:2509.22300v1 Announce Type: cross Abstract: While diffusion models have made remarkable progress in image generation, their outputs can still appear unrealistic and lack fine details, especially when using fewer neural function evaluations (NFEs) or lower guidance scales. To…

September 29, 2025

Contrastive Mutual Information Learning: Toward Robust Representations without Positive-Pair Augmentations

arXiv:2509.21511v1 Announce Type: new Abstract: Learning representations that transfer well to diverse downstream tasks remains a central challenge in representation learning. Existing paradigms — contrastive learning, self-supervised masking, and denoising auto-encoders — balance this challenge with different trade-offs. We introduce…

September 29, 2025

Smoothing-Based Conformal Prediction for Balancing Efficiency and Interpretability

arXiv:2509.22529v1 Announce Type: cross Abstract: Conformal Prediction (CP) is a distribution-free framework for constructing statistically rigorous prediction sets. While popular variants such as CD-split improve CP’s efficiency, they often yield prediction sets composed of multiple disconnected subintervals, which are difficult…

September 29, 2025

DistillKac: Few-Step Image Generation via Damped Wave Equations

arXiv:2509.21513v1 Announce Type: new Abstract: We present DistillKac, a fast image generator that uses the damped wave equation and its stochastic Kac representation to move probability mass at finite speed. In contrast to diffusion models whose reverse time velocities can…

September 29, 2025