Archives AI News

GRPO-λ: Credit Assignment improves LLM Reasoning

arXiv:2510.00194v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly deployed for tasks requiring complex reasoning, prompting significant interest in improving their reasoning abilities through post-training. In particular, RL-based methods using verifiable rewards, such as the state-of-the-art GRPO, have shown…
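The "verifiable reward" setup the abstract refers to can be illustrated with GRPO's core step: each completion's binary correctness reward is normalized against the group of completions sampled for the same prompt. A minimal sketch (values illustrative; not the paper's GRPO-λ credit-assignment scheme, which the truncated abstract does not describe):

```python
import numpy as np

def grpo_advantages(rewards):
    """Group-relative advantages as in GRPO: normalize each sampled
    completion's reward against the group drawn for the same prompt."""
    r = np.asarray(rewards, dtype=float)
    std = r.std()
    if std == 0:
        return np.zeros_like(r)  # all completions tied: no learning signal
    return (r - r.mean()) / std

# Verifiable rewards for 4 completions of one prompt (1 = verified correct)
adv = grpo_advantages([1.0, 0.0, 0.0, 1.0])
print(adv)  # correct completions get positive advantage, incorrect negative
```

Because the baseline is the group mean rather than a learned value function, no critic network is needed.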

LoRAFusion: Efficient LoRA Fine-Tuning for LLMs

arXiv:2510.00206v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) has become the leading Parameter-Efficient Fine-Tuning (PEFT) method for Large Language Models (LLMs), as it significantly reduces GPU memory usage while maintaining competitive fine-tuned model quality on downstream tasks. Despite these benefits,…
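The memory saving the abstract mentions comes from LoRA's structure: the pretrained weight stays frozen and only a low-rank correction is trained. A minimal sketch of that idea (dimensions and rank are illustrative, not LoRAFusion's specifics):

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 64, 64, 4          # layer dims and LoRA rank (illustrative)
W = rng.normal(size=(d, k))  # frozen pretrained weight

# LoRA trains only a low-rank update delta_W = B @ A
A = rng.normal(scale=0.01, size=(r, k))  # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-init
                                         # so delta_W starts at exactly 0

x = rng.normal(size=(k,))
y = W @ x + B @ (A @ x)  # forward: frozen path plus low-rank correction

# Trainable parameter count: r*(d+k) for LoRA vs d*k for full fine-tuning
print(r * (d + k), "vs", d * k)
```

With the zero-initialized `B`, the adapted layer is numerically identical to the pretrained one at step 0, so fine-tuning starts from the base model's behavior.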

Training-free LLM Verification via Recycling Few-shot Examples

arXiv:2506.17251v2 Announce Type: replace Abstract: Although LLMs have achieved remarkable performance, the inherent stochasticity of their reasoning process and varying conclusions present significant challenges. Majority voting or Best-of-N with external verification models has been explored to find the most promising…
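Majority voting, the baseline the abstract contrasts against, amounts to sampling N reasoning traces and returning the most frequent final answer. A minimal sketch (this is the baseline, not the paper's recycling-based verification method):

```python
from collections import Counter

def majority_vote(answers):
    """Self-consistency / majority voting: pick the most frequent final
    answer across N independently sampled reasoning traces."""
    counts = Counter(answers)
    answer, _ = counts.most_common(1)[0]
    return answer

# Final answers extracted from five sampled chains of thought
best = majority_vote(["42", "41", "42", "42", "17"])
print(best)  # -> "42"
```

Best-of-N differs only in the selection rule: instead of counting, an external verifier scores each candidate and the highest-scoring one is returned.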

Directed-MAML: Meta Reinforcement Learning Algorithm with Task-directed Approximation

arXiv:2510.00212v1 Announce Type: new Abstract: Model-Agnostic Meta-Learning (MAML) is a versatile meta-learning framework applicable to both supervised learning and reinforcement learning (RL). However, applying MAML to meta-reinforcement learning (meta-RL) presents notable challenges. First, MAML relies on second-order gradient computations, leading…
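The second-order gradient cost the abstract points to can be seen in a tiny analytic example: with a quadratic per-task loss, MAML's outer gradient must differentiate through the inner update, which introduces an extra factor that first-order MAML (FOMAML) drops. A minimal sketch under that assumption (scalar parameter, hand-derived gradients; not Directed-MAML's approximation):

```python
import numpy as np

alpha, beta = 0.1, 0.05             # inner / outer learning rates (illustrative)
tasks = np.array([-1.0, 0.0, 2.0])  # each task: minimize (theta - c)^2

theta = 5.0
for _ in range(200):
    meta_grad = 0.0
    for c in tasks:
        g_inner = 2 * (theta - c)              # inner-loop gradient
        theta_prime = theta - alpha * g_inner  # one adaptation step
        # Chain rule through the inner step: d(theta_prime)/d(theta)
        # = 1 - 2*alpha. This is the second-order term MAML needs;
        # FOMAML would replace it with 1.
        meta_grad += 2 * (theta_prime - c) * (1 - 2 * alpha)
    theta -= beta * meta_grad / len(tasks)

print(round(theta, 3))  # converges toward the mean of the task optima (1/3)
```

In deep networks this chain-rule factor becomes a Hessian-vector product per task, which is the computational burden that motivates first-order and task-directed approximations.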

Combating Noisy Labels via Dynamic Connection Masking

arXiv:2508.09697v2 Announce Type: replace Abstract: Noisy labels are inevitable in real-world scenarios. Due to the strong capacity of deep neural networks to memorize corrupted labels, these noisy labels can cause significant performance degradation. Existing research on mitigating the negative effects…
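The memorization effect the abstract describes, where networks fit clean labels early and corrupted ones later, underlies a common small-loss baseline for noisy-label training. A minimal sketch of that baseline (explicitly not the paper's Dynamic Connection Masking, whose mechanism the truncated abstract does not describe):

```python
import numpy as np

def small_loss_mask(losses, noise_rate):
    """Small-loss selection, a standard noisy-label baseline: keep the
    (1 - noise_rate) fraction of samples with the smallest loss, on the
    assumption that high-loss samples are likely mislabeled."""
    losses = np.asarray(losses, dtype=float)
    n_keep = int(np.ceil((1.0 - noise_rate) * len(losses)))
    keep = np.argsort(losses)[:n_keep]
    mask = np.zeros(len(losses), dtype=bool)
    mask[keep] = True
    return mask

# Per-sample losses; the two large ones are likely mislabeled
mask = small_loss_mask([0.1, 2.5, 0.2, 3.0, 0.15], noise_rate=0.4)
print(mask)  # high-loss samples are excluded from the gradient update
```

The mask would typically be applied per mini-batch, zeroing out the loss contribution of the suspected-noisy samples before backpropagation.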