Archives AI News

DualTune: Decoupled Fine-Tuning for On-Device Agentic Systems

arXiv:2510.00229v1. Announce type: new. Abstract: The deployment of Large Language Models (LLMs) as agentic orchestrators has revolutionized task automation, but the need for privacy-preserving, cost-effective solutions demands on-device inference capabilities. However, local LLMs consistently underperform compared to frontier models in…

EFRame: Deeper Reasoning via Exploration-Filter-Replay Reinforcement Learning Framework

arXiv:2506.22200v4. Announce type: replace-cross. Abstract: Recent advances in reinforcement learning (RL) have significantly enhanced the reasoning capabilities of large language models (LLMs). Group Relative Policy Optimization (GRPO), a lightweight variant of Proximal Policy Optimization (PPO), improves efficiency but suffers from…

ICL Optimized Fragility

arXiv:2510.00300v1. Announce type: new. Abstract: ICL guides are known to improve task-specific performance, but their impact on cross-domain cognitive abilities remains unexplored. This study examines how ICL guides affect reasoning across different knowledge domains using six variants of the GPT-OSS:20b…