AI News Archives

ICL-Router: In-Context Learned Model Representations for LLM Routing

arXiv:2510.09719v2 Announce Type: replace Abstract: Large language models (LLMs) often exhibit complementary strengths. Model routing harnesses these strengths by dynamically directing each query to the most suitable model, given a candidate model pool. However, routing performance relies on accurate model…
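
The routing setup the abstract describes, scoring each candidate model against an incoming query and dispatching to the best match, can be illustrated with a minimal sketch. The fixed model representations, the toy query embedder, and the dot-product score below are hypothetical placeholders, not ICL-Router's in-context learned representations.

```python
import numpy as np

# Hypothetical per-model representations; a stand-in for whatever
# learned profile a router holds for each candidate in the pool.
MODEL_EMBEDDINGS = {
    "model-a": np.array([0.9, 0.1, 0.0]),  # strong on code
    "model-b": np.array([0.1, 0.8, 0.1]),  # strong on math
    "model-c": np.array([0.2, 0.2, 0.6]),  # strong on writing
}

def embed_query(query: str) -> np.ndarray:
    """Placeholder query embedding; a real router would use a learned encoder."""
    rng = np.random.default_rng(abs(hash(query)) % (2**32))
    v = rng.normal(size=3)
    return v / np.linalg.norm(v)

def route(query: str) -> str:
    """Dispatch the query to the candidate whose representation scores highest."""
    q = embed_query(query)
    scores = {name: float(q @ emb) for name, emb in MODEL_EMBEDDINGS.items()}
    return max(scores, key=scores.get)

print(route("Prove that sqrt(2) is irrational."))
```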

Efficient Restarts in Non-Stationary Model-Free Reinforcement Learning

arXiv:2510.11933v1 Announce Type: new Abstract: In this work, we propose three efficient restart paradigms for model-free non-stationary reinforcement learning (RL). We identify two core issues with the restart design of Mao et al. (2022)’s RestartQ-UCB algorithm: (1) complete forgetting, where…
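
A restart in this setting wipes learned value estimates so the agent can track a drifting environment; the abstract flags the full reset ("complete forgetting") as one of the problems with the prior design. The sketch below shows only that plain periodic full-restart baseline. The tabular setting, the restart interval, and the `env_step` interface are illustrative assumptions, not the paper's proposed paradigms.

```python
import numpy as np

def q_learning_with_restarts(env_step, n_states, n_actions,
                             horizon=100_000, restart_every=10_000,
                             alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning with periodic full restarts (the 'complete
    forgetting' baseline): Q is zeroed on a fixed schedule so stale
    estimates cannot dominate after the environment drifts."""
    rng = np.random.default_rng(0)
    Q = np.zeros((n_states, n_actions))
    s = 0
    for t in range(horizon):
        if t > 0 and t % restart_every == 0:
            Q[:] = 0.0  # full reset; a softer reset would retain some knowledge
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s2, r = env_step(s, a)  # assumed environment interface: (next state, reward)
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2
    return Q
```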

LLMBridge: Reducing Costs in a Prompt-Centric Internet

arXiv:2410.11857v2 Announce Type: replace-cross Abstract: Today’s Internet infrastructure is centered around content retrieval over HTTP, with middleboxes (e.g., HTTP proxies) playing a crucial role in performance, security, and cost-effectiveness. We envision a future where Internet communication will be dominated by…
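
One way a prompt-centric middlebox cuts cost is by caching responses to repeated prompts, analogous to an HTTP proxy caching content. The sketch below is a bare exact-match cache in front of a hypothetical `call_backend` function; LLMBridge's actual optimizations are not specified in the excerpt above.

```python
import hashlib

class PromptCacheProxy:
    """Minimal caching middlebox for prompt traffic: identical prompts
    hit the cache instead of the (billed) backend model."""

    def __init__(self, call_backend):
        self.call_backend = call_backend  # assumed: str -> str, paid per call
        self.cache = {}
        self.hits = 0

    def handle(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        response = self.call_backend(prompt)
        self.cache[key] = response
        return response

proxy = PromptCacheProxy(lambda p: f"echo: {p}")  # stand-in backend
proxy.handle("What is HTTP?")
proxy.handle("What is HTTP?")  # served from cache; no backend cost
print(proxy.hits)  # 1
```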

On efficiently computable functions, deep networks and sparse compositionality

arXiv:2510.11942v1 Announce Type: new Abstract: We show that efficient Turing computability at any fixed input/output precision implies the existence of compositionally sparse (bounded-fan-in, polynomial-size) DAG representations and of corresponding neural approximants achieving the target precision. Concretely: if $f:[0,1]^d \to \mathbb{R}^m$ is computable…
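
The claimed implication can be written out schematically. The complexity-class phrasing, the constants, and the uniform-error formulation below are a reading of the excerpt, not the paper's exact theorem statement.

```latex
% Schematic form of the claimed implication (a reading of the abstract,
% not the paper's theorem). If f : [0,1]^d \to \mathbb{R}^m is computable
% to precision \varepsilon in time poly(d, 1/\varepsilon), then there is a
% bounded-fan-in DAG G of size poly(d, 1/\varepsilon) whose node functions
% compose to an approximant \hat{f}_G, realizable by a neural network of
% matching size.
\[
  f \in \mathrm{TIME}\bigl(\mathrm{poly}(d, 1/\varepsilon)\bigr)
  \;\Longrightarrow\;
  \exists\, \hat{f}_G :\;
  \sup_{x \in [0,1]^d} \bigl\| f(x) - \hat{f}_G(x) \bigr\| \le \varepsilon,
  \quad |G| = \mathrm{poly}(d, 1/\varepsilon),
  \quad \text{fan-in}(G) = O(1).
\]
```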

Inverse Design in Nanophotonics via Representation Learning

arXiv:2507.00546v2 Announce Type: replace-cross Abstract: Inverse design in nanophotonics, the computational discovery of structures achieving targeted electromagnetic (EM) responses, has become a key tool for recent optical advances. Traditional intuition-driven or iterative optimization methods struggle with the inherently high-dimensional, non-convex…
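
A common representation-learning pattern for inverse design is to train an autoencoder over structures and then search in the low-dimensional latent space instead of the high-dimensional, non-convex structure space. The sketch below shows that loop with a hypothetical decoder and a toy surrogate `em_response`; it is a generic pattern, not the specific methods surveyed in the paper.

```python
import numpy as np

def decode(z):
    """Hypothetical decoder: latent vector -> structure parameters."""
    return np.tanh(z)

def em_response(structure):
    """Hypothetical scalar surrogate for the EM solver (toy: peaks at 0.5)."""
    return -np.sum((structure - 0.5) ** 2)

def latent_inverse_design(z_dim=8, steps=200, lr=0.1, h=1e-4):
    """Gradient ascent on the response, taken in latent space via
    finite differences, sidestepping the raw structure space."""
    rng = np.random.default_rng(0)
    z = rng.normal(size=z_dim)
    for _ in range(steps):
        grad = np.zeros_like(z)
        for i in range(z_dim):  # finite-difference gradient w.r.t. z
            dz = np.zeros_like(z)
            dz[i] = h
            grad[i] = (em_response(decode(z + dz)) - em_response(decode(z - dz))) / (2 * h)
        z += lr * grad
    return decode(z)

print(latent_inverse_design().round(3))  # converges toward the target structure
```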

Sculpting Latent Spaces With MMD: Disentanglement With Programmable Priors

arXiv:2510.11953v1 Announce Type: new Abstract: Learning disentangled representations, where distinct factors of variation are captured by independent latent variables, is a central goal in machine learning. The dominant approach has been the Variational Autoencoder (VAE) framework, which uses a Kullback-Leibler…
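
The MMD alternative to the VAE's Kullback-Leibler term is a kernel two-sample statistic between encoded samples and samples drawn from a chosen ("programmable") prior. A minimal RBF-kernel version is sketched below; the kernel, bandwidth, and Gaussian prior are illustrative choices, not the paper's.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    """RBF kernel matrix between two batches of latent codes."""
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def mmd2(z, prior_samples, bandwidth=1.0):
    """Squared MMD (biased estimator) between encoder outputs z and prior
    samples; driving it to zero matches the aggregate posterior to the
    prior without a per-sample KL term."""
    k_zz = rbf_kernel(z, z, bandwidth)
    k_pp = rbf_kernel(prior_samples, prior_samples, bandwidth)
    k_zp = rbf_kernel(z, prior_samples, bandwidth)
    return k_zz.mean() + k_pp.mean() - 2 * k_zp.mean()

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 4))       # pretend encoder outputs
prior = rng.normal(size=(256, 4))   # a Gaussian "programmable" prior
print(mmd2(z, prior))               # near 0: the two distributions match
```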

Y-shaped Generative Flows

arXiv:2510.11955v1 Announce Type: new Abstract: Modern continuous-time generative models often induce V-shaped transport: each sample travels independently along nearly straight trajectories from prior to data, overlooking shared structure. We introduce Y-shaped generative flows, which move probability mass together along shared…
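
The "V-shaped transport" that the paper contrasts against can be made concrete: in standard flow-matching-style training, each sample follows its own straight line from a prior draw to a data point, with no shared trunk. The sketch below shows only that baseline interpolant; the Y-shaped construction itself is not reproduced here.

```python
import numpy as np

def straight_line_pairs(prior_batch, data_batch, t):
    """Baseline 'V-shaped' transport: each (x0, x1) pair is interpolated
    independently along x_t = (1 - t) * x0 + t * x1. Trajectories never
    merge, so structure shared across samples goes unused."""
    return (1.0 - t) * prior_batch + t * data_batch

rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 2))        # prior samples
x1 = rng.normal(size=(4, 2)) + 5.0  # data samples (shifted mode)
for t in (0.0, 0.5, 1.0):
    print(t, straight_line_pairs(x0, x1, t)[0])
```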

LayerSync: Self-aligning Intermediate Layers

arXiv:2510.12581v1 Announce Type: cross Abstract: We propose LayerSync, a domain-agnostic approach for improving the generation quality and the training efficiency of diffusion models. Prior studies have highlighted the connection between the quality of generation and the representations learned by diffusion…
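
Aligning a model's intermediate layers typically means adding a regularizer that pulls one layer's representation toward another's during training. The sketch below shows a generic cosine-similarity alignment term between two hidden activations; the choice of layers, the lack of a projection head, and the weighting are hypothetical, not LayerSync's actual formulation.

```python
import numpy as np

def alignment_loss(h_early, h_late):
    """Generic self-alignment regularizer: 1 - mean cosine similarity
    between an earlier and a later intermediate representation.
    Minimizing it pulls the two layers' features toward each other."""
    a = h_early / np.linalg.norm(h_early, axis=-1, keepdims=True)
    b = h_late / np.linalg.norm(h_late, axis=-1, keepdims=True)
    return 1.0 - float((a * b).sum(-1).mean())

rng = np.random.default_rng(0)
h4 = rng.normal(size=(16, 128))             # activations from a middle block
h8 = h4 + 0.1 * rng.normal(size=(16, 128))  # later block, nearly aligned
print(alignment_loss(h4, h8))               # small: the layers already agree
```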