Archives AI News

SIGMA: Scalable Spectral Insights for LLM Collapse

arXiv:2601.03385v1 Announce Type: new Abstract: The rapid adoption of synthetic data for training Large Language Models (LLMs) has introduced the technical challenge of “model collapse”, a degenerative process where recursive training on model-generated content leads to a contraction of distributional variance…

Intrinsic-Metric Physics-Informed Neural Networks (IM-PINN) for Reaction-Diffusion Dynamics on Complex Riemannian Manifolds

arXiv:2601.00834v2 Announce Type: replace Abstract: Simulating nonlinear reaction-diffusion dynamics on complex, non-Euclidean manifolds remains a fundamental challenge in computational morphogenesis, constrained by high-fidelity mesh generation costs and symplectic drift in discrete time-stepping schemes. This study introduces the Intrinsic-Metric Physics-Informed Neural…

A Novel Convolution and Attention Mechanism-based Model for 6D Object Pose Estimation

arXiv:2501.01993v2 Announce Type: replace-cross Abstract: This paper proposes PoseLecTr, a graph-based encoder-decoder framework that integrates a novel Legendre convolution with attention mechanisms for six-degree-of-freedom (6-DOF) object pose estimation from monocular RGB images. Conventional learning-based approaches predominantly rely on grid-structured convolutions,…

Sensor to Pixels: Decentralized Swarm Gathering via Image-Based Reinforcement Learning

arXiv:2601.03413v1 Announce Type: new Abstract: This study highlights the potential of image-based reinforcement learning methods for addressing swarm-related tasks. In multi-agent reinforcement learning, effective policy learning depends on how agents sense, interpret, and process inputs. Traditional approaches often rely on…

Jailbreaking LLMs Without Gradients or Priors: Effective and Transferable Attacks

arXiv:2601.03420v1 Announce Type: new Abstract: As Large Language Models (LLMs) are increasingly deployed in safety-critical domains, rigorously evaluating their robustness against adversarial jailbreaks is essential. However, current safety evaluations often overestimate robustness because existing automated attacks are limited by restrictive…

Compact Example-Based Explanations for Language Models

arXiv:2601.03786v1 Announce Type: cross Abstract: Training data influence estimation methods quantify the contribution of training documents to a model’s output, making them a promising source of information for example-based explanations. As humans cannot interpret thousands of documents, only a small…

Spectral Archaeology: The Causal Topology of Model Evolution

arXiv:2601.03424v1 Announce Type: new Abstract: Behavioral benchmarks tell us what a model does, but not how. We introduce a training-free mechanistic probe using attention-graph spectra. Treating each layer as a token graph, we compute algebraic connectivity ($\lambda_2$), smoothness, and spectral…

Current Agents Fail to Leverage World Model as Tool for Foresight

arXiv:2601.03905v1 Announce Type: cross Abstract: Agents built on vision-language models increasingly face tasks that demand anticipating future states rather than relying on short-horizon reasoning. Generative world models offer a promising remedy: agents could use them as external simulators to foresee…