Archives AI News

A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation

arXiv:2505.20172v2 Announce Type: replace Abstract: We study the dynamics of gradient flow with small weight decay on general training losses $F: \mathbb{R}^d \to \mathbb{R}$. Under mild regularity assumptions and assuming convergence of the unregularised gradient flow, we show that the…

November 6, 2025

Discrete Bayesian Sample Inference for Graph Generation

arXiv:2511.03015v1 Announce Type: new Abstract: Generating graph-structured data is crucial in applications such as molecular generation, knowledge graphs, and network analysis. However, their discrete, unordered nature makes them difficult for traditional generative models, leading to the rise of discrete diffusion…

November 6, 2025

Diagrams-to-Dynamics (D2D): Exploring Causal Loop Diagram Leverage Points under Uncertainty

arXiv:2508.05659v3 Announce Type: replace Abstract: Causal loop diagrams (CLDs) are widely used in health and environmental research to represent hypothesized causal structures underlying complex problems. However, as qualitative and static representations, CLDs are limited in their ability to support dynamic…

November 6, 2025

Adaptive-Sensorless Monitoring of Shipping Containers

arXiv:2511.03022v1 Announce Type: new Abstract: Monitoring the internal temperature and humidity of shipping containers is essential to preventing quality degradation during cargo transportation. Sensorless monitoring — machine learning models that predict the internal conditions of the containers using exogenous factors…

November 6, 2025

DE3S: Dual-Enhanced Soft-Sparse-Shape Learning for Medical Early Time-Series Classification

arXiv:2510.12214v2 Announce Type: replace Abstract: Early Time Series Classification (ETSC) is critical in time-sensitive medical applications such as sepsis, yet it presents an inherent trade-off between accuracy and earliness. This trade-off arises from two core challenges: 1) models should effectively…

November 6, 2025

Leveraging Discrete Function Decomposability for Scientific Design

arXiv:2511.03032v1 Announce Type: new Abstract: In the era of AI-driven science and engineering, we often want to design discrete objects in silico according to user-specified properties. For example, we may wish to design a protein to bind its target, arrange…

November 6, 2025

Variable Selection in Maximum Mean Discrepancy for Interpretable Distribution Comparison

arXiv:2311.01537v2 Announce Type: replace-cross Abstract: We study two-sample variable selection: identifying variables that discriminate between the distributions of two sets of data vectors. Such variables help scientists understand the mechanisms behind dataset discrepancies. Although domain-specific methods exist (e.g., in medical…

November 6, 2025

Data-Efficient Realized Volatility Forecasting with Vision Transformers

arXiv:2511.03046v1 Announce Type: new Abstract: Recent work in financial machine learning has shown the virtue of complexity: the phenomenon by which deep learning methods capable of learning highly nonlinear relationships outperform simpler approaches in financial forecasting. While transformer architectures like…

November 6, 2025

Unsupervised Evaluation of Multi-Turn Objective-Driven Interactions

arXiv:2511.03047v1 Announce Type: new Abstract: Large language models (LLMs) have seen increasing popularity in enterprise applications where AI agents and humans engage in objective-driven interactions. However, these systems are difficult to evaluate: data may be complex and unlabeled; human annotation…

November 6, 2025

VoiceAgentBench: Are Voice Assistants ready for agentic tasks?

arXiv:2510.07978v2 Announce Type: replace-cross Abstract: Large-scale Speech Language Models (SpeechLMs) have enabled voice assistants capable of understanding natural spoken queries and performing complex tasks. However, existing speech benchmarks primarily focus on isolated capabilities such as transcription or question answering, and do…

November 6, 2025