The Effect of Attention Head Count on Transformer Approximation
arXiv:2510.06662v1 Announce Type: cross Abstract: The Transformer has become the dominant architecture for sequence modeling, yet a detailed understanding of how its structural parameters influence expressive power remains limited. In this work, we study the approximation properties of transformers, with particular…
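To make concrete what "attention head count" means as a structural parameter, here is a minimal NumPy sketch of multi-head self-attention. Everything in it (the function name, the random stand-in weights, the dimensions) is illustrative and not taken from the paper; it only shows how the head count splits the model dimension into per-head subspaces whose outputs are concatenated back together.

```python
import numpy as np


def multi_head_attention(X, num_heads, seed=0):
    """Toy multi-head self-attention (no masking, no learned parameters).

    X: array of shape (seq_len, d_model); num_heads must divide d_model.
    The projection matrices are random stand-ins for learned weights.
    """
    rng = np.random.default_rng(seed)
    seq_len, d_model = X.shape
    assert d_model % num_heads == 0, "head count must divide the model dimension"
    d_head = d_model // num_heads  # each head attends in a smaller subspace

    outputs = []
    for _ in range(num_heads):
        # Per-head query/key/value projections into the d_head subspace.
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) for _ in range(3))
        Q, K, V = X @ Wq, X @ Wk, X @ Wv

        # Scaled dot-product attention with a row-wise softmax.
        scores = Q @ K.T / np.sqrt(d_head)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        outputs.append(weights @ V)

    # Concatenating the num_heads outputs restores the model dimension.
    return np.concatenate(outputs, axis=-1)
```

Varying `num_heads` (while holding `d_model` fixed) changes only how the representation space is partitioned, not the output shape, which is exactly the kind of structural knob whose effect on approximation power the abstract describes.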
