Archives AI News

Towards Quantifying the Hessian Structure of Neural Networks

arXiv:2505.02809v2. Announce Type: replace. Abstract: Empirical studies reported that the Hessian matrix of neural networks (NNs) exhibits a near-block-diagonal structure, yet its theoretical foundation remains unclear. In this work, we reveal that the reported Hessian structure comes from a mixture…

Highly Imbalanced Regression with Tabular Data in SEP and Other Applications

arXiv:2509.16339v1. Announce Type: new. Abstract: We investigate imbalanced regression with tabular data that have an imbalance ratio larger than 1,000 (“highly imbalanced”). Accurately estimating the target values of rare instances is important in applications such as forecasting the intensity of…

Robust LLM Training Infrastructure at ByteDance

arXiv:2509.16293v1. Announce Type: new. Abstract: The training scale of large language models (LLMs) has reached tens of thousands of GPUs and is still continuously expanding, enabling faster learning of larger models. Accompanying the expansion of the resource scale is the…

Architectural change in neural networks using fuzzy vertex pooling

arXiv:2509.16287v1. Announce Type: new. Abstract: The process of pooling vertices involves the creation of a new vertex, which becomes adjacent to all the vertices that were originally adjacent to the endpoints of the vertices being pooled. After this, the endpoints…
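
To make the pooling step described in the abstract above concrete, here is a minimal sketch on an ordinary (non-fuzzy) undirected graph stored as an adjacency dictionary. It is not the paper's implementation. The function name `pool_vertices`, the label of the new vertex, and the choice to drop the two pooled vertices afterwards are all assumptions made for illustration, since the abstract is truncated at that point and the fuzzy membership values are not covered here.

```python
def pool_vertices(adj, u, v, new):
    """Return a new adjacency dict in which u and v are pooled into `new`.

    adj maps each vertex to the set of its neighbours (undirected graph).
    Hypothetical helper for illustration only; not taken from the paper.
    """
    if u not in adj or v not in adj:
        raise KeyError("both pooled vertices must exist in the graph")
    if new in adj:
        raise ValueError("label for the new vertex is already in use")

    # The new vertex is adjacent to everything that was adjacent to u or v.
    merged = (adj[u] | adj[v]) - {u, v}

    pooled = {}
    for w, neighbours in adj.items():
        if w in (u, v):
            continue  # assumption: the pooled vertices are removed
        kept = neighbours - {u, v}
        if w in merged:
            kept = kept | {new}
        pooled[w] = kept
    pooled[new] = set(merged)
    return pooled


# Small example: path c - a - b - d, pooling a and b.
graph = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a"},
    "d": {"b"},
}
result = pool_vertices(graph, "a", "b", "ab")
```

In this example the new vertex `ab` ends up adjacent to `c` and `d`, and the input dictionary is left unmodified.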