Archives AI News

Nonlinear Optimization with GPU-Accelerated Neural Network Constraints

arXiv:2509.22462v2 Announce Type: replace Abstract: We propose a reduced-space formulation for optimizing over trained neural networks where the network’s outputs and derivatives are evaluated on a GPU. To do this, we treat the neural network as a “gray box” where…

ReJump: A Tree-Jump Representation for Analyzing and Improving LLM Reasoning

arXiv:2512.00831v2 Announce Type: replace Abstract: Large Reasoning Models (LRMs) are Large Language Models (LLMs) explicitly trained to generate long-form Chain-of-Thoughts (CoTs), achieving impressive success on challenging tasks like math and programming. However, their underlying reasoning “algorithms” remain poorly understood. To…

Generative Learning of Heterogeneous Tail Dependence

arXiv:2011.13132v3 Announce Type: replace Abstract: We propose a multivariate generative model to capture the complex dependence structure often encountered in business and financial data. Our model features heterogeneous and asymmetric tail dependence between all pairs of individual dimensions while also…

Representation Retrieval Learning for Heterogeneous Data Integration

arXiv:2503.09494v3 Announce Type: replace Abstract: In the era of big data, large-scale, multi-source, multi-modality datasets are increasingly ubiquitous, offering unprecedented opportunities for predictive modeling and scientific discovery. However, these datasets often exhibit complex heterogeneity, such as covariate shift, posterior drift,…

LAPA: Log-Domain Prediction-Driven Dynamic Sparsity Accelerator for Transformer Model

arXiv:2512.07855v1 Announce Type: new Abstract: Attention-based Transformers have revolutionized natural language processing (NLP) and shown strong performance in computer vision (CV) tasks. However, as the input sequence varies, the computational bottlenecks in Transformer models exhibit dynamic behavior across stages, which…

Medical Test-free Disease Detection Based on Big Data

arXiv:2512.07856v1 Announce Type: new Abstract: Accurate disease detection is of paramount importance for effective medical treatment and patient care. However, the process of disease detection is often associated with extensive medical testing and considerable costs, making it impractical to perform…

GPU Memory Prediction for Multimodal Model Training

arXiv:2512.07853v1 Announce Type: new Abstract: As deep learning models in agentic AI systems grow in scale and complexity, GPU memory requirements increase and often exceed the available GPU memory capacity, so that out-of-memory (OoM) errors occur. It is well known…

HSTMixer: A Hierarchical MLP-Mixer for Large-Scale Traffic Forecasting

arXiv:2512.07854v1 Announce Type: new Abstract: Traffic forecasting is significant to modern urban management. Recently, there has been growing attention on large-scale forecasting, as it better reflects the complexity of real-world traffic networks. However, existing models often exhibit quadratic computational complexity,…

SABER: Small Actions, Big Errors – Safeguarding Mutating Steps in LLM Agents

arXiv:2512.07850v1 Announce Type: new Abstract: Despite rapid progress in LLM agents, performance on long-horizon, tool-using tasks remains fragile. To better understand this fragility, we ask a simple question: do all actions contribute equally to failure? Analyzing execution traces on τ-Bench…