Archives AI News

Beyond Sharp Minima: Robust LLM Unlearning via Feedback-Guided Multi-Point Optimization

arXiv:2509.20230v3 Announce Type: replace Abstract: Current LLM unlearning methods face a critical security vulnerability that undermines their fundamental purpose: while they appear to successfully remove sensitive or harmful knowledge, this “forgotten” information remains precariously recoverable through relearning attacks. We identify…

PALADIN: Self-Correcting Language Model Agents to Cure Tool-Failure Cases

arXiv:2509.25238v1 Announce Type: new Abstract: Tool-augmented language agents frequently fail in real-world deployment due to tool malfunctions (timeouts, API exceptions, or inconsistent outputs), which trigger cascading reasoning errors and task abandonment. Existing agent training pipelines optimize only for success trajectories, failing to expose…

IMPACT: Importance-Aware Activation Space Reconstruction

arXiv:2507.03828v2 Announce Type: replace-cross Abstract: Large language models (LLMs) achieve strong performance across many domains but are difficult to deploy in resource-constrained settings due to their size. Low-rank weight matrix compression is a popular strategy for reducing model size, typically…

Asymptotic Classification Error for Heavy-Tailed Renewal Processes

arXiv:2408.10502v2 Announce Type: replace Abstract: Despite the widespread occurrence of classification problems and the increasing collection of point process data across many disciplines, the study of error probability for point process classification emerged only very recently. Here, we consider classification of…

Sharpness of Minima in Deep Matrix Factorization: Exact Expressions

arXiv:2509.25783v1 Announce Type: new Abstract: Understanding the geometry of the loss landscape near a minimum is key to explaining the implicit bias of gradient-based methods in non-convex optimization problems such as deep neural network training and deep matrix factorization. A…

Fair Classification by Direct Intervention on Operating Characteristics

arXiv:2509.25481v1 Announce Type: new Abstract: We develop new classifiers under group fairness in the attribute-aware setting for binary classification with multiple group fairness constraints (e.g., demographic parity (DP), equalized odds (EO), and predictive parity (PP)). We propose a novel approach,…

Test time training enhances in-context learning of nonlinear functions

arXiv:2509.25741v1 Announce Type: new Abstract: Test-time training (TTT) enhances model performance by explicitly updating designated parameters prior to each prediction to adapt to the test data. While TTT has demonstrated considerable empirical success, its theoretical underpinnings remain limited, particularly for…