Archives AI News

Thoughtbubbles: an Unsupervised Method for Parallel Thinking in Latent Space

arXiv:2510.00219v1 Announce Type: new Abstract: Current approaches for scaling inference-time compute in transformers rely on training them to emit explicit chain-of-thought tokens before producing an answer. While these methods are powerful, they are limited because they cannot be applied during…

October 2, 2025

The Pitfalls of KV Cache Compression

arXiv:2510.00231v1 Announce Type: new Abstract: KV cache compression promises increased throughput and efficiency with negligible loss in performance. While the gains in throughput are indisputable and recent literature has indeed shown minimal degradation on particular benchmarks, in general the consequences…

October 2, 2025

UTrace: Poisoning Forensics for Private Collaborative Learning

arXiv:2409.15126v3 Announce Type: replace-cross Abstract: Privacy-preserving machine learning (PPML) systems enable multiple data owners to collaboratively train models without revealing their raw, sensitive data by leveraging cryptographic protocols such as secure multi-party computation (MPC). While PPML offers strong privacy guarantees,…

October 2, 2025

Differentiable Autoencoding Neural Operator for Interpretable and Integrable Latent Space Modeling

arXiv:2510.00233v1 Announce Type: new Abstract: Scientific machine learning has enabled the extraction of physical insights from high-dimensional spatiotemporal flow data using linear and nonlinear dimensionality reduction techniques. Despite these advances, achieving interpretability within the latent space remains a challenge. To…

October 2, 2025

Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing

arXiv:2503.11895v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) are widely deployed in downstream tasks, but keeping their knowledge up-to-date via retraining or fine-tuning is often computationally expensive. Model editing provides a more efficient alternative by updating a targeted subset…

October 2, 2025

Per-example gradients: a new frontier for understanding and improving optimizers

arXiv:2510.00236v1 Announce Type: new Abstract: Training algorithms in deep learning usually treat a mini-batch of samples as a single object; they average gradients over the mini-batch, and then process the average in various ways. Computing other statistics beyond the average…

October 2, 2025

Are All Marine Species Created Equal? Performance Disparities in Underwater Object Detection

arXiv:2508.18729v2 Announce Type: replace-cross Abstract: Underwater object detection is critical for monitoring marine ecosystems but poses unique challenges, including degraded image quality, imbalanced class distribution, and distinct visual characteristics. Not every species is detected equally well, yet underlying causes remain…

October 2, 2025

Debunk the Myth of SFT Generalization

arXiv:2510.00237v1 Announce Type: new Abstract: A prevailing view holds that supervised fine-tuning (SFT) memorizes training data and fails to generalize, whereas reinforcement learning (RL) attains broader robustness. We revisit this claim through a systematic evaluation on two decision-making benchmarks, Sokoban…

October 2, 2025

Learning Inter-Atomic Potentials without Explicit Equivariance

arXiv:2510.00027v1 Announce Type: new Abstract: Accurate and scalable machine-learned inter-atomic potentials (MLIPs) are essential for molecular simulations ranging from drug discovery to new material design. Current state-of-the-art models enforce roto-translational symmetries through equivariant neural network architectures, a hard-wired inductive bias…

October 2, 2025

Reward driven discovery of the optimal microstructure representations with invariant variational autoencoders

arXiv:2510.00243v1 Announce Type: new Abstract: Microscopy techniques generate vast amounts of complex image data that in principle can be used to discover simpler, interpretable, and parsimonious forms to reveal the underlying physical structures, such as elementary building blocks in molecular…

October 2, 2025