Archives AI News

XConv: Low-memory stochastic backpropagation for convolutional layers

arXiv:2106.06998v3 Announce Type: replace Abstract: Training convolutional neural networks at scale demands substantial memory, largely due to storing intermediate activations for backpropagation. Existing approaches — such as checkpointing, invertible architectures, or gradient approximation methods like randomized automatic differentiation — either…
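The checkpointing baseline mentioned above trades memory for recomputation: only every k-th activation is stored on the forward pass, and each segment is recomputed during the backward pass. A minimal pure-Python sketch of that idea for a chain of scalar layers (the names `layers` and `grad_checkpointed` are illustrative, not from the paper):

```python
def forward(x, layers):
    """Plain forward pass through a chain of (function, derivative) pairs."""
    for f, _ in layers:
        x = f(x)
    return x

def grad_checkpointed(x, layers, k):
    """Backprop that stores only every k-th activation; each segment is
    recomputed from its checkpoint on the backward pass instead of being
    kept in memory (the core idea of activation checkpointing)."""
    # Forward: keep only checkpoint activations.
    checkpoints = [(0, x)]
    a = x
    for i, (f, _) in enumerate(layers):
        a = f(a)
        if (i + 1) % k == 0 and i + 1 < len(layers):
            checkpoints.append((i + 1, a))
    # Backward: walk segments last-to-first, recomputing inside each one.
    g = 1.0  # d(output)/d(output)
    for ci in range(len(checkpoints) - 1, -1, -1):
        start, a0 = checkpoints[ci]
        end = checkpoints[ci + 1][0] if ci + 1 < len(checkpoints) else len(layers)
        # Recompute the activations of this segment from its checkpoint.
        acts = [a0]
        for f, _ in layers[start:end]:
            acts.append(f(acts[-1]))
        # Chain rule backwards through the segment.
        for j in range(end - 1, start - 1, -1):
            _, df = layers[j]
            g *= df(acts[j - start])
    return g
```

With k equal to the chain length this stores almost nothing and recomputes everything; with k = 1 it degenerates to ordinary backprop with full activation storage.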

Quantifying Memorization and Privacy Risks in Genomic Language Models

arXiv:2603.08913v1 Announce Type: new Abstract: Genomic language models (GLMs) have emerged as powerful tools for learning representations of DNA sequences, enabling advances in variant prediction, regulatory element identification, and cross-task transfer learning. However, as these models are increasingly trained or…
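A common way to quantify memorization in sequence models is a loss-gap probe: sequences seen in training tend to get lower loss than held-out ones, and the gap is a simple membership-inference signal. A toy sketch of that probe using a smoothed bigram model over DNA strings (illustrative only; this is not the paper's protocol, and `memorization_gap` is a hypothetical name):

```python
import math
from collections import Counter

def fit_bigram(train_seqs, alphabet="ACGT"):
    """Fit an add-one-smoothed bigram model on the training sequences."""
    counts = Counter()
    for s in train_seqs:
        for a, b in zip(s, s[1:]):
            counts[(a, b)] += 1
    def logprob(a, b):
        num = counts[(a, b)] + 1
        den = sum(counts[(a, c)] for c in alphabet) + len(alphabet)
        return math.log(num / den)
    return logprob

def avg_nll(seq, logprob):
    """Average negative log-likelihood of a sequence under the model."""
    pairs = list(zip(seq, seq[1:]))
    return -sum(logprob(a, b) for a, b in pairs) / len(pairs)

def memorization_gap(train_seqs, heldout_seqs):
    """Held-out loss minus training loss; a positive gap suggests the
    model assigns seen sequences unusually low loss (memorization)."""
    lp = fit_bigram(train_seqs)
    train_loss = sum(avg_nll(s, lp) for s in train_seqs) / len(train_seqs)
    held_loss = sum(avg_nll(s, lp) for s in heldout_seqs) / len(heldout_seqs)
    return held_loss - train_loss
```

For a real GLM the bigram model would be replaced by the model's own per-token loss, but the gap statistic works the same way.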

The Gaussian-Multinoulli Restricted Boltzmann Machine: A Potts Model Extension of the GRBM

arXiv:2505.11635v2 Announce Type: replace Abstract: Many real-world tasks, from associative memory to symbolic reasoning, benefit from discrete, structured representations that standard continuous latent models can struggle to express. We introduce the Gaussian-Multinoulli Restricted Boltzmann Machine (GM-RBM), a generative energy-based model…
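In a Gaussian RBM the visible units are real-valued with a quadratic energy term; the Potts extension described here replaces binary hidden units with one-hot categorical ("multinoulli") groups. A hedged sketch of what such an energy function can look like, following the standard GRBM parameterization (the paper's exact form may differ):

```python
def gm_rbm_energy(v, h, b, sigma, c, W):
    """Energy of a Gaussian-visible, multinoulli-hidden RBM (sketch).

    v     : real-valued visible vector
    h     : list of one-hot category vectors, one per hidden group
    b, sigma : visible means and standard deviations
    c     : per-group category biases, c[j][k]
    W     : couplings, W[i][j][k] links visible i to category k of group j
    """
    # Gaussian visible term: (v_i - b_i)^2 / (2 sigma_i^2)
    E = sum((vi - bi) ** 2 / (2 * si ** 2) for vi, bi, si in zip(v, b, sigma))
    # Bias on the selected category of each hidden group.
    E -= sum(cjk * hjk for cj, hj in zip(c, h) for cjk, hjk in zip(cj, hj))
    # Visible-hidden coupling, scaled by the visible variances.
    for i, (vi, si) in enumerate(zip(v, sigma)):
        for j, hj in enumerate(h):
            for k, hjk in enumerate(hj):
                E -= vi / si ** 2 * W[i][j][k] * hjk
    return E
```

With one category per group this reduces to the ordinary GRBM energy, which is a useful sanity check on the parameterization.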

Uncovering a Winning Lottery Ticket with Continuously Relaxed Bernoulli Gates

arXiv:2603.08914v1 Announce Type: new Abstract: Over-parameterized neural networks incur prohibitive memory and computational costs for resource-constrained deployment. The Strong Lottery Ticket (SLT) hypothesis suggests that randomly initialized networks contain sparse subnetworks achieving competitive accuracy without weight training. Existing SLT methods,…

Automating Forecasting Question Generation and Resolution for AI Evaluation

arXiv:2601.22444v2 Announce Type: replace Abstract: Forecasting future events is highly valuable in decision-making and is a robust measure of general intelligence. As forecasting is probabilistic, developing and evaluating AI forecasters requires generating large numbers of diverse and difficult questions, and…

Adversarial Latent-State Training for Robust Policies in Partially Observable Domains

arXiv:2603.07313v2 Announce Type: replace Abstract: Robustness under latent distribution shift remains challenging in partially observable reinforcement learning. We formalize a focused setting where an adversary selects a hidden initial latent distribution before the episode, termed an adversarial latent-initial-state POMDP. Theoretically,…
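In the setting sketched above, the adversary commits to a hidden initial latent distribution before the episode. Over a finite latent set, the adversary's worst case is a point mass on the worst single state (any mixture can only average upward), so robust evaluation reduces to a min over latent states and robust policy selection to a max-min. A toy sketch of that reduction (illustrative only, not the paper's formulation):

```python
def worst_case_value(returns_by_latent):
    """Expected return of a policy under the adversary's best response:
    a point mass on the latent initial state with the lowest return."""
    return min(returns_by_latent)

def robust_best_policy(candidates):
    """Max-min selection: pick the candidate policy whose worst-case
    return over hidden initial latent states is highest."""
    return max(candidates, key=worst_case_value)
```

A policy with a high average but a bad worst-case latent state loses here to a more conservative one, which is the behavior adversarial latent-state training is meant to induce.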