Archives AI News

The Path Not Taken: Duality in Reasoning about Program Execution

arXiv:2604.20917v1 Announce Type: new Abstract: Large language models (LLMs) have shown remarkable capabilities across diverse coding tasks. However, their adoption requires a genuine understanding of program execution rather than reliance on surface-level patterns. Existing benchmarks primarily focus on predicting program…

Absorber LLM: Harnessing Causal Synchronization for Test-Time Training

arXiv:2604.20915v1 Announce Type: new Abstract: Transformers suffer from a self-attention cost that grows with sequence length, making inference over long streams prohibitively memory-intensive. Constant-memory alternatives such as RNNs and SSMs compress history into states with…
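The constant-memory property the abstract contrasts with attention can be illustrated with a generic linear recurrence: the entire stream is folded into a fixed-size state, so memory does not grow with sequence length. This is a minimal sketch of that general idea, not the Absorber LLM mechanism itself; the function name and coefficients are illustrative assumptions.

```python
def linear_recurrence(xs, a=0.9, b=0.1):
    """Fold a stream into one scalar state: h_t = a*h_{t-1} + b*x_t.

    Illustrative only: a fixed-size state is updated per token, so
    memory stays O(1) however long the stream is, unlike attention's
    O(T) key/value cache.
    """
    h = 0.0
    for x in xs:
        h = a * h + b * x  # constant memory regardless of stream length
    return h
```

With `a=0.5, b=1.0` and the stream `[1.0, 1.0, 1.0]`, the state evolves 1.0 → 1.5 → 1.75, showing older inputs decaying inside the compressed state.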

FairyFuse: Multiplication-Free LLM Inference on CPUs via Fused Ternary Kernels

arXiv:2604.20913v1 Announce Type: new Abstract: Large language models are increasingly deployed on CPU-only platforms where memory bandwidth is the primary bottleneck for autoregressive generation. Weight quantization to four bits or below reduces memory pressure, yet existing systems still dequantize weights…
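The "multiplication-free" idea in the title can be sketched in general terms: with weights restricted to the ternary set {-1, 0, +1}, a dot product needs only additions and subtractions. This is a hedged illustration of that arithmetic identity, not the paper's fused CPU kernels; the function name is an assumption.

```python
def ternary_dot(weights, activations):
    """Dot product with ternary weights in {-1, 0, +1}, using only
    additions and subtractions (no multiplications)."""
    acc = 0.0
    for w, x in zip(weights, activations):
        if w == 1:
            acc += x
        elif w == -1:
            acc -= x
        # w == 0 contributes nothing
    return acc

# Example: weights [1, -1, 0, 1] against activations [2.0, 3.0, 5.0, 1.0]
# gives 2.0 - 3.0 + 1.0 = 0.0
```

Real kernels would of course vectorize this over packed ternary weights; the point here is only that the inner loop contains no multiply.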

Fixation Sequences as Time Series: A Topological Approach to Dyslexia Detection

arXiv:2604.21698v1 Announce Type: cross Abstract: Persistent homology, a method from topological data analysis, extracts robust, multi-scale features from data. It produces stable representations of time series by applying varying thresholds to their values (a process known as a filtration). We…
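The thresholding process the abstract names can be made concrete with standard 0-dimensional sublevel-set persistence: as the threshold rises, each local minimum of the series spawns a connected component, and when two components meet, the younger one "dies" (the elder rule). Below is a minimal union-find sketch of that textbook construction, assuming nothing about the paper's specific pipeline.

```python
def sublevel_persistence(values):
    """0-dimensional sublevel-set persistence of a 1-D time series.

    Returns sorted (birth, death) pairs. The globally oldest component
    never dies; here it is paired with the series maximum for display.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    parent = [None] * n            # union-find; None = not yet in filtration
    birth = {}                     # root index -> birth value

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    for i in order:                # raise the threshold value by value
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):   # check already-active neighbors
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # elder rule: the younger component dies at values[i]
                    young, old = (ri, rj) if birth[ri] >= birth[rj] else (rj, ri)
                    if birth[young] < values[i]:   # skip zero-persistence pairs
                        pairs.append((birth[young], values[i]))
                    parent[young] = old
    pairs.append((min(values), max(values)))       # surviving component
    return sorted(pairs)
```

For the series `[0.0, 2.0, 1.0, 3.0]`, the local minimum at value 1.0 merges into the older component at threshold 2.0, yielding the pairs `[(0.0, 3.0), (1.0, 2.0)]`; such birth/death pairs are the "stable representations" such methods feed to downstream classifiers.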

ILDR: Geometric Early Detection of Grokking

arXiv:2604.20923v1 Announce Type: new Abstract: Grokking describes a delayed generalization phenomenon in which a neural network achieves perfect training accuracy long before validation accuracy improves, followed by an abrupt transition to strong generalization. Existing detection signals are indirect: weight norm…