Probing the Limits of Compressive Memory: A Study of Infini-Attention in Small-Scale Pretraining
arXiv:2512.23862v1 Announce Type: new Abstract: This study investigates small-scale pretraining for Small Language Models (SLMs) to enable efficient use of limited data and compute, improve accessibility in low-resource settings, and reduce costs. To enhance long-context extrapolation in compact models, we…
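The compressive memory probed here is the Infini-Attention mechanism, in which each attention head keeps a fixed-size associative memory that is read before and written after every segment, so context length grows without growing the KV cache. The sketch below is a minimal, illustrative NumPy rendering of that segment-level recurrence as described in the original Infini-Attention formulation (memory readout via a normalized linear-attention lookup, followed by an additive key-value outer-product update); the function and variable names are ours, and details such as the delta-rule update and the learned gate that mixes memory readout with local attention are omitted.

```python
import numpy as np

def elu_plus_one(x):
    # sigma(x) = ELU(x) + 1, the nonlinearity used for linear-attention-style memories.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def infini_memory_step(Q, K, V, M, z):
    """One segment of compressive-memory attention (illustrative sketch).

    Q, K, V : (seg_len, d_head) projections for the current segment.
    M       : (d_head, d_head) running associative memory from earlier segments.
    z       : (d_head,) running normalization term.
    Returns the memory readout for this segment and the updated (M, z).
    """
    sQ, sK = elu_plus_one(Q), elu_plus_one(K)
    # Read: retrieve values written by *previous* segments, normalized by z.
    A_mem = (sQ @ M) / ((sQ @ z)[:, None] + 1e-6)
    # Write: fold the current segment's key-value pairs into the fixed-size memory.
    M_new = M + sK.T @ V
    z_new = z + sK.sum(axis=0)
    return A_mem, M_new, z_new

# Usage sketch: stream segments through the same fixed-size memory.
d_head, seg_len = 16, 8
M, z = np.zeros((d_head, d_head)), np.zeros(d_head)
for _ in range(4):  # four consecutive segments of a long sequence
    Q, K, V = (np.random.randn(seg_len, d_head) for _ in range(3))
    A_mem, M, z = infini_memory_step(Q, K, V, M, z)
```

Because M and z have constant size regardless of how many segments have been processed, memory and compute per segment stay bounded, which is what makes the mechanism attractive for long-context extrapolation in compact models.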
