Llamazip: Leveraging LLaMA for Lossless Text Compression and Training Dataset Detection
arXiv:2511.17589v1. Abstract: This work introduces Llamazip, a novel lossless text compression algorithm based on the predictive capabilities of the LLaMA3 language model. Llamazip achieves significant data reduction by storing only the tokens that the model fails to predict,…
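
The core idea stated in the abstract, keeping only the tokens a predictor gets wrong, can be illustrated with a minimal sketch. This is not the Llamazip implementation: a toy bigram predictor stands in for LLaMA3, whitespace-split words stand in for model tokens, and the `HIT` marker, `build_predictor`, `compress`, and `decompress` are hypothetical names chosen for illustration. The sketch only shows why the scheme is lossless when compressor and decompressor share the same deterministic predictor.

```python
from collections import Counter, defaultdict

# Marker meaning "the predictor guessed this token"; literal tokens are strings,
# so None can never collide with a stored token. (Illustrative encoding only.)
HIT = None


def build_predictor(corpus_tokens):
    """Build a deterministic next-token predictor from bigram counts.

    Stand-in for a frozen language model: same input context, same prediction.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
        counts[prev][nxt] += 1
    # Most frequent successor; ties broken alphabetically so both sides agree.
    table = {
        prev: min(successors.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for prev, successors in counts.items()
    }

    def predict(context):
        # Returns None when there is no context or the last token is unseen.
        return table.get(context[-1]) if context else None

    return predict


def compress(tokens, predict):
    """Replace every correctly predicted token with HIT; keep the misses."""
    return [
        HIT if predict(tokens[:i]) == tok else tok
        for i, tok in enumerate(tokens)
    ]


def decompress(stream, predict):
    """Rebuild the token list by re-running the same predictor on each HIT."""
    tokens = []
    for item in stream:
        tokens.append(predict(tokens) if item is HIT else item)
    return tokens


predict = build_predictor("the cat sat on the mat".split())
original = "the cat sat on the dog".split()
stream = compress(original, predict)
restored = decompress(stream, predict)
```

Here `stream` keeps only the first token (no context to predict from) and the final `"dog"` (the predictor expected `"cat"`), and `restored` equals `original`. A real system would also need a compact binary encoding for the run of hits, which this sketch leaves out.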
