Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2–4× Faster than Previous Models

Why was a new multilingual encoder needed? XLM-RoBERTa (XLM-R) has dominated multilingual NLP for more than 5 years, an unusually long reign in AI research. While encoder-only models like BERT and RoBERTa were central to early progress, most research energy shifted toward decoder-based generative models. Encoders, however, remain more efficient and often outperform decoders on […] The post Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2–4× Faster than Previous Models appeared first on MarkTechPost.

2025-09-11 07:00 GMT · 10 months ago www.marktechpost.com

Why was a new multilingual encoder needed? XLM-RoBERTa (XLM-R) has dominated multilingual NLP for more than 5 years, an unusually long reign in AI research. While encoder-only models like BERT and RoBERTa were central to early progress, most research energy shifted toward decoder-based generative models. Encoders, however, remain more efficient and often outperform decoders on […] The post Meet mmBERT: An Encoder-only Language Model Pretrained on 3T Tokens of Multilingual Text in over 1800 Languages and 2–4× Faster than Previous Models appeared first on MarkTechPost.

Original: https://www.marktechpost.com/2025/09/10/meet-mmbert-an-encoder-only-language-model-pretrained-on-3t-tokens-of-multilingual-text-in-over-1800-languages-and-2-4x-faster-than-previous-models/