New IBM Granite 4 Models to Reduce AI Costs with Inference-Efficient Hybrid Mamba-2 Architecture

November 18, 2025

2025-11-17 17:48 GMT · 5 months ago aimagpro.com

IBM recently announced the Granite 4.0 family of small language models. The model family aims to deliver faster speeds and significantly lower operational costs at acceptable accuracy vs. larger models. Granite 4.0 features a new hybrid Mamba/transformer architecture that largely reduces memory requirements, enabling Granite to run on significantly cheaper GPUs and at significantly reduced costs. By Bruno Couriol