Presentation: GenAI at Scale: What It Enables, What It Costs, and How To Reduce the Pain

Mark Kurtz explains how to overcome the technical and financial hurdles of scaling GenAI. He shares how to optimize LLM deployments with open-source tools, including vLLM for efficient serving, LLM Compressor for model compression, and InstructLab for fine-tuning with synthetic data. He provides a deep dive into balancing performance, accuracy, and cost to ensure successful production deployment. By Mark Kurtz

2025-09-08 14:00 GMT · www.infoq.com

Original: https://www.infoq.com/presentations/genai-scale/?utm_campaign=infoq_content&utm_source=infoq&utm_medium=feed&utm_term=AI%2C+ML+%26+Data+Engineering