Quantization for GenAI Models
Unlock the power of model optimization! Learn how to apply quantization in Python to make your GenAI models more efficient.
What you will learn:
Understand model optimization techniques: Pruning, Distillation, and Quantization
Learn the basics of data types like FP32, FP16, BFloat16, and INT8
Master downcasting from FP32 to BF16 and FP32 to INT8
Learn the difference between symmetric and asymmetric quantization
Implement quantization techniques in Python with real examples
Apply quantization to make models more efficient and deployment-ready
Gain practical skills to optimize models for edge devices and resource-constrained environments
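
The difference between symmetric and asymmetric quantization mentioned in the list above can be sketched in a few lines of NumPy. This is a minimal illustration rather than course material: the function names (quantize_symmetric, quantize_asymmetric, dequantize) and the simple per-tensor min/max calibration are assumptions chosen for the example.

```python
import numpy as np

INT8_MIN, INT8_MAX = -128, 127


def quantize_symmetric(x):
    """Map FP32 values to INT8 with the zero point fixed at 0.

    Assumption: per-tensor scale taken from the largest absolute value,
    using the restricted range [-127, 127] so it is symmetric around 0.
    """
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / INT8_MAX if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -INT8_MAX, INT8_MAX).astype(np.int8)
    return q, scale


def quantize_asymmetric(x):
    """Map FP32 values to INT8 using a scale and a zero point.

    Assumption: per-tensor min/max calibration; the range is widened to
    include 0.0 so that zero is represented exactly.
    """
    x_min = min(float(np.min(x)), 0.0)
    x_max = max(float(np.max(x)), 0.0)
    span = x_max - x_min
    scale = span / (INT8_MAX - INT8_MIN) if span > 0 else 1.0
    # The zero point is the INT8 code that represents the real value 0.0.
    zero_point = int(round(INT8_MIN - x_min / scale))
    zero_point = max(INT8_MIN, min(INT8_MAX, zero_point))
    q = np.clip(np.round(x / scale) + zero_point, INT8_MIN, INT8_MAX)
    return q.astype(np.int8), scale, zero_point


def dequantize(q, scale, zero_point=0):
    """Recover approximate FP32 values from INT8 codes."""
    return (q.astype(np.float32) - zero_point) * scale


# Hypothetical weights, skewed toward positive values.
weights = np.array([-0.5, 0.0, 0.25, 1.5], dtype=np.float32)

q_sym, scale_sym = quantize_symmetric(weights)
q_asym, scale_asym, zp = quantize_asymmetric(weights)

print("symmetric :", q_sym, dequantize(q_sym, scale_sym))
print("asymmetric:", q_asym, dequantize(q_asym, scale_asym, zp))
```

In this sketch, the symmetric scheme keeps the zero point at 0, which wastes part of the INT8 range when the values are skewed to one side. The asymmetric scheme spends one extra parameter (the zero point) to use the full range from -128 to 127, which usually gives a smaller step size for skewed data.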