Quantization

Quantization is a technique to:

  • Compress large models by reducing numeric precision (e.g., float32 → int8).
  • Make them faster and lighter on memory, so they can run even on a CPU or a mobile device.

Tools like llama.cpp, ggml, and mlc-llm quantize models so they can run on Apple Silicon (e.g., M1), a Raspberry Pi, or Android.
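To make the idea concrete, here is a minimal sketch of symmetric int8 quantization in plain Python: each float weight is divided by a single scale factor and rounded to an integer in [-127, 127]. This is an illustration of the general technique, not the actual scheme used by any of the tools above (which use more elaborate block-wise formats).

```python
def quantize_int8(weights):
    """Map float weights to int8 values with one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0  # guard against all-zero input
    q = [round(w / scale) for w in weights]    # integers in [-127, 127]
    return q, scale

def dequantize(q, scale):
    """Approximately recover the original floats."""
    return [x * scale for x in q]

weights = [0.5, -1.2, 0.03, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each int8 value needs 1 byte instead of float32's 4, a 4x size reduction, at the cost of a small rounding error per weight (at most half the scale factor).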