Open-source platform for the full ML lifecycle, including experiment tracking, model packaging, model registry, and deployment.
Hugging Face's production-ready toolkit for deploying LLMs. Optimized for high throughput with tensor parallelism, quantization, and Flash Attention.
Free, open-source alternative to the OpenAI API. Run LLMs and generate images, audio, and more, locally or on-prem, with no GPU required.