LLM inference in pure C/C++. Run LLaMA and other models on consumer hardware with CPU and GPU support. The engine behind many local AI apps.
Hugging Face's production-ready toolkit for deploying LLMs. Optimized for high throughput with tensor parallelism, quantization, and Flash Attention.
TypeScript toolkit for building AI-powered applications with React, Next.js, Vue, Svelte, and Node.js. Unified API for multiple AI providers with streaming.
Free, open source alternative to OpenAI API. Run LLMs, generate images, audio, and more locally or on-prem with no GPU required.
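Because the server above exposes an OpenAI-compatible REST API, any OpenAI client can talk to it by pointing at a local base URL. A minimal sketch of building and sending a chat-completion request, assuming a server listening on localhost:8080 and a locally configured model named "llama-3" (both are assumptions; adjust to your deployment):

```typescript
// Shape of a message in the OpenAI-compatible chat completions API.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the JSON body for POST /v1/chat/completions.
function buildChatRequest(model: string, prompt: string) {
  return {
    model,
    messages: [{ role: "user", content: prompt } as ChatMessage],
  };
}

// Send the request to a local OpenAI-compatible server.
// BASE_URL and the model name are assumptions for this sketch.
const BASE_URL = "http://localhost:8080/v1";

async function chat(prompt: string): Promise<string> {
  const res = await fetch(`${BASE_URL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest("llama-3", prompt)),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Because the wire format matches OpenAI's, switching an existing app over is usually just a base-URL change, with no code rewrite.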