LLM inference in pure C/C++. Run LLaMA and other models on consumer hardware with CPU and GPU support. The engine behind many local AI apps.
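A minimal sketch of building and running such an engine from source, assuming a llama.cpp-style CMake build and a GGUF model file you have already downloaded (the model path is a placeholder):

```shell
# Build the project with CMake (add -DGGML_CUDA=ON for NVIDIA GPU support)
cmake -B build
cmake --build build --config Release

# Run a prompt against a local GGUF model (path is a placeholder)
./build/bin/llama-cli -m ./models/model.gguf -p "Explain quicksort briefly" -n 128
```

The same engine can also be embedded as a C/C++ library, which is how many downstream local AI apps consume it.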
Run large language models locally. Get up and running with LLaMA, Mistral, Gemma, and other open models with a single command.
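The "single command" workflow can be sketched as follows, assuming an Ollama-style CLI is installed and the model name is available in its registry:

```shell
# Pull and run a model in one step; the first run downloads the weights
ollama run llama3 "Why is the sky blue?"

# List models already downloaded locally
ollama list
```

The first invocation blocks while the model downloads; subsequent runs start immediately from the local cache.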
AI pair programming in your terminal. Chat with GPT-4/Claude to edit code in your local git repos. Understands entire codebases.
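A typical session with an Aider-style tool might look like this, a sketch assuming the tool is installed via pip and an API key for the chosen model provider is set in the environment:

```shell
# Install the CLI (package name assumed: aider-chat)
python -m pip install aider-chat

# Provide credentials for the backing model (placeholder key)
export OPENAI_API_KEY=sk-...

# Start a chat session from inside a git repo, naming the files to edit
cd my-project
aider src/app.py
```

Edits the assistant proposes are applied to the working tree and committed to git, so every change can be reviewed or reverted with normal git tooling.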