Llamafile

Active
GitHub C++ NOASSERTION

Description

Mozilla's approach to packaging LLMs as a single executable with zero dependencies.

Key Features

  • Single file — Model weights plus runtime packed into one exe
  • Zero deps — No Python, CUDA, or pip needed
  • Cross-platform — Linux/macOS/Windows with the same experience
  • OpenAI compatible — Built-in OpenAI API server
  • Multi-model — Llama, Mistral, Phi, etc.
  • Distributable — Easy to embed in desktop apps

Use Cases

💡 Embed local LLMs in desktop apps.
💡 Provide LLM capabilities for one-off scripts.
💡 Run LLMs on servers without Python.

Quick Start

# Download a llamafile
curl -L -o llamafile https://llamafile.ai/...
chmod +x llamafile
# Start the OpenAI-compatible server
./llamafile --server

Related Projects