Chroma
Description
Chroma is an open-source, AI-native embedding database for building LLM applications. It provides a simple API for storing embeddings and running similarity search, which makes it a good fit for retrieval-augmented generation (RAG) applications.
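To make "similarity search" concrete, here is a minimal pure-Python sketch of what such a lookup does conceptually: score every stored vector against a query vector and keep the best matches. This is an illustration only, not Chroma's implementation; the names cosine_similarity, top_k, and store are hypothetical, and the two-dimensional vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" keyed by document id (made-up values for illustration).
store = {
    "doc1": [1.0, 0.0],
    "doc2": [0.0, 1.0],
    "doc3": [0.7, 0.7],
}

nearest = top_k([1.0, 0.1], store, k=2)
nearest_ids = [doc_id for doc_id, _ in nearest]
print(nearest_ids)
```

An embedding database replaces this brute-force loop with an index so that lookups stay fast as the collection grows.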
Key Features
- Minimal core API: four core functions (create_collection, add, query, get), designed to get you up and running in about 5 minutes
- Auto embedding: add tokenizes, embeds, and indexes documents automatically, with no manual preprocessing
- Metadata filtering: filter results by metadata fields and by full-text search over document content
- Hybrid search: combines vector similarity search with full-text retrieval
- Multi-language clients: Python and JavaScript/TypeScript clients, each installable with a single pip or npm command
- Persistent storage: in-memory mode for prototyping, persistent mode for production, and chroma run to start a server
Quick Start
pip install chromadb
import chromadb
# Start in-memory for quick prototyping
client = chromadb.Client()
collection = client.create_collection('my-docs')
# Add documents (auto-embedding)
collection.add(
    documents=['This is document 1', 'This is document 2'],
    metadatas=[{'source': 'notion'}, {'source': 'google-docs'}],
    ids=['doc1', 'doc2']
)
# Query top 2 most similar results
results = collection.query(
    query_texts=['query document'],
    n_results=2
)
print(results)