Kotaemon

Active
GitHub Python Apache-2.0

Description

Kotaemon is an open-source RAG-based tool for chatting with your documents, featuring a clean chat interface and support for multiple LLM and embedding model backends.

Key Features

  • Hybrid RAG pipeline — combines full-text and vector retrieval, then re-ranks the merged results to improve retrieval quality
  • Multimodal document QA — supports PDF, HTML, XLSX formats with figure and table extraction from documents
  • Advanced citations with document preview — relevance-scored citations with in-browser PDF viewer and highlight support
  • Multiple reasoning modes — question decomposition for complex multi-hop Q&A, plus ReAct and ReWOO agent-based reasoning
  • Configurable settings UI — adjust retrieval and generation parameters directly in the UI including prompt templates
  • Multi-user collaboration — supports multi-user login, private/public collection management, and chat sharing
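
As a rough illustration of the hybrid retrieval idea in the first feature above, the sketch below merges a full-text ranking and a vector ranking with reciprocal rank fusion (RRF). RRF is a common fusion technique chosen here for illustration; this listing does not say which fusion or re-ranking method Kotaemon actually uses. The function name, the document IDs, and the constant k=60 are all assumptions, and the model-based re-ranking step is left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Illustrative only; not Kotaemon's code.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for one query from two retrievers.
full_text_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([full_text_hits, vector_hits])
print(fused)
```

Documents that both retrievers return (doc_a and doc_c here) rise above those found by only one of them, which is the usual motivation for combining keyword and vector search before a re-ranker sees the candidates.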

Use Cases

💡 Personal document knowledge base: upload PDFs and chat with them through a conversational interface
💡 Team knowledge sharing platform: create private document collections for collaborative team use
💡 Academic literature research: multimodal Q&A on papers with figure data extraction and citation tracking
💡 Enterprise compliance document retrieval: quickly locate relevant clauses and explanations in compliance files
💡 Multilingual document processing: privacy-friendly offline Q&A using local LLMs served through Ollama

Quick Start

```bash
docker run \
  -e GRADIO_SERVER_NAME=0.0.0.0 \
  -e GRADIO_SERVER_PORT=7860 \
  -v ./ktem_app_data:/app/ktem_app_data \
  -p 7860:7860 -it --rm \
  ghcr.io/cinnamon/kotaemon:main-full
```

Related Projects