RAGatouille
Easily use and train state-of-the-art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease of use, backed by research.
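Late interaction scores a query against a document at the token level: each query token embedding is matched to its best document token, and the per-token maxima are summed (MaxSim). A minimal pure-Python sketch of that scoring rule with toy 2-d vectors (illustrative only, not the RAGatouille or ColBERT API):

```python
# Late-interaction (MaxSim) scoring: for each query token embedding,
# take its best dot-product match among document token embeddings, then sum.
# Toy 2-d vectors for illustration; real models use learned token embeddings.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_toks, doc_toks):
    return sum(max(dot(q, d) for d in doc_toks) for q in query_toks)

query = [[1.0, 0.0], [0.0, 1.0]]   # two query token embeddings
doc_a = [[0.9, 0.1], [0.1, 0.9]]   # aligns well with both query tokens
doc_b = [[0.5, 0.5], [0.5, 0.5]]   # aligns with neither strongly

print(maxsim_score(query, doc_a))  # 1.8
print(maxsim_score(query, doc_b))  # 1.0
```

Because matching happens per token rather than via one pooled vector, late interaction preserves fine-grained term overlap that single-vector retrieval can wash out.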
Tools for retrieval-augmented generation
A collection of projects showcasing RAG, agents, workflows, and other AI use cases with practical examples and tutorials.
A MemAgent framework that can extrapolate to 3.5M context tokens, along with a training framework for RL training of any agent workflow.
Open-source BGE series embedding models and retrieval tools from BAAI, providing state-of-the-art text embeddings and rerankers for Chinese and English, widely used in RAG systems and agent retrieval pipelines.
LightRAG is a simple and fast Retrieval-Augmented Generation framework using graph-enhanced retrieval, published at EMNLP 2025.
All-in-one RAG framework supporting text, images, tables, equations and more document formats for retrieval-augmented generation with unified knowledge QA.
An open-source graph-vector database built from scratch in Rust, combining graph database and vector retrieval capabilities to provide AI agents with unified storage for both knowledge graphs and semantic search.
An AI-powered answering engine with multi-model integration, web search and local knowledge base, providing a Perplexity-like search experience.
AutoRAG is an open-source RAG evaluation and optimization framework using AutoML-style automation to help developers automatically find the best RAG pipeline configurations and benchmark them.
VectorAdmin is the universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with an intuitive web interface for data import, querying, and maintenance.
A comprehensive showcase of advanced Retrieval-Augmented Generation (RAG) techniques with detailed notebook tutorials and code examples, covering foundational to cutting-edge RAG implementations.
NeurIPS 2024 RAG framework inspired by human long-term memory, combining knowledge graphs with personalized PageRank for continuous knowledge integration in LLMs.
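Personalized PageRank biases the random walk's restart distribution toward query-linked seed nodes, so scores concentrate in their graph neighborhood. A toy power-iteration sketch of the idea (illustrative only; HippoRAG's actual implementation differs):

```python
# Personalized PageRank by power iteration on a toy directed graph.
# The restart ("teleport") mass goes only to seed nodes, so scores are
# biased toward the neighborhood of the query-relevant seeds.

def personalized_pagerank(edges, seeds, damping=0.85, iters=50):
    nodes = sorted({n for e in edges for n in e})
    out = {n: [d for s, d in edges if s == n] for n in nodes}
    rank = {n: (1 / len(seeds) if n in seeds else 0.0) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - damping) / len(seeds) if n in seeds else 0.0 for n in nodes}
        for n in nodes:
            if out[n]:
                share = damping * rank[n] / len(out[n])
                for d in out[n]:
                    nxt[d] += share
            else:  # dangling node: return its mass to the seeds
                for s in seeds:
                    nxt[s] += damping * rank[n] / len(seeds)
        rank = nxt
    return rank

edges = [("query", "fact1"), ("fact1", "fact2"),
         ("fact2", "query"), ("other", "fact2")]
scores = personalized_pagerank(edges, seeds={"query"})
print(max(scores, key=scores.get))  # "query" ranks highest; "other" gets 0
```

Nodes reachable from the seeds inherit score through the walk, while disconnected facts ("other" above) stay at zero, which is exactly the bias a retrieval step wants.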
A low-code MCP framework for building complex and innovative RAG pipelines. Combines visual pipeline design with MCP protocol integration for end-to-end RAG — from data ingestion and chunking to retrieval and generation.
KAG is a logical form-guided reasoning and retrieval framework based on the OpenSPG engine and LLMs, for building logical reasoning and factual Q&A solutions over professional domain knowledge bases. It effectively overcomes the limitations of the vector-similarity models used in traditional RAG.
Opinionated RAG framework for integrating GenAI into your apps. Works with any LLM, any vectorstore, any files — so you can focus on your product instead of building RAG pipelines.
A production-ready Agentic RAG system with RESTful API, featuring multimodal document ingestion, hybrid search, knowledge graph construction, and agent-driven retrieval-augmented generation workflows.
100+ AI Agent and RAG apps you can actually run — clone, customize, and ship. A great reference for quickly building LLM-powered applications.
EmbedAnything is a highly performant, modular, and memory-safe embedding inference and indexing framework built in Rust, providing production-ready RAG ingestion and indexing pipelines for local and cloud deployment.
Vectra is a local vector database for Node.js with features similar to Pinecone but built using local files. It supports semantic search and document embeddings with no external service dependencies, ideal for RAG application development in Node.js environments.
LLM-powered stock analysis system for A/H/US markets with multi-source quotes, real-time news, LLM decision dashboard and multi-channel push notifications.
LLM-driven extraction of unstructured data, built for API deployments and ETL pipeline workflows. Automates document parsing, PDF extraction, and intelligent data processing.
SQL-Driven RAG Engine that automatically builds knowledge graphs during querying, combining SQL query capabilities with Retrieval-Augmented Generation for efficient knowledge retrieval.
AI Data Runtime for Agents. Provides serverless Postgres with a multimodal datalake, enabling scalable retrieval and training. Unifies vector storage, dataset management, and streaming data loading for AI agent workflows.
The open-source RAG platform with built-in citations, deep research, 22+ file formats, partitions, and MCP server.
A multi-modal multi-agent framework for document understanding that leverages multiple specialized agents to analyze and comprehend complex documents.
Open-source context retrieval layer for AI agents that automatically extracts, indexes, and retrieves structured context from diverse data sources.
A lightweight, lightning-fast, in-process vector database by Alibaba with C++ core, Node.js and Python bindings, designed for RAG, agent memory, and vector search use cases.
A web scraping and browser automation library for Node.js to build reliable crawlers, supporting Puppeteer, Playwright, Cheerio, and raw HTTP. Extract data for AI, LLMs, RAG, or GPTs with proxy rotation and both headful and headless modes.
An interactive visualization tool for large embeddings by Apple. Explore, cross-filter, and search embeddings and metadata to understand and debug embedding models, vector retrieval, and RAG system behavior.
A private AI platform for agents, assistants, and enterprise search with built-in agent builder, deep research, document analysis, and multi-model support.
A vector search SQLite extension. Add vector similarity search to SQLite with float32/int8 vectors — ideal for local RAG applications.
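The idea behind such extensions is to keep vectors inside SQLite rows. As a rough stdlib illustration (not the extension's API, which does this natively in C with proper indexing), float32 vectors can be packed into BLOBs and scanned brute-force:

```python
import sqlite3
import struct

# Store float32 vectors as BLOBs in SQLite and do a brute-force
# nearest-neighbor scan in Python. A real vector extension does this
# natively and much faster; this sketch only shows the storage idea.

def pack(vec):
    return struct.pack(f"{len(vec)}f", *vec)

def unpack(blob):
    return struct.unpack(f"{len(blob) // 4}f", blob)

def l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id TEXT, embedding BLOB)")
db.executemany("INSERT INTO docs VALUES (?, ?)", [
    ("cat", pack([1.0, 0.0])),
    ("dog", pack([0.9, 0.1])),
    ("car", pack([0.0, 1.0])),
])

query = [1.0, 0.05]
rows = db.execute("SELECT id, embedding FROM docs").fetchall()
nearest = sorted(rows, key=lambda r: l2(query, unpack(r[1])))
print([r[0] for r in nearest])  # ['cat', 'dog', 'car']
```

Four bytes per dimension (float32) is what makes this compact enough for local-first RAG; int8 quantization, which such extensions also offer, cuts that by another 4x at some recall cost.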
Transforms PDF, documents and images into enriched structured data with table recognition, reading order restoration, and Markdown output.
Run any open-source LLMs such as DeepSeek and Llama as OpenAI-compatible API endpoints in the cloud. Supports fine-tuning, quantization, and distributed inference for production-grade LLM deployment.
Open-source text-to-SQL and text-to-chart GenBI agent with a semantic layer. Ask your database questions in natural language and get accurate SQL, charts, and BI insights. Supports 12+ data sources and any LLM.
An open-source enterprise-level AI knowledge base and MCP management platform with integrated knowledge retrieval, model management, and agent chat for enterprise AI applications.
A local knowledge base RAG and Agent application platform built on Langchain with support for ChatGLM, Qwen, Llama and other LLMs, offering conversation, knowledge base management, and agent capabilities.
The lightweight ingestion library for fast, efficient and robust RAG pipelines. Supports multiple chunking strategies and embedding models to significantly improve retrieval-augmented generation results.
Chroma is an open-source AI-native embedding database designed for building LLM applications. It provides simple APIs to store embeddings and perform similarity search, making it ideal for RAG applications.
CodeFuse-muAgent is an innovative agent framework driven by a knowledge graph engine, integrating EKG (Enterprise Knowledge Graph) technology for multi-agent collaboration, RAG-enhanced retrieval, and tool learning.
Contextal is a context management and retrieval-enhancement tool for multi-turn agents, long conversations, and complex knowledge injection workflows.
CozoDB is a transactional, relational-graph-vector database that uses Datalog for queries. Designed as the hippocampus for AI, it unifies graph traversal, vector search, and relational queries.
Crawl4AI is a web crawling toolkit for LLM and agent systems, offering structured extraction, site traversal, cleanup, and crawl controls for external knowledge acquisition.
Comprehensive guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.
A Data Agent Ready Warehouse unifying Analytics, Search, AI, and Python Sandbox in one system. Runs on your S3 with built-in vector search, full-text search, and Python execution for AI-powered data analysis.
Open-source LLM DevOps platform providing one-stop AI application development with GenAI workflow, RAG, Agent, model management, evaluation, and enterprise system administration.
JVector is an advanced embedded vector search engine, built in pure Java by DataStax. It provides high-performance ANN search for RAG and AI applications on the JVM.
A comprehensive tutorial on AI agent principles and practice, systematically covering core concepts, framework usage and hands-on projects.
Haystack is an enterprise-grade framework for RAG and search applications, covering document processing, retrieval, generation, and evaluation end to end.
All-in-one platform for search, recommendations, RAG, and analytics offered via API. Built in Rust with vector search, full-text search, and semantic reranking for enterprise-grade AI retrieval applications.
A multi-modal vector database that supports upserts and vector queries using unified MySQL-compatible SQL on structured and unstructured data, meeting high concurrency and ultra-low latency requirements.
Docling is an open-source document processing tool by IBM that converts PDF, Word, PPT, HTML and more into structured data for AI, purpose-built for GenAI and RAG pipelines.
Embedchain is a universal memory layer for AI agents, enabling quick integration of diverse data sources into LLMs for context-aware AI applications.
MTEB (Massive Text Embedding Benchmark) is a comprehensive benchmark framework for evaluating text embeddings across classification, retrieval, clustering, reranking, and more, helping select optimal embedding models for RAG systems.
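A retrieval benchmark of this kind ultimately aggregates metrics like recall@k over many queries and datasets. A toy sketch of that computation (illustrative only; not MTEB's API):

```python
# Recall@k: the fraction of queries whose relevant document appears in
# the top-k results. A toy version of the kind of retrieval metric an
# embedding benchmark aggregates per model and per dataset.

def recall_at_k(ranked_results, relevant, k):
    hits = sum(1 for q, docs in ranked_results.items()
               if relevant[q] in docs[:k])
    return hits / len(ranked_results)

ranked = {"q1": ["d3", "d1", "d7"],
          "q2": ["d2", "d9", "d4"],
          "q3": ["d8", "d5", "d6"]}
gold   = {"q1": "d1", "q2": "d2", "q3": "d6"}

print(recall_at_k(ranked, gold, k=1))  # only q2 hits at rank 1
print(recall_at_k(ranked, gold, k=3))  # 1.0: all relevant docs in top 3
```

Comparing such scores across models on the same datasets is what lets a benchmark rank embedding models for a given RAG workload.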
A high-performance vector database designed to handle up to 1 billion vectors on a single node, delivering significant performance gains through optimized indexing and execution. Also available as a cloud service.
DB-GPT is an open-source agentic AI data assistant framework integrating multi-agent collaboration, RAG, and AWEL workflow engine, purpose-built for AI+Data applications.
Ragas is a framework for evaluating RAG (Retrieval-Augmented Generation) systems. It provides evaluation metrics such as faithfulness, answer relevance, and context precision, helping developers optimize RAG application performance.
A high-performance graph database built on GraphBLAS, optimized for LLM and GraphRAG scenarios with real-time knowledge graph construction and querying for graph-structured AI agent retrieval.
A universal local knowledge base solution based on vector databases and GPT, providing one-stop document processing with vectorization, semantic search, and intelligent Q&A for building private knowledge bases.
OCR and document extraction tool using vision models, efficiently converting PDFs and images into structured text.
Graphiti is a temporal knowledge-graph engine for agent memory, helping systems continuously accumulate long-term context.
An educational Agentic RAG project with clean code demonstrating how to build RAG systems with agent capabilities — routing, retrieval, evaluation, and iterative refinement.
TrustRAG is a RAG framework focused on reliable input and trusted output, providing complete RAG pipeline components including document parsing, chunking, retrieval, and reranking with multiple retrieval strategies and evaluation methods.
Official Google Gemini fullstack quickstart using LangGraph. Complete React + Python implementation for building production AI agent applications.
A Python library by Google for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization, designed for data annotation and knowledge extraction workflows.
AI Agent Tools library for the Graphlit Platform providing knowledge retrieval and content processing capabilities for Python agents.
PromptTools provides open-source tools for prompt testing and experimentation, supporting multiple LLMs (OpenAI, LLaMA) and vector databases (Chroma, Weaviate, LanceDB) to help developers systematically evaluate and optimize RAG systems.
An enterprise-ready Spring AI platform integrating RAG, tool calling, asynchronous ingestion, JWT/RBAC security, and observability.
A blazing fast inference solution for text embeddings models built in Rust, serving as core infrastructure for building RAG systems and vector retrieval pipelines with high throughput and low latency.
Infinity is an AI-native database providing incredibly fast hybrid search of dense vectors, sparse vectors, tensors, and full-text, designed for LLM applications and RAG systems.
A leading open-source RAG engine that fuses cutting-edge retrieval-augmented generation with agent capabilities to create a superior context layer for LLMs.
A full-stack AI infrastructure tool for data, model, and pipeline orchestration. Streamlines building versatile AI-first applications with a visual pipeline editor for end-to-end workflows from data ingestion to model inference.
Accelerate local LLM inference and finetuning on Intel XPU. Supports LLaMA, Mistral, Qwen, DeepSeek and more. Seamlessly integrates with LangChain, LlamaIndex, and other agent frameworks.
An LLM-based multi-agent framework for web search engines, similar to Perplexity.ai Pro and SearchGPT, enabling intelligent web search.
A production-focused Agentic RAG course teaching how to build scalable, reliable RAG agent systems with indexing strategies, retrieval optimization, and monitoring.
A hyper-fast local vector database for use with LLM Agents, providing lightweight vector storage and similarity search capabilities for embedding as instant memory and knowledge retrieval components in agent applications.
A modular RAG system with an MCP Server architecture, using Skills so the AI follows each step of the spec and the code is completed entirely by AI.
Jina AI Serve is a cloud-native framework for building multimodal AI applications, supporting RAG pipelines, agent systems, and multimodal search.
An open-source JupyterLab extension that connects AI agents to computational notebooks, enabling code generation, error explanation, and document Q&A.
Sparrow is a structured data extraction tool that supports instruction calling with ML, LLM, and Vision LLM for extracting structured information from documents, suitable for document parsing in RAG pipelines.
An embedded property graph database built for speed with built-in vector search and full-text search, implementing Cypher query language for knowledge graph construction and AI agent structured knowledge retrieval.
FastGPT is a knowledge-based platform built on LLMs, offering out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration for easily developing and deploying complex question-answering systems.
An open-source embedded retrieval library for multimodal AI with zero server configuration, using the Lance columnar format for efficient vector search and filtering, ideal for agent memory and RAG applications.
A comprehensive single-package Retrieval-Augmented Generation platform built on Langflow, Docling, and OpenSearch, providing a complete pipeline from document parsing to vector retrieval and generation with multi-model and multi-vector-database support.
A PostgreSQL vector database extension for building AI applications, adding high-performance vector search capabilities to PostgreSQL with support for generating and indexing embeddings directly in the database.
Layra is an enterprise-ready solution combining visual RAG with multi-step agent workflow orchestration, providing out-of-the-box document parsing, knowledge base construction, and intelligent Q&A capabilities.
A hands-on Java and Spring AI project for building AI agents with RAG, tool calling, MCP, and ReAct-style autonomous planning.
llmware is a unified enterprise RAG framework for deploying small specialized models, featuring knowledge graphs, document parsing, vector indexing, and agent toolchains for building private, compliant AI applications.
Mem0 is a long-term memory layer for AI agents, supporting cross-session memory management and personalized context retrieval.
MemVid is a long-term memory layer for AI agents that uses video encoding for lightweight single-file storage, replacing complex RAG pipelines with instant retrieval.
Firecrawl is the Web Data API for AI, turning web pages into clean, structured, LLM-friendly data with crawl, scrape, and search capabilities.
A modular graph-based Retrieval-Augmented Generation system by Microsoft that uses LLMs to extract structured knowledge graphs from text, enabling global and local community summarization queries.
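The indexing stage of such a graph-based system turns extracted (subject, relation, object) triples into a graph and groups entities into communities for summarization. A toy sketch using connected components as a stand-in for the Leiden clustering GraphRAG actually uses:

```python
# Toy GraphRAG-style indexing step: take extracted (subject, relation,
# object) triples and group entities into communities. GraphRAG uses LLM
# extraction and Leiden clustering; union-find connected components
# stand in for that here.

def communities(triples):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for s, _, o in triples:
        union(s, o)
    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(g) for g in groups.values())

triples = [
    ("Alice", "works_at", "Acme"),
    ("Acme", "based_in", "Berlin"),
    ("Bob", "plays", "Chess"),
]
print(communities(triples))  # [['Acme', 'Alice', 'Berlin'], ['Bob', 'Chess']]
```

Each community then gets an LLM-written summary; "global" queries are answered over those summaries, while "local" queries drill into a single community's entities.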
Milvus is a high-performance open-source vector database built for AI applications. It supports storage, indexing, and similarity search of large-scale vector data, ideal for RAG, recommendation systems, and more.
Open-source AI engine to run any model — LLMs, vision, voice, image, video — on any hardware, no GPU required. Provides an OpenAI-compatible API for fully local, privacy-first AI inference.
Extract and convert data from any document (PDFs, images, Word, PPT, URLs) into multiple formats including Markdown, JSON, and CSV.
Official Neo4j GraphRAG Python SDK providing an integrated toolkit for knowledge graph construction, vector retrieval, and graph querying, supporting agent-driven graph retrieval-augmented generation workflows.
AI agents with graph-based reasoning memory by Neo4j. Scaffold graph databases in seconds to give agents knowledge-graph-driven memory and reasoning capabilities.
QAnything is an open-source local knowledge base Q&A system by NetEase Youdao, supporting any file format with offline RAG capabilities for building private knowledge Q&A.
All-in-one AI framework for semantic search, LLM orchestration, and language model workflows, with agent support, RAG, and vector database capabilities.
A complete LangGraph-based example of multi-agent RAG, showing agents collaborating on retrieval, routing, reasoning, and answer generation.
A fully local search aggregator powered by LLM agents. Users can ask questions and the system uses a chain of LLMs to find answers, with no external API keys required.
A toolkit for making AI agents and workflows measurably reliable, with epistemic measurement, Noetic RAG, sentinel gating, and grounded calibration.
Open source AI platform with enterprise-grade AI chat, advanced RAG and AI search capabilities that works with every LLM.
Transforms complex documents like PDFs into LLM-ready markdown/JSON for Agentic workflows, supporting layout analysis, formula recognition, and table extraction.
A complete search engine and RAG pipeline in your browser, server, or edge network. Supports full-text, vector, and hybrid search in under 2 KB. Perfect for building AI-powered search experiences anywhere.
In-depth tutorials on LLMs, RAGs and real-world AI agent applications. Rich notebook examples for learning AI engineering practices.
Ready-to-run cloud templates for RAG, AI pipelines and enterprise search with live data, always in sync with Sharepoint, Google Drive, S3, Kafka and more.
Pathway is a Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG applications.
AI-powered PDF scientific paper translation with preserved formats, supporting Google/DeepL/Ollama/OpenAI services via CLI/GUI/MCP/Docker/Zotero.
Open-source vector similarity search extension for PostgreSQL, enabling native vector storage and ANN retrieval in relational databases, a foundational component for building agent memory and RAG systems.
chromem-go is an embeddable vector database for Go with a Chroma-like interface and zero third-party dependencies. It supports in-memory storage with optional persistence, ideal for lightweight RAG applications.
Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone. Provides out-of-box RAG solution with support for knowledge base building, semantic search, and context management.
High-performance Python library for data extraction, analysis, conversion and manipulation of PDF and other document formats.
Qdrant is a high-performance vector database widely used as the retrieval layer for RAG and agent memory search scenarios.
An intelligent dialogue system based on RAG (Retrieval-Augmented Generation) technology with a complete web UI for document upload, knowledge base management, and smart Q&A.
The easiest way to use Agentic RAG in any enterprise. Provides out-of-the-box retrieval-augmented generation capabilities with Docker-based deployment for simplified enterprise RAG application building and management.
A private and local AI personal knowledge management app. All data and processing stay on-device with built-in RAG, semantic search, and knowledge graph features for managing personal knowledge bases with full privacy.
LlamaIndex is a data framework that provides the data connection layer for LLM applications, with strong RAG capabilities across diverse data sources and vector databases.
AI-driven public opinion and trend monitor with multi-platform aggregation, RSS subscriptions, smart keyword filtering, AI-powered news analysis and briefings, supporting MCP integration and push notifications via WeChat, Feishu, DingTalk, Telegram and more.
A cross-platform, ultra-efficient SQLite extension that brings vector search capabilities to embedded databases, ideal for local-first RAG applications and agent memory storage.
Superlinked Inference Engine is an open-source inference server and production cluster for embeddings, reranking, and extraction, providing high-performance data processing pipelines for RAG systems.
A local-first LLM wiki and knowledge-graph builder that can serve as a RAG knowledge base, agent memory store, and AI second brain.
GPT Researcher is an autonomous research agent that can gather, organize, and analyze information to produce detailed research reports.
Tencent's open-source LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG.
A semantic knowledge platform for human-AI collaboration that can serve as a wiki, knowledge base, context graph, semantic layer, or agentic memory.
Open-source LLM toolkit for building trustworthy LLM applications with TigerArmor (AI safety), TigerRAG (embedding and RAG), and TigerTune (fine-tuning) modules.
ColiVara is a suite of services for storing, searching, and retrieving documents based on visual embeddings. It uses vision models instead of chunking and text-processing, achieving state-of-the-art retrieval on both text and visual documents without OCR.
A knowledge engine for AI agent memory that builds knowledge graphs and memory layers in 6 lines of code, supporting graph databases, vector stores, and more for knowledge extraction and retrieval.
Cognita is a modular RAG framework for production environments by TrueFoundry, supporting flexible document parsing, vector storage, and retrieval pipeline orchestration for scalable knowledge QA systems.
TruLens is an open-source tool for evaluating and tracking LLM apps. It provides specialized evaluation for RAG applications including context relevance, groundedness, and answer relevance.
A graph-native context development platform for storing, enriching, and retrieving structured knowledge with semantic search and portable context cores, supporting RDF, SPARQL, and other standards for AI agent knowledge management.
An agentic LLM-powered data processing and ETL system. Enables complex data transformations using natural language-defined pipelines, turning unstructured data into structured, analyzable outputs with LLM intelligence.
Unstructured provides document parsing and cleaning capabilities, commonly used in RAG ingestion and preprocessing pipelines.
USearch is a fast open-source search and clustering engine for vectors and arbitrary objects, with bindings in C++, Python, JavaScript, Rust, Java, Swift, C#, Go, and Wolfram for large-scale vector retrieval.
Context7 is Upstash's context-engineering toolkit for agents, helping applications manage long context windows, retrieval injection, and history compression.
Chat with your SQL database using natural language. Accurate Text-to-SQL Generation via LLMs using Agentic RAG.
Vald is a highly scalable distributed vector search engine built on cloud-native architecture, designed for high-performance approximate nearest neighbor search across massive vector datasets.
A high-throughput and memory-efficient inference and serving engine for LLMs, featuring PagedAttention, continuous batching, and optimized KV cache management for production deployments.
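PagedAttention's core idea is to allocate the KV cache in fixed-size blocks from a shared pool, instead of reserving one contiguous max-length buffer per sequence. A toy allocator sketch of that bookkeeping (illustrative only; not vLLM's implementation):

```python
# Toy paged KV-cache allocator in the spirit of PagedAttention: each
# sequence takes fixed-size blocks on demand from a shared free list,
# so memory is committed per block instead of per max-length buffer.

BLOCK_SIZE = 4

class BlockAllocator:
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))
        self.tables = {}  # seq_id -> list of physical block ids

    def append_token(self, seq_id, pos):
        table = self.tables.setdefault(seq_id, [])
        if pos % BLOCK_SIZE == 0:        # first slot of a new block
            table.append(self.free.pop())
        block = table[pos // BLOCK_SIZE]
        return block, pos % BLOCK_SIZE   # physical (block, offset)

    def release(self, seq_id):
        # Finished sequence: all its blocks go back to the shared pool.
        self.free.extend(self.tables.pop(seq_id))

alloc = BlockAllocator(num_blocks=8)
slots = [alloc.append_token("seq0", p) for p in range(6)]
print(len(alloc.tables["seq0"]))  # 2 blocks hold 6 tokens (block size 4)
alloc.release("seq0")
print(len(alloc.free))            # 8: every block returned to the pool
```

Because blocks are freed and reused immediately, fragmentation stays near zero, which is what lets continuous batching pack many concurrent sequences into one GPU's cache.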
An open-source RAG chatbot powered by Weaviate vector database, supporting multiple data import methods, LLM backends, and embedding models for out-of-the-box retrieval-augmented generation.
Weaviate is an open-source vector database that stores objects and vectors, allowing for combining vector search with structured filtering. It has built-in vectorization modules and supports multimodal data search.
A general-purpose Java agent built with Spring Boot, Spring AI, RAG, tool calling, and MCP, supporting multi-turn dialogue and persistent memory.
A multi-tenant agent harness platform integrating LightRAG knowledge base and knowledge graphs, built with LangChain, Vue, and FastAPI, supporting DeepAgents, MinerU PDF parsing, Neo4j graph database, and MCP protocol.
RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device. Published at MLSys 2026.
A minimalistic AI-powered search engine that helps you find information on the internet and cites it too. Powered by Vercel AI SDK.
VectorDBBench is a benchmarking tool for vector databases, providing standardized performance testing and comparative analysis for popular vector databases including Milvus, Qdrant, Chroma, Weaviate, and more.
A deep dive into the four-layer agent memory architecture, with practical code for vector retrieval and memory compression to help you build scalable long-term memory systems.
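The two mechanisms such an architecture combines can be sketched in a few lines: cosine-similarity recall over stored memories, plus compressing the oldest entries into a single summary record. All names and the string-concatenation "summarizer" below are hypothetical stand-ins, not the article's code:

```python
import math

# Toy long-term memory: store (text, embedding) entries, recall the
# top-k by cosine similarity, and "compress" the oldest entries into a
# single summary record once the store grows past a limit. A real system
# would summarize with an LLM; string concatenation stands in here.

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

class Memory:
    def __init__(self, limit=4):
        self.entries = []  # (text, embedding), oldest first
        self.limit = limit

    def add(self, text, emb):
        self.entries.append((text, emb))
        if len(self.entries) > self.limit:
            old = self.entries[:2]
            summary = "summary(" + "; ".join(t for t, _ in old) + ")"
            mean = [sum(v) / 2 for v in zip(*(e for _, e in old))]
            self.entries = [(summary, mean)] + self.entries[2:]

    def recall(self, query_emb, k=1):
        ranked = sorted(self.entries, key=lambda e: -cosine(query_emb, e[1]))
        return [t for t, _ in ranked[:k]]

mem = Memory(limit=3)
mem.add("likes tea",   [1.0, 0.0])
mem.add("lives in NY", [0.0, 1.0])
mem.add("owns a cat",  [0.9, 0.2])
mem.add("plays go",    [0.2, 0.9])   # triggers compression of the 2 oldest
print(len(mem.entries))              # 3: summary + two recent entries
print(mem.recall([1.0, 0.1]))        # ['owns a cat']
```

The compression step is what keeps the store bounded while older facts remain findable, at coarser granularity, through the summary's averaged embedding.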
Learn how to build stateful AI agents with long-term memory using Letta (formerly MemGPT), solving the LLM context window limitation.
Production-focused best practices for index design, filtering, reranking, and evaluation when building RAG retrieval layers with Qdrant.
Most RAG pipelines fail at retrieval, not generation. This article covers five chunking strategies, hybrid search, reranking pipelines, and a production-ready decision framework.
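The hybrid-search step this article describes usually fuses a lexical ranking (e.g. BM25) with a vector ranking; reciprocal rank fusion (RRF) is the standard trick. A minimal sketch:

```python
# Reciprocal Rank Fusion: combine two ranked lists (e.g. BM25 hits and
# vector-search hits) by scoring each doc as the sum of 1 / (k + rank).
# The conventional smoothing constant k = 60 keeps any one list from
# dominating the fused ranking.

def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits   = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

print(rrf([bm25_hits, vector_hits]))
# doc_b and doc_a, ranked well in both lists, outrank doc_c and doc_d
```

RRF needs only ranks, not comparable scores, which is why it is the default way to merge lexical and vector retrievers whose raw scores live on different scales.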
Learn how to evaluate RAG systems using Ragas and DeepEval, including measuring key metrics like faithfulness, answer relevance, and context precision.
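These frameworks score metrics like faithfulness with LLM judges over extracted claims. Purely to make the metric's shape concrete (supported claims divided by total claims), here is a crude lexical stand-in, which is not how Ragas or DeepEval compute it:

```python
# Crude lexical stand-in for a faithfulness-style check: what fraction
# of the answer's content words also appear in the retrieved context?
# Real evaluators (Ragas, DeepEval) judge each extracted claim with an
# LLM; this only illustrates the supported / total ratio.

STOPWORDS = {"the", "a", "is", "in", "of", "and", "to"}

def content_words(text):
    words = (w.strip(".,") for w in text.lower().split())
    return {w for w in words if w and w not in STOPWORDS}

def lexical_faithfulness(answer, context):
    claims = content_words(answer)
    support = content_words(context)
    return len(claims & support) / len(claims)

context = "Paris is the capital of France and sits on the Seine."
good = "Paris is the capital of France."
bad  = "Paris is the capital of Spain."

print(lexical_faithfulness(good, context))  # 1.0: fully supported
print(lexical_faithfulness(bad, context))   # the "Spain" claim lowers it
```

The LLM-judged versions follow the same ratio but decide support semantically per claim, so they catch paraphrased hallucinations this lexical toy would miss.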
An in-depth explanation of Retrieval-Augmented Generation and how to build private knowledge bases for AI agents to improve accuracy and reliability.