Chonkie
Active / Overview
The lightweight ingestion library for fast, efficient and robust RAG pipelines. Supports multiple chunking strategies and embedding models to significantly improve retrieval-augmented generation results.
AI Data Runtime for Agents. Provides serverless Postgres with a multimodal datalake, enabling scalable retrieval and training. Unifies vector storage, dataset management, and streaming data loading for AI agent workflows.
ColiVara is a suite of services for storing, searching, and retrieving documents based on visual embeddings. It uses vision models instead of chunking and text processing, achieving state-of-the-art retrieval on both text and visual documents without OCR.
RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device. Published at MLSys 2026.
Easily use and train state-of-the-art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease of use, backed by research.