LanceDB

Active
GitHub HTML Apache-2.0

Description

An open-source embedded retrieval library for multimodal AI with zero server configuration, using the Lance columnar format for efficient vector search and filtering, ideal for agent memory and RAG applications.

Key Features

  • Millisecond vector search with state-of-the-art indexing for billions of vectors
  • Comprehensive search: vector similarity, full-text search, and SQL queries in one platform
  • Multimodal support — store and query text, images, videos, point clouds, and more
  • Zero-copy, automatic versioning — manage data versions without extra infrastructure
  • GPU-accelerated vector index building for dramatically faster large-scale data processing
  • Rich ecosystem: LangChain, LlamaIndex, Apache Arrow, Pandas, DuckDB integrations

Use Cases

💡 Build long-term memory stores for AI agents with efficient semantic retrieval and context recall
💡 Create RAG (Retrieval-Augmented Generation) applications for multimodal knowledge-base Q&A
💡 Build recommendation systems using vector similarity for personalized content matching
💡 Process large-scale multimodal datasets with cross-modal search and analysis capabilities
💡 Embed as a database in Python/TypeScript/Rust apps with zero server deployment

Quick Start

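  • Install: pip install lancedb
  • Connect: import lancedb; db = lancedb.connect('~/.lancedb')
  • Create a table: table = db.create_table('my_table', data), where data holds the rows to insert (for example, a list of dicts or a pandas DataFrame)
  • Search: results = table.search(query).limit(10).to_pandas(), where query is the query vector and the result is a pandas DataFrame of the 10 nearest rows
  • Full quickstart: docs.lancedb.com/quickstart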

Related Projects