LlamaIndex

Active
GitHub Python MIT

Description

LlamaIndex is a data framework for connecting LLM applications to external data. It supports retrieval-augmented generation (RAG) over diverse data sources and vector databases.

Key Features

  • Data connectors — 300+ integration packages connecting diverse data sources (files, databases, APIs, web, etc.)
  • Vector indexing and query engine — Supports multiple vector databases with semantic search and hybrid retrieval
  • Agent workflow orchestration — Build complex multi-step AI agent flows with Workflows
  • LlamaParse document parsing — Agentic OCR and document parsing supporting 130+ formats
  • Structured data extraction — Extract structured information from unstructured documents
  • Modular architecture — Core and integration packages separated, install only what you need

Use Cases

💡 Building RAG-based Q&A systems over private document collections
💡 Creating multi-modal document analysis and information extraction pipelines
💡 Developing autonomous AI agents with tool-calling capabilities
💡 Building enterprise-grade data indexing and semantic search platforms

Quick Start

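# Install LlamaIndex core and the OpenAI integration (shell command, run it outside Python):
#   pip install llama-index llama-index-llms-openai
# The default LLM and embedding model are OpenAI's, so set OPENAI_API_KEY in your environment first.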

# Import modules
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load documents from a directory
documents = SimpleDirectoryReader('./data').load_data()

# Build vector index
index = VectorStoreIndex.from_documents(documents)

# Create query engine and ask questions
query_engine = index.as_query_engine()
response = query_engine.query("What are the key points mentioned in the documents?")
print(response)
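
To make the retrieval step behind the quick start more concrete, here is a minimal conceptual sketch of what a vector index does: embed each document, embed the query, and return the closest matches. It is an illustration only, not LlamaIndex's implementation. The names `embed`, `cosine`, and `ToyVectorIndex` are hypothetical, and the bag-of-words "embedding" is a stand-in for a real embedding model.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a real model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorIndex:
    """Hypothetical in-memory index: stores each document with its vector."""

    def __init__(self, documents):
        self.entries = [(doc, embed(doc)) for doc in documents]

    def retrieve(self, query, top_k=2):
        # Rank stored documents by similarity to the query vector.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [doc for doc, _ in ranked[:top_k]]


documents = [
    "LlamaIndex connects data sources to LLM applications",
    "Vector indexes support semantic search over documents",
    "Bananas are rich in potassium",
]
index = ToyVectorIndex(documents)
top = index.retrieve("semantic search over documents", top_k=1)
print(top)
```

In a real RAG pipeline the retrieved text would then be passed to an LLM as context for answering the question. This sketch stops at retrieval.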
