LlamaIndex

Active
GitHub Python MIT

Description

A data framework for LLM applications that unifies RAG, agent, and workflow capabilities in one library.

Key Features

  • Data connectors: Ingest and parse unstructured data from 100+ sources (PDFs, databases, APIs, Notion, Slack)
  • RAG pipelines: End-to-end retrieval-augmented generation covering chunking, embedding, retrieval, reranking, and answer synthesis
  • Agent abstractions: FunctionCallingAgent and ReActAgent available out of the box, with tool calling and memory
  • Workflow orchestration: Event-driven workflow engine that supports multi-step flows, loops, concurrency, and error recovery
  • LlamaParse: Industrial-grade PDF/Excel parser that accurately extracts tables, charts, and layout
  • Observability: Built-in OpenTelemetry integration to trace every retrieval and generation step
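
The RAG stages listed above (chunking, embedding, retrieval, answer synthesis) can be illustrated with a small standard-library sketch. This is not how LlamaIndex implements them: the bag-of-words "embedding", the helper names (chunk, embed, cosine, retrieve, synthesize), and the sample documents are all assumptions made for illustration, reranking is omitted, and the synthesis step only builds the prompt an LLM would receive.

```python
import math
import re
from collections import Counter


def chunk(text, size=12):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def synthesize(query, context):
    """Stand-in for the LLM call: build the prompt a model would be given."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


# Hypothetical sample documents.
docs = [
    "Refunds are issued within 14 days of purchase.",
    "Shipping to Europe takes five business days.",
    "Support is available by email on weekdays.",
]
chunks = [piece for doc in docs for piece in chunk(doc)]
question = "How long does shipping take?"
top = retrieve(question, chunks, top_k=1)
prompt = synthesize(question, top)
print(prompt)
```

In a real pipeline the embed function would call an embedding model and the prompt would be sent to an LLM; the control flow between the stages is the part this sketch is meant to show.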

Use Cases

💡 Build enterprise knowledge base Q&A over PDFs/Word/Excel with unified retrieval and answer generation.
💡 Power customer service RAG agents that query orders, inventory, and other tools automatically.
💡 Build research assistants that ingest papers and web pages into a citable knowledge base.
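
As a rough picture of the customer service use case, the sketch below shows the tool-dispatch step of an agent in plain Python. It does not use LlamaIndex's agent classes: the tool functions, the in-memory order and inventory tables, and the call format are hypothetical, and the tool calls are hard-coded where a real agent would have the LLM choose them.

```python
def get_order_status(order_id):
    """Hypothetical tool: look up an order in an in-memory table."""
    orders = {"A100": "shipped", "A101": "processing"}
    return orders.get(order_id, "unknown")


def get_stock(sku):
    """Hypothetical tool: return the number of units in stock."""
    inventory = {"widget": 12, "gadget": 0}
    return inventory.get(sku, 0)


# Registry mapping tool names to callables.
TOOLS = {"get_order_status": get_order_status, "get_stock": get_stock}


def run_tool_call(call):
    """Dispatch a tool call of the form {"name": ..., "args": {...}}."""
    name = call["name"]
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return TOOLS[name](**call["args"])


# In a real agent the LLM would emit these calls; here they are hard-coded.
calls = [
    {"name": "get_order_status", "args": {"order_id": "A100"}},
    {"name": "get_stock", "args": {"sku": "gadget"}},
]
results = [run_tool_call(c) for c in calls]
print(results)
```

Returning an error string for an unknown tool, rather than raising, lets the result be fed back to the model so it can try a different call.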

Quick Start

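# Install (run in a shell)
pip install llama-index

# Simple RAG: load documents, build an index, and query it.
# The default LLM and embedding model are OpenAI's, so OPENAI_API_KEY must be set.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

docs = SimpleDirectoryReader('data').load_data()  # read every file under ./data
index = VectorStoreIndex.from_documents(docs)     # chunk, embed, and index the documents
query_engine = index.as_query_engine()            # retriever plus response synthesizer
print(query_engine.query('Summarize the key points of this document'))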