AgentList

PyMuPDF

Active
GitHub Python AGPL-3.0

Description

High-performance Python library for data extraction, analysis, conversion and manipulation of PDF and other document formats.
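As a rough illustration of the "data extraction" use case, the sketch below pulls plain text from each page of a document. It assumes PyMuPDF is installed and importable as `pymupdf` (older releases use the module name `fitz`). The helper names `join_pages` and `extract_text` are hypothetical and are not part of the library.

```python
from typing import Iterable


def join_pages(pages: Iterable[str], separator: str = "\n\f\n") -> str:
    """Join per-page text, dropping pages that are empty after stripping."""
    return separator.join(p.strip() for p in pages if p.strip())


def extract_text(path: str) -> str:
    """Return the plain text of every page in the document at `path`."""
    # Imported lazily so join_pages stays usable without PyMuPDF installed.
    # Assumption: the package is importable as `pymupdf`; older releases
    # expose the same module as `fitz`.
    import pymupdf

    with pymupdf.open(path) as doc:
        # Page.get_text() with no arguments returns the page's plain text.
        return join_pages(page.get_text() for page in doc)
```

Calling `extract_text("report.pdf")` (the file name is a placeholder) would return the text of all non-empty pages, separated by form feeds.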

Tags

python rag tools data-processing api

Categories

📚 RAG Tools

Project Metrics

Stars 9.7k
Forks 720
Watchers 9.7k
Issues 58
Created October 6, 2012
Last commit May 11, 2026

Deployment

Local

Related Projects

SAG

1.1k · Python
Stale

SQL-Driven RAG Engine that automatically builds knowledge graphs during querying, combining SQL query capabilities with Retrieval-Augmented Generation for efficient knowledge retrieval.

python rag tools +2

Airweave

6.3k · Python
Active

Open-source context retrieval layer for AI agents that automatically extracts, indexes, and retrieves structured context from diverse data sources.

python rag agent +2

Modular RAG MCP Server

889 · Python
Normal

A modular RAG system built on an MCP Server architecture. It uses Skills to make the AI follow each step of the spec, so that the code is written entirely by AI.

python rag mcp +2

Docstrange

1.5k · Python
Stale

Extract and convert data from any document (PDFs, images, Word, PPT, URLs) into multiple formats including Markdown, JSON, and CSV.

python rag tools +2
The most comprehensive directory of open-source AI Agent projects. Discover and compare top Agent frameworks like LangChain, CrewAI, and more.

© 2026 AgentList. All rights reserved.