AgentList

PyMuPDF

Active
GitHub Python AGPL-3.0

Description

High-performance Python library for data extraction, analysis, conversion and manipulation of PDF and other document formats.

Tags

python rag tools data-processing api

Categories

📚 RAG Tools

Project Metrics

Stars 9.5k
Forks 713
Watchers 60
Issues 56
Created October 6, 2012
Last commit April 18, 2026

Deployment

Local

Related Projects

PDFMathTranslate

33.2k · Python
Active

AI-powered PDF scientific paper translation with preserved formats, supporting Google/DeepL/Ollama/OpenAI services via CLI/GUI/MCP/Docker/Zotero.

rag python tools +2

Docstrange

1.4k · Python
Active

Extract and convert data from any document (PDFs, images, Word, PPT, URLs) into multiple formats including Markdown, JSON, and CSV.

python rag tools +2

MinerU

60.4k · Python
Active

Transforms complex documents like PDFs into LLM-ready markdown/JSON for Agentic workflows, supporting layout analysis, formula recognition, and table extraction.

data-processing rag python +2

RAG Techniques

26.9k · Jupyter Notebook
Active

A comprehensive showcase of advanced Retrieval-Augmented Generation (RAG) techniques with detailed notebook tutorials and code examples, covering foundational to cutting-edge RAG implementations.

rag python prompt-engineering +1
Curated directory of open-source AI agent projects

© 2026 AgentList. All rights reserved.