MinerU
Transforms complex documents such as PDFs into LLM-ready Markdown/JSON for agentic workflows, supporting layout analysis, formula recognition, and table extraction.
AI-powered translation of scientific PDF papers that preserves the original formatting, supporting Google/DeepL/Ollama/OpenAI services via CLI/GUI/MCP/Docker/Zotero.
Opinionated RAG framework for integrating GenAI into your apps. Works with any LLM, any vectorstore, any files — so you can focus on your product instead of building RAG pipelines.
LLM-driven extraction of unstructured data, built for API deployments and ETL pipeline workflows. Automates document parsing, PDF extraction, and data processing.
A universal local knowledge base solution based on vector databases and GPT, providing one-stop document processing with vectorization, semantic search, and intelligent Q&A for building private knowledge bases.