AgentList

Zerox

Active
GitHub TypeScript MIT

Description

An OCR and document extraction tool that uses vision models to convert PDFs and images into structured text.

Tags

typescript rag tools data-processing llm

Categories

📚 RAG Tools

Project Metrics

Stars 12.2k
Forks 840
Watchers 63
Issues 87
Created July 21, 2024
Last commit April 18, 2026

Deployment

Local

Related Projects

Crawlee

22.8k · TypeScript
Active

A web scraping and browser automation library for Node.js to build reliable crawlers, supporting Puppeteer, Playwright, Cheerio, and raw HTTP. Extract data for AI, LLMs, RAG, or GPTs with proxy rotation and both headful and headless modes.

typescript javascript data-processing +3

MinerU

60.4k · Python
Active

Transforms complex documents like PDFs into LLM-ready markdown/JSON for Agentic workflows, supporting layout analysis, formula recognition, and table extraction.

data-processing rag python +2

Vane

33.8k · TypeScript
Active

An AI-powered answering engine with multi-model integration, web search and local knowledge base, providing a Perplexity-like search experience.

rag typescript llm +2

PDFMathTranslate

33.2k · Python
Active

AI-powered PDF scientific paper translation with preserved formats, supporting Google/DeepL/Ollama/OpenAI services via CLI/GUI/MCP/Docker/Zotero.

rag python tools +2
AgentList

Curated directory of open-source AI agent projects

© 2026 AgentList. All rights reserved.
