AgentList

Parsr

Active
GitHub JavaScript Apache-2.0

Description

Transforms PDFs, documents, and images into enriched, structured data, with table recognition, reading-order restoration, and Markdown output.

Tags

javascript rag tools data-processing automation

Categories

📚 RAG Tools

Project Metrics

Stars 6.2k
Forks 324
Watchers 80
Issues 72
Created August 5, 2019
Last commit April 18, 2026

Deployment

Local

Related Projects

Crawlee

22.8k · TypeScript
Active

A web scraping and browser automation library for Node.js to build reliable crawlers, supporting Puppeteer, Playwright, Cheerio, and raw HTTP. Extract data for AI, LLMs, RAG, or GPTs with proxy rotation and both headful and headless modes.

typescript javascript data-processing +3

PDFMathTranslate

33.2k · Python
Active

AI-powered PDF scientific paper translation with preserved formats, supporting Google/DeepL/Ollama/OpenAI services via CLI/GUI/MCP/Docker/Zotero.

rag python tools +2

Unstract

6.5k · Python
Active

LLM-driven extraction of unstructured data, built for API deployments and ETL pipeline workflows. Automates document parsing, PDF extraction, and data processing.

data-processing rag python +3

Zerox

12.2k · TypeScript
Active

OCR and document extraction tool using vision models, efficiently converting PDFs and images into structured text.

typescript rag tools +2

Curated directory of open-source AI agent projects


© 2026 AgentList. All rights reserved.
