AgentList

Parsr

Normal
GitHub JavaScript Apache-2.0

Description

Transforms PDFs, documents, and images into enriched structured data, with table recognition, reading-order restoration, and Markdown output.

Tags

javascript rag tools data-processing automation

Categories

📚 RAG Tools

Project Metrics

Stars 6.2k
Forks 324
Watchers 6.2k
Issues 72
Created August 5, 2019
Last commit March 20, 2026

Deployment

Local

Related Projects

Crawlee

23.6k · TypeScript
Active

A web scraping and browser automation library for Node.js to build reliable crawlers, supporting Puppeteer, Playwright, Cheerio, and raw HTTP. Extract data for AI, LLMs, RAG, or GPTs with proxy rotation and both headful and headless modes.

typescript javascript data-processing +3

Unstract

6.6k · Python
Active

LLM-driven extraction of unstructured data, built for API deployments and ETL pipeline workflows. Automates document parsing, PDF extraction, and data processing.

data-processing rag python +3

SAG

1.1k · Python
Stale

SQL-Driven RAG Engine that automatically builds knowledge graphs during querying, combining SQL query capabilities with Retrieval-Augmented Generation for efficient knowledge retrieval.

python rag tools +2

Airweave

6.4k · Python
Active

Open-source context retrieval layer for AI agents that automatically extracts, indexes, and retrieves structured context from diverse data sources.

python rag agent +2
© 2026 AgentList. All rights reserved.
