AgentList

Parsr

Active
GitHub · JavaScript · Apache-2.0

Overview

Transforms PDFs, documents, and images into enriched structured data, with table recognition, reading-order restoration, and Markdown output.

Tags

javascript rag tools data-processing automation

Category

📚 RAG Tools

Project Metrics

Stars 6.2k
Forks 324
Watchers 80
Issues 72
Created August 5, 2019
Last commit April 18, 2026

Deployment

Local deployment

Related Projects

Crawlee

22.8k · TypeScript
Active

A web scraping and browser automation library for Node.js to build reliable crawlers, supporting Puppeteer, Playwright, Cheerio, and raw HTTP. Extract data for AI, LLMs, RAG, or GPTs with proxy rotation and both headful and headless modes.

typescript javascript data-processing +3

PDFMathTranslate

33.2k · Python
Active

AI-powered translation of scientific PDF papers with formatting preserved, supporting Google/DeepL/Ollama/OpenAI services via CLI/GUI/MCP/Docker/Zotero.

rag python tools +2

Unstract

6.5k · Python
Active

LLM-driven extraction of unstructured data, built for API deployments and ETL pipeline workflows. Automates document parsing, PDF extraction, and data processing.

data-processing rag python +3

Zerox

12.2k · TypeScript
Active

OCR and document extraction tool using vision models, efficiently converting PDFs and images into structured text.

typescript rag tools +2
AgentList: a directory of open-source bot and agent projects


© 2026 AgentList. All rights reserved.
