Docstrange
活跃简介
Extract and convert data from any document (PDFs, images, Word, PPT, URLs) into multiple formats including Markdown, JSON, and CSV.
Extract and convert data from any document (PDFs, images, Word, PPT, URLs) into multiple formats including Markdown, JSON, and CSV.
A Python library by Google for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization, designed for data annotation and knowledge extraction workflows.
AI-powered PDF scientific paper translation with preserved formats, supporting Google/DeepL/Ollama/OpenAI services via CLI/GUI/MCP/Docker/Zotero.
A web scraping and browser automation library for Node.js to build reliable crawlers, supporting Puppeteer, Playwright, Cheerio, and raw HTTP. Extract data for AI, LLMs, RAG, or GPTs with proxy rotation and both headful and headless modes.
集成语义搜索、LLM 编排和语言模型工作流的全能 AI 框架,支持 Agent、RAG 和向量数据库