⚠

Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.

Python · PaddlePaddle/PaddleOCR

PaddleOCR

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

96.0/100

★ 86.2K · Forks: 11.1K


Similar Projects

ParseBench

ParseBench - A Document Parsing Benchmark for AI Agents

Python · ★ 533

MinerU

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

Python · ★ 75.6K

unstract

LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows

Python · ★ 6.9K

EvoScientist

🔬 Harness Vibe Research with Self-evolving AI Scientists

Python · ★ 4.3K