Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · Dicklesworthstone/llm_aided_ocr

llm_aided_ocr

Enhances Tesseract OCR output from scanned PDFs using LLMs (local or via API) for error correction, smart chunking, and Markdown formatting.

71.0/100
2.9K · Forks: 206

Similar Projects

OpenLLM

89

Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.

Python · 12.1K

lmdeploy

85

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python · 7.7K

opencompass

86

OpenCompass is an LLM evaluation platform that supports a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) across 100+ datasets.

Python · 6.7K

text-extract-api

66

Document (PDF, Word, PPTX, ...) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents, remove PII, and convert any document or picture to structured JSON or Markdown.

Python · 3.0K