Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Install this skill with the CLI and start using the SKILL.md workflow in your workspace:
npx skills add https://github.com/paddlepaddle/paddleocr --skill paddleocr-doc-parsing
PaddleOCR converts PDF documents and images into structured, LLM-ready data (JSON/Markdown) with industry-leading accuracy. With 70k+ Stars and trusted by top-tier projects like Dify, RAGFlow, and Cherry Studio, PaddleOCR is the bedrock for building intelligent RAG and Agentic applications.
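As a rough illustration of the PDF/image to JSON/Markdown flow described above, here is a minimal sketch. It assumes the `paddleocr` 3.x Python package is installed and uses its `PPStructureV3` pipeline with `predict`, `save_to_json`, and `save_to_markdown`. The helper names `output_dir_for` and `parse_document`, and the `output` folder layout, are hypothetical and chosen only for this example.

```python
from pathlib import Path


def output_dir_for(input_path: str, root: str = "output") -> Path:
    """Hypothetical helper: pick one output folder per input document."""
    return Path(root) / Path(input_path).stem


def parse_document(input_path: str, root: str = "output") -> Path:
    """Parse one PDF or image and save the results as JSON and Markdown."""
    # Imported lazily so this module can be loaded without paddleocr installed.
    from paddleocr import PPStructureV3

    out_dir = output_dir_for(input_path, root)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Default configuration; model files may be fetched on first use.
    pipeline = PPStructureV3()
    for res in pipeline.predict(input=input_path):
        res.save_to_json(save_path=str(out_dir))      # structured result
        res.save_to_markdown(save_path=str(out_dir))  # LLM-ready Markdown
    return out_dir
```

Under these assumptions, calling `parse_document("invoice.pdf")` would write its results under `output/invoice`. The saved Markdown can then be passed to an LLM or indexed for RAG.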
Transforming messy visuals into structured data for the LLM era.
The global gold standard for high-speed, multilingual text spotting.
PaddleOCR-VL series, PP-StructureV3, and PP-DocTranslation now support exporting parsed results to DOCX for convenient viewing and editing in Microsoft Word.
PaddleOCR.js, the official browser inference SDK that supports running PP-OCRv5 directly in the browser.
Released PaddleOCR-VL:
Model Introduction:
Core Features:
Released PP-OCRv5 Multilingual Recognition Model:
Significant Model Additions:
Deployment Capability Upgrades:
Benchmark Support:
Bug Fixes:
use_chart_parsing) in the PP-StructureV3 configuration files compared to other pipelines.
Other Enhancements:
The PaddleOCR official website provides an interactive Experience Center and APIs. No setup is required: you can try them with one click.
For local usage, please refer to the following documentation based on your needs:
⭐ Star this repository to keep up with exciting updates and new releases, including powerful OCR and document parsing capabilities! ⭐
| PaddlePaddle WeChat official account | Join the tech discussion group |
|---|---|
PaddleOCR wouldn't be where it is today without its incredible community! 💗 A massive thank you to all our longtime partners, new collaborators, and everyone who's poured their passion into PaddleOCR — whether we've named you or not. Your support fuels our fire!
| Project Name | Description |
|---|---|
| Dify | Production-ready platform for agentic workflow development. |
| RAGFlow | RAG engine based on deep document understanding. |
| pathway | Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG. |
| MinerU | Multi-type Document to Markdown Conversion Tool |
| Umi-OCR | Free, Open-source, Batch Offline OCR Software. |
| cherry-studio | A desktop client that supports multiple LLM providers. |
| haystack | AI orchestration framework to build customizable, production-ready LLM applications. |
| OmniParser | OmniParser: Screen Parsing tool for Pure Vision Based GUI Agent. |
| QAnything | Question and Answer based on Anything. |
| Learn more projects | More projects based on PaddleOCR |
This project is released under the Apache 2.0 license.
@misc{cui2025paddleocr30technicalreport,
title={PaddleOCR 3.0 Technical Report},
author={Cheng Cui and Ting Sun and Manhui Lin and Tingquan Gao and Yubo Zhang and Jiaxuan Liu and Xueqing Wang and Zelun Zhang and Changda Zhou and Hongen Liu and Yue Zhang and Wenyu Lv and Kui Huang and Yichao Zhang and Jing Zhang and Jun Zhang and Yi Liu and Dianhai Yu and Yanjun Ma},
year={2025},
eprint={2507.05595},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2507.05595},
}
@misc{cui2025paddleocrvlboostingmultilingualdocument,
title={PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model},
author={Cheng Cui and Ting Sun and Suyin Liang and Tingquan Gao and Zelun Zhang and Jiaxuan Liu and Xueqing Wang and Changda Zhou and Hongen Liu and Manhui Lin and Yue Zhang and Yubo Zhang and Handong Zheng and Jing Zhang and Jun Zhang and Yi Liu and Dianhai Yu and Yanjun Ma},
year={2025},
eprint={2510.14528},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2510.14528},
}
@misc{cui2026paddleocrvl15multitask09bvlm,
title={PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing},
author={Cheng Cui and Ting Sun and Suyin Liang and Tingquan Gao and Zelun Zhang and Jiaxuan Liu and Xueqing Wang and Changda Zhou and Hongen Liu and Manhui Lin and Yue Zhang and Yubo Zhang and Yi Liu and Dianhai Yu and Yanjun Ma},
year={2026},
eprint={2601.21957},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2601.21957},
}