PaddleOCR agent skills for text recognition and document parsing.
npx skills add https://github.com/aidenwu0209/paddleocr-skills --skill paddleocr-text-recognition
Install this skill with the CLI and start using the SKILL.md workflow in your workspace.
This directory contains official PaddleOCR Agent Skills. They integrate with AI apps such as Claude Code for OCR text extraction from images/PDFs and layout-aware document parsing.
- paddleocr-text-recognition: extract text from images/PDFs.
- paddleocr-doc-parsing: document parsing that converts images/PDFs to Markdown.

The skills require an API_URL and a Token. They correspond to the API URL and access token used for authentication.

Supported models per skill:

- paddleocr-text-recognition: PP-OCRv5
- paddleocr-doc-parsing: PP-StructureV3, PaddleOCR-VL, PaddleOCR-VL-1.5

The instructions below cover both skills. Install and configure only the skill(s) you need.
skills CLI
The skills CLI installs skills globally on the device so all AI apps can use them. Node.js is required.
npx skills add PaddlePaddle/PaddleOCR -g --skill paddleocr-text-recognition -y
npx skills add PaddlePaddle/PaddleOCR -g --skill paddleocr-doc-parsing -y
This repository is relatively large. On slower networks, npx skills add may time out. If that happens, clone the repository locally first, then install from the local path:
git clone https://github.com/PaddlePaddle/PaddleOCR.git
npx skills add ./PaddleOCR/skills/paddleocr-text-recognition
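The fallback described above can be sketched in Python. This is a minimal illustration, not part of the skills themselves: the function name `install_skill`, the 300-second timeout, and the choice to fall back only on a subprocess timeout are assumptions. The commands are the ones shown in this section.

```python
import subprocess


def install_skill(skill, timeout=300, run=subprocess.run):
    """Try the remote install; if it times out, clone and install locally.

    `timeout` (seconds) is an assumed value, not one documented by the CLI.
    `run` is injectable so the flow can be exercised without network access.
    """
    remote = ["npx", "skills", "add", "PaddlePaddle/PaddleOCR",
              "-g", "--skill", skill, "-y"]
    try:
        run(remote, check=True, timeout=timeout)
        return "remote"
    except subprocess.TimeoutExpired:
        # Fallback from the text above: clone first, then install from the path.
        run(["git", "clone", "https://github.com/PaddlePaddle/PaddleOCR.git"],
            check=True)
        run(["npx", "skills", "add", f"./PaddleOCR/skills/{skill}"], check=True)
        return "local"
```

Calling `install_skill("paddleocr-doc-parsing")` returns `"remote"` or `"local"` depending on which path succeeded. Other failures, such as a non-zero exit code, are raised to the caller rather than triggering the fallback.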
clawhub (OpenClaw)
clawhub install paddleocr-text-recognition
clawhub install paddleocr-doc-parsing
See the OpenClaw Skills documentation for details.
If the above options are not available, you can clone the repository and manually copy the skill directories to the location required by your AI app (Git required):
git clone https://github.com/PaddlePaddle/PaddleOCR.git
After cloning, skill source code is located under PaddleOCR/skills. Refer to the documentation for your AI app to complete installation.
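The manual copy step might look like the following sketch. The helper `copy_skills` and its `target_dir` argument are hypothetical; the actual destination depends on your AI app and is not specified here.

```python
import shutil
from pathlib import Path

# Skill directory names as listed in this README.
SKILLS = ("paddleocr-text-recognition", "paddleocr-doc-parsing")


def copy_skills(repo_root, target_dir, skills=SKILLS):
    """Copy skill directories from <repo_root>/skills into target_dir.

    `target_dir` is whatever location your AI app expects (an assumption
    left to the caller). Existing files in the target are overwritten.
    """
    source_root = Path(repo_root) / "skills"
    target_root = Path(target_dir)
    copied = []
    for name in skills:
        destination = target_root / name
        # dirs_exist_ok requires Python 3.8 or later.
        shutil.copytree(source_root / name, destination, dirs_exist_ok=True)
        copied.append(destination)
    return copied
```

For example, `copy_skills("PaddleOCR", "/path/to/app/skills", skills=["paddleocr-doc-parsing"])` copies only the document parsing skill.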
After installation, configure the required environment variables so the skills can work properly. Each skill requires the following:
| Skill | Required | Optional |
|---|---|---|
| paddleocr-text-recognition | PADDLEOCR_OCR_API_URL (API URL), PADDLEOCR_ACCESS_TOKEN (access token) | PADDLEOCR_OCR_TIMEOUT (API request timeout) |
| paddleocr-doc-parsing | PADDLEOCR_DOC_PARSING_API_URL (API URL), PADDLEOCR_ACCESS_TOKEN (access token) | PADDLEOCR_DOC_PARSING_TIMEOUT (API request timeout) |
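To check quickly whether the required variables are set, a small helper along these lines may be useful. The variable names come from the table above; the helper `missing_vars` is illustrative and not shipped with the skills.

```python
import os

# Required variables per skill, copied from the table above.
REQUIRED_VARS = {
    "paddleocr-text-recognition": [
        "PADDLEOCR_OCR_API_URL",
        "PADDLEOCR_ACCESS_TOKEN",
    ],
    "paddleocr-doc-parsing": [
        "PADDLEOCR_DOC_PARSING_API_URL",
        "PADDLEOCR_ACCESS_TOKEN",
    ],
}


def missing_vars(skill, env=None):
    """Return the required variables for `skill` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS[skill] if not env.get(name)]
```

An empty list from `missing_vars("paddleocr-text-recognition")` means the required variables for that skill are present in the current environment.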
Below are configuration methods for some AI apps:
Claude Code: add an env field to .claude/settings.local.json in your project:
{
  "env": {
    "PADDLEOCR_ACCESS_TOKEN": "<ACCESS_TOKEN>",
    "PADDLEOCR_OCR_API_URL": "<OCR_API_URL>",
    "PADDLEOCR_DOC_PARSING_API_URL": "<DOC_PARSING_API_URL>"
  }
}
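If the settings file already contains other keys, editing it by hand risks overwriting them. The sketch below merges new variables into the existing `env` object instead. The helper name `merge_env` is hypothetical, and it assumes the file, if present, holds a JSON object whose `env` value is also an object.

```python
import json
from pathlib import Path


def merge_env(settings_path, new_env):
    """Merge `new_env` into the "env" object of a JSON settings file.

    Creates the file (and parent directories) when it does not exist.
    Keys outside "env" are left untouched.
    """
    path = Path(settings_path)
    settings = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    settings.setdefault("env", {}).update(new_env)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

For example, `merge_env(".claude/settings.local.json", {"PADDLEOCR_ACCESS_TOKEN": "<ACCESS_TOKEN>"})` adds the token while keeping any variables that were already configured.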
OpenClaw: add skill configuration to ~/.openclaw/openclaw.json:
{
  "skills": {
    "entries": {
      "paddleocr-text-recognition": {
        "enabled": true,
        "env": {
          "PADDLEOCR_OCR_API_URL": "<OCR_API_URL>",
          "PADDLEOCR_ACCESS_TOKEN": "<ACCESS_TOKEN>"
        }
      },
      "paddleocr-doc-parsing": {
        "enabled": true,
        "env": {
          "PADDLEOCR_DOC_PARSING_API_URL": "<DOC_PARSING_API_URL>",
          "PADDLEOCR_ACCESS_TOKEN": "<ACCESS_TOKEN>"
        }
      }
    }
  }
}
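Since only the skills you need have to be configured, the OpenClaw structure above can also be generated with entries for just those skills. The following is a sketch under that assumption. The function `openclaw_skills_config` is hypothetical; it mirrors the JSON layout shown above and does not write any file.

```python
import json


def openclaw_skills_config(access_token, ocr_api_url=None, doc_parsing_api_url=None):
    """Build the "skills" section shown above, including only the skills
    for which an API URL is supplied."""
    entries = {}
    if ocr_api_url:
        entries["paddleocr-text-recognition"] = {
            "enabled": True,
            "env": {
                "PADDLEOCR_OCR_API_URL": ocr_api_url,
                "PADDLEOCR_ACCESS_TOKEN": access_token,
            },
        }
    if doc_parsing_api_url:
        entries["paddleocr-doc-parsing"] = {
            "enabled": True,
            "env": {
                "PADDLEOCR_DOC_PARSING_API_URL": doc_parsing_api_url,
                "PADDLEOCR_ACCESS_TOKEN": access_token,
            },
        }
    return {"skills": {"entries": entries}}


def to_json(config):
    """Serialize the configuration for pasting into openclaw.json."""
    return json.dumps(config, indent=2)
```

Calling `to_json(openclaw_skills_config("<ACCESS_TOKEN>", ocr_api_url="<OCR_API_URL>"))` yields a configuration containing only the text recognition entry.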
After configuration, describe the OCR or document parsing task in natural language and provide a file URL or local path so the AI app can invoke the corresponding skill.
paddleocr-text-recognition
URL example:
Extract all text from this file: https://example.com/invoice.jpg
Local file example:
Extract all text from local file C:\docs\invoice.pdf
paddleocr-doc-parsing
URL example:
Parse this PDF and return the main body plus all tables: https://example.com/report.pdf
Local file example:
Parse local file C:\docs\report.pdf and return complete structured output.
This section describes how to run smoke tests locally to verify that the skills work correctly.
The examples below cover both skills. Run only the commands for the skill(s) you need.
Make sure your working directory is the directory containing this file.
Install dependencies.
python -m pip install -r paddleocr-text-recognition/scripts/requirements.txt
python -m pip install -r paddleocr-doc-parsing/scripts/requirements.txt
# Optional: required only when using document file optimization
python -m pip install -r paddleocr-doc-parsing/scripts/requirements-optimize.txt
Configure environment variables (see Configure Environment Variables for the list of variables).
export PADDLEOCR_OCR_API_URL="<OCR_API_URL>"
export PADDLEOCR_ACCESS_TOKEN="<ACCESS_TOKEN>"
export PADDLEOCR_DOC_PARSING_API_URL="<DOC_PARSING_API_URL>"
Run the smoke test scripts.
python paddleocr-text-recognition/scripts/smoke_test.py
python paddleocr-doc-parsing/scripts/smoke_test.py
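To run only the smoke tests for the skills you have configured, a small wrapper could select scripts based on which variables are set. This is an illustrative sketch: the helper names are hypothetical, the script paths are the ones shown above, and it assumes the scripts report failure through a non-zero exit code, which this README does not confirm.

```python
import os
import subprocess
import sys

# API URL variable -> smoke test script, using the paths shown above.
SMOKE_TESTS = {
    "PADDLEOCR_OCR_API_URL": "paddleocr-text-recognition/scripts/smoke_test.py",
    "PADDLEOCR_DOC_PARSING_API_URL": "paddleocr-doc-parsing/scripts/smoke_test.py",
}


def select_smoke_tests(env=None):
    """Return the smoke test scripts whose API URL variable is set.

    Nothing is selected without PADDLEOCR_ACCESS_TOKEN, since both
    skills require it.
    """
    env = os.environ if env is None else env
    if not env.get("PADDLEOCR_ACCESS_TOKEN"):
        return []
    return [script for var, script in SMOKE_TESTS.items() if env.get(var)]


def run_smoke_tests():
    """Run each selected script with the current interpreter and return
    a mapping of script path to exit code."""
    return {
        script: subprocess.run([sys.executable, script]).returncode
        for script in select_smoke_tests()
    }
```

Run `run_smoke_tests()` from the directory containing this file so that the relative script paths resolve.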