mineru

An AI Skill that transforms PDFs into clean Markdown using MinerU's VLM engine. Supports LaTeX formulas, tables, images, and batch async processing.

Installation
CLI
npx skills add https://github.com/nebutra/mineru-skill --skill mineru

Install this skill with the CLI and start using the SKILL.md workflow in your workspace.

Last updated 4/22/2026

MinerU Skill


An AI Skill that transforms PDF documents into clean Markdown using MinerU's VLM engine.

Chinese documentation | English


🤖 What is a Skill?

A Skill is an AI capability package that extends your AI assistant's abilities. When you ask the AI to do something, it automatically:

  1. Recognizes the task from your natural language
  2. Activates the appropriate skill
  3. Executes the task using the skill's tools
  4. Delivers results back to you

Example Conversation

You: Parse these postgraduate entrance exam math past-paper PDFs into my Obsidian

AI: 📚 Found 40 PDF files
    ⏳ Starting parallel parsing (5 workers)...
    ✅ 1993 Postgraduate Entrance Exam Mathematics (I) past paper → Markdown
    ✅ 1994 Postgraduate Entrance Exam Mathematics (I) past paper → Markdown
    ...
    ✅ Done! Saved to Obsidian/考研/数学一/

🚀 Install as Skill

npx skills add Nebutra/MinerU-Skill

Supported: OpenCode, Claude Code, Codex, Cursor, and 35+ more

OpenClaw

# Clone to your skills directory
git clone https://github.com/Nebutra/MinerU-Skill.git ~/openclaw-skills/mineru/

# Set API token
export MINERU_TOKEN="your-token-here"  # Get from https://mineru.net/user-center/api-token

ClawHub

# Install via clawhub CLI
clawhub install mineru

Claude Code / Cursor / Windsurf

# Clone to AI skills folder
git clone https://github.com/Nebutra/MinerU-Skill.git ~/.claude/skills/mineru/

💬 Usage Examples

Single File

Parse ./document.pdf into Markdown

Batch Directory

Parse all PDFs in the ./papers/ directory and write the output to ./output/

Direct to Obsidian

Parse these PDFs and save them directly to my Obsidian Vault

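Chinese Example

解析 1987-2025 年考研数学真题，保存到 Obsidian/考研/数学一/
用 10 个并发，跳过已处理的文件

(Parse the 1987-2025 postgraduate entrance exam math past papers and save them to Obsidian/考研/数学一/. Use 10 concurrent workers and skip files that have already been processed.)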

⚡ Features

Feature          Description
📄 PDF Input     Local files, URLs, batch directories
📝 Output        Markdown + JSON metadata + Images
🔢 LaTeX         Math formulas preserved
📊 Tables        Structure extraction
🖼️ Images        Auto-extracted to images/
⚡ Async         15x parallel uploads
🔄 Resume        Skip processed files
📁 Obsidian      Direct vault output

🛠️ CLI Reference

You can also run the script directly from the command line:

# Single file
python scripts/mineru_v2.py --file ./doc.pdf --output ./output/

# Batch with resume
python scripts/mineru_v2.py \
  --dir ./pdfs/ \
  --output ~/Obsidian/MyVault/ \
  --workers 10 \
  --resume

Option           Description
--dir PATH       Input directory
--file PATH      Single file
--output PATH    Output directory
--workers N      Concurrency (default: 5)
--resume         Skip processed files
--token TOKEN    API token
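
The options above map naturally onto an argparse parser. The following is a minimal sketch of how such a command line could be declared. It is not the source of scripts/mineru_v2.py: the build_parser name, the mutually exclusive --dir/--file group, and the fallback to the MINERU_TOKEN environment variable are assumptions made for illustration.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real script)."""
    parser = argparse.ArgumentParser(description="Convert PDFs to Markdown (sketch)")
    # Assumption: exactly one of --dir / --file must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--dir", metavar="PATH", help="Input directory")
    source.add_argument("--file", metavar="PATH", help="Single file")
    parser.add_argument("--output", metavar="PATH", help="Output directory")
    parser.add_argument("--workers", metavar="N", type=int, default=5,
                        help="Concurrency (default: 5)")
    parser.add_argument("--resume", action="store_true", help="Skip processed files")
    # Assumption: the token defaults to the MINERU_TOKEN environment variable.
    parser.add_argument("--token", metavar="TOKEN",
                        default=os.environ.get("MINERU_TOKEN"), help="API token")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--dir", "./pdfs/", "--output", "./out/", "--resume"])
print(args.workers, args.resume)
```

Passing an explicit list to parse_args keeps the example runnable on its own; a real entry point would call parse_args() with no arguments.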

📁 Output Structure

output/
├── document-name/
│   ├── document-name.md    # Main Markdown
│   ├── images/             # Extracted images
│   │   ├── image_0_0.png
│   │   └── ...
│   └── content.json        # Metadata
└── ...
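
Given this layout, a resume check can be as simple as testing whether the Markdown file for a PDF already exists. The sketch below illustrates that idea only. The pending_pdfs helper is hypothetical, and the rule "a file counts as processed when output/<name>/<name>.md exists" is an assumption, not a description of how --resume is actually implemented.

```python
from pathlib import Path
from typing import List


def pending_pdfs(input_dir: Path, output_dir: Path) -> List[Path]:
    """Return the PDFs in input_dir that have no Markdown output yet.

    Assumption: a PDF counts as processed when
    <output_dir>/<stem>/<stem>.md exists, matching the tree above.
    """
    pending = []
    for pdf in sorted(input_dir.glob("*.pdf")):
        markdown = output_dir / pdf.stem / f"{pdf.stem}.md"
        if not markdown.is_file():
            pending.append(pdf)
    return pending
```

Checking for the Markdown file rather than the directory means a run that was interrupted before the final write is retried instead of skipped.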

📊 Performance

Test: 10 PDFs, ~15 pages each (MacBook Air M1)

Configuration         Time       Speed
Sequential            8.5 min    1.2 files/min
Async (5 workers)     3.2 min    3.1 files/min
Async (15 workers)    1.8 min    5.6 files/min
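
The Speed column is the file count divided by the elapsed time. The short calculation below reproduces it from the timings in the table and adds the speedup relative to the sequential run. The variable names are illustrative only.

```python
# Timings copied from the table above; each run processed 10 PDFs.
FILES = 10
minutes = {
    "Sequential": 8.5,
    "Async (5 workers)": 3.2,
    "Async (15 workers)": 1.8,
}

baseline = minutes["Sequential"]
for name, elapsed in minutes.items():
    speed = FILES / elapsed        # files per minute
    speedup = baseline / elapsed   # relative to the sequential run
    print(f"{name}: {speed:.1f} files/min, {speedup:.1f}x")
```

Going from 5 to 15 workers cuts the time from 3.2 to 1.8 minutes, roughly a 1.8x gain for three times the workers, so the scaling in this test is sublinear.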

🔑 Get API Token

  1. Visit MinerU (https://mineru.net/user-center/api-token)
  2. Create a free API token
  3. Set the environment variable:
export MINERU_TOKEN="your-token-here"

Free Tier: 2000 pages/day, 200 MB max file size
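
A pre-flight check before uploading can catch a missing token or an oversized file early. The sketch below is one way to do that under stated assumptions: get_token and oversized are hypothetical helpers, "200 MB" is interpreted as 200 * 1024 * 1024 bytes, and an explicitly passed token is assumed to take precedence over MINERU_TOKEN.

```python
import os
from pathlib import Path
from typing import List, Optional

# Assumption: the 200 MB limit is interpreted as 200 MiB.
MAX_FILE_BYTES = 200 * 1024 * 1024


def get_token(cli_token: Optional[str] = None) -> str:
    """Return an explicit token if given, else the MINERU_TOKEN variable."""
    token = cli_token or os.environ.get("MINERU_TOKEN")
    if not token:
        raise RuntimeError("No API token: set MINERU_TOKEN or pass --token")
    return token


def oversized(paths: List[Path]) -> List[Path]:
    """Return the files whose size exceeds the per-file limit."""
    return [p for p in paths if p.stat().st_size > MAX_FILE_BYTES]
```

The daily page quota is not checked here because counting pages would need a PDF library, which is outside the scope of this sketch.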



🏗️ Skill Architecture

┌─────────────────────────────────────────────────────────────┐
│                    USER REQUEST                             │
│      "Parse these PDFs to Markdown"                         │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                    AI ASSISTANT                             │
│  • Recognizes PDF parsing task                              │
│  • Activates MinerU skill                                   │
│  • Reads SKILL.md for instructions                          │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                  MINERU SKILL ENGINE                        │
│  Scanner ──► Scheduler ──► Worker Pool (N workers)          │
│                           │                                 │
│                           ▼                                 │
│  API: Get URL ──► Upload ──► Poll ──► Download              │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                      OUTPUT                                 │
│     Markdown + JSON + Images ──► Obsidian/Files             │
└─────────────────────────────────────────────────────────────┘
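
The engine stage of the diagram (a scheduler feeding N workers, each running the four API steps) can be sketched with asyncio. This is an illustration of the pattern, not the skill's actual code: process_one and run_pool are hypothetical names, and the four steps are stubbed with asyncio.sleep(0) so that no real MinerU endpoints are called or implied.

```python
import asyncio
from pathlib import Path
from typing import List


async def process_one(pdf: Path) -> str:
    """Placeholder for the four API steps; no network calls are made."""
    for _step in ("get_url", "upload", "poll", "download"):
        await asyncio.sleep(0)  # stand-in for an HTTP request
    return f"{pdf.stem}.md"


async def run_pool(pdfs: List[Path], workers: int = 5) -> List[str]:
    """Scheduler: fill a queue, then let N workers drain it concurrently."""
    queue = asyncio.Queue()
    for pdf in pdfs:
        queue.put_nowait(pdf)
    results: List[str] = []

    async def worker() -> None:
        while True:
            try:
                pdf = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # nothing left to do
            results.append(await process_one(pdf))

    await asyncio.gather(*(worker() for _ in range(workers)))
    return results


if __name__ == "__main__":
    scanned = [Path(f"doc{i}.pdf") for i in range(3)]  # stand-in for the Scanner
    print(sorted(asyncio.run(run_pool(scanned, workers=2))))
```

A fixed pool of workers pulling from a queue caps the number of in-flight uploads at the worker count, which is the behavior the --workers option suggests.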

🤝 Contributing

  1. Fork → Branch → Commit → Push → PR

📝 License

MIT License - see LICENSE


🙏 Acknowledgments


If this skill helps you, give it a ⭐!

Made with ❤️ by Nebutra