AI skill for parsing PDF, images, Word & PPT into Markdown/JSON via SoMark API. Works with Claude Code, Cursor, Cline and 40+ agents.
npx skills add https://github.com/somarkai/skills --skill contract-reviewer
Install this skill with the CLI and start using the SKILL.md workflow in your workspace.
Official SoMark skills collection for document parsing, image OCR, and intelligent extraction — built for AI agent workflows.
npx skills add https://github.com/SoMarkAI/skills
Works with Claude Code, Cursor, Cline, OpenCode, and 40+ other agents.
| Skill | Description |
|---|---|
| somark-document-parser | Parse PDFs, Word, PowerPoint, and images into structured Markdown or JSON |
| image-parser | Core image OCR capability that returns text with precise coordinates (OCR + location awareness) |
| document-diff | Compare two documents and generate a structured diff report showing changes, additions, and deletions |
| contract-reviewer | Review contracts for risks, unfair clauses, missing provisions, and key obligations with severity ratings |
| resume-parser | Parse resumes and CVs into structured JSON profiles with opinionated candidate assessment |
| tender-analyzer | Extract qualification requirements, scoring criteria, deadlines, and submission checklists from procurement documents |
| paper-digest | Deeply analyze academic papers into structured research cards covering methods, results, and contributions |
| financial-report-analyzer | Extract financial metrics, risk signals, and management commentary from annual reports and earnings releases |
| pitch-screener | Screen startup pitch decks from a VC/angel investor perspective — parses the deck, runs background research via web search, and produces a pre-meeting investment memo |
When you share a document with your AI agent, SoMark parses it into structured Markdown or JSON that the agent can actually reason over: not just OCR'd text, but proper headings, tables, formulas, and layout.
The image-parser skill goes further: it returns every text block with its exact pixel coordinates on the original image, enabling field extraction, region location, and document automation.
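As a rough illustration of what coordinate-aware output makes possible, the sketch below selects the text blocks that fall inside a region of the image. The `TextBlock` shape, its field names, and the sample values are hypothetical and chosen only for this example; the actual structure returned by image-parser may differ.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """Hypothetical shape for one OCR result: text plus a pixel bounding box."""
    text: str
    x0: int  # left edge, in pixels
    y0: int  # top edge, in pixels
    x1: int  # right edge, in pixels
    y1: int  # bottom edge, in pixels


def blocks_in_region(blocks, region):
    """Return the blocks whose bounding box lies entirely inside `region`.

    `region` is an (x0, y0, x1, y1) tuple in the same pixel coordinates.
    """
    rx0, ry0, rx1, ry1 = region
    return [
        b for b in blocks
        if b.x0 >= rx0 and b.y0 >= ry0 and b.x1 <= rx1 and b.y1 <= ry1
    ]


# Made-up sample data standing in for parsed output.
sample_blocks = [
    TextBlock("Invoice No. 1042", 40, 30, 260, 60),
    TextBlock("Total: 99.00", 400, 700, 560, 730),
]

# Keep only text found in the top 100 pixels of a 600-pixel-wide image.
header_blocks = blocks_in_region(sample_blocks, (0, 0, 600, 100))
print([b.text for b in header_blocks])
```

A containment test like this is the simplest form of region location; field extraction would typically pair it with a label match on the block text.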
Supported formats:
| Type | Formats |
|---|---|
| Documents | PDF, DOC, DOCX, PPT, PPTX |
| Images | PNG, JPG, JPEG, BMP, TIFF, WEBP, HEIC, HEIF, GIF |
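If you want to screen files before handing them to the agent, the table above can be expressed as a small lookup. This is a minimal sketch that assumes the file extension is a reliable indicator of type; the `classify` helper is a hypothetical name and not part of the skill.

```python
from pathlib import Path

# Extensions taken from the supported-formats table.
DOCUMENT_EXTS = {"pdf", "doc", "docx", "ppt", "pptx"}
IMAGE_EXTS = {"png", "jpg", "jpeg", "bmp", "tiff", "webp", "heic", "heif", "gif"}


def classify(path):
    """Return "document", "image", or None for an unsupported extension."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext in DOCUMENT_EXTS:
        return "document"
    if ext in IMAGE_EXTS:
        return "image"
    return None


for name in ("report.PDF", "scan.heic", "notes.txt"):
    print(name, "->", classify(name))
```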
Example triggers:
Get an API key at somark.tech, then set it as an environment variable:
export SOMARK_API_KEY=sk-your-api-key
Or add it to your agent's settings. The skill will guide you through setup on first use.
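If you script around the skill yourself, it can help to fail early when the key is missing. The following is a minimal sketch that only reads the `SOMARK_API_KEY` variable shown above; the `get_api_key` helper is a hypothetical name, and how the key is sent to the API is not covered here.

```python
import os


def get_api_key(env=os.environ):
    """Read SOMARK_API_KEY from the environment, or raise if it is unset or blank."""
    key = env.get("SOMARK_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SOMARK_API_KEY is not set; export it or add it to your agent's settings"
        )
    return key


if __name__ == "__main__":
    try:
        get_api_key()
        print("SOMARK_API_KEY found")
    except RuntimeError as err:
        print(err)
```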
Free quota: SoMark offers a free tier. Visit the purchase page and follow the instructions there to claim it.
Most agents struggle with documents because raw PDF/image data loses structure. SoMark preserves that structure: headings, tables, formulas, and layout.
The result: your agent gives accurate, context-aware answers instead of hallucinating from garbled text.
| Constraint | Limit |
|---|---|
| Max file size | 200 MB |
| Max pages | 300 pages |
| QPS per account | 1 |
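For batch jobs, the limits above can be checked locally before any request is made. The sketch below is one way to do that, under stated assumptions: it treats 200 MB as 200,000,000 bytes (the stricter reading), expects the caller to supply the page count, and spaces calls at least one second apart to stay within 1 QPS. `check_limits` and `Throttle` are hypothetical helpers, not part of the skill.

```python
import time

MAX_FILE_BYTES = 200 * 1000 * 1000  # 200 MB, assuming decimal megabytes
MAX_PAGES = 300
MIN_INTERVAL_S = 1.0  # 1 QPS per account


def check_limits(size_bytes, page_count=None):
    """Return a list of human-readable problems; an empty list means within limits."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")
    if page_count is not None and page_count > MAX_PAGES:
        problems.append(f"document has {page_count} pages; limit is {MAX_PAGES}")
    return problems


class Throttle:
    """Block so that successive wait() calls return at least min_interval apart."""

    def __init__(self, min_interval=MIN_INTERVAL_S, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time the previous call was released

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle()
    for size in (1_000_000, 250_000_000):
        throttle.wait()  # call before each request to the service
        print(size, check_limits(size) or "ok")
```

The clock and sleep functions are injectable so the throttle can be tested without real delays.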
MIT