Claude Code skill: analyse video content by extracting frames with ffmpeg and using AI vision to generate timestamped summaries
Install this skill with the CLI and start using the SKILL.md workflow in your workspace:

npx skills add https://github.com/fabriqaai/ffmpeg-analyse-video-skill --skill ffmpeg-analyse-video
AI agent skill that analyses video content by extracting frames with ffmpeg and using vision to generate timestamped step-by-step summaries.
Works with screen recordings, tutorials, presentations, footage, and animations.
npx skills add fabriqaai/ffmpeg-analyse-video-skill
Install ffmpeg for your platform:

- macOS (Homebrew): `brew install ffmpeg`
- Debian/Ubuntu: `sudo apt install ffmpeg`
- Windows: `choco install ffmpeg` or `winget install ffmpeg`

Provide a video file path and ask your agent to analyse it:
Analyse this video: /path/to/recording.mp4
What happens in this video? ~/Desktop/demo.mov
Summarise this recording: ./tutorial.mp4
Analyse 2:00 to 5:00 of meeting.mp4
Analyse this video in high detail: demo.mp4
Focus on the code shown in this video: screencast.mp4
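A request such as "Analyse 2:00 to 5:00 of meeting.mp4" implies trimming the input before frames are extracted. The sketch below shows one way such a range could be turned into ffmpeg's `-ss` (start) and `-to` (end) options. The function names `timestamp_to_seconds` and `trim_args` are hypothetical, and how the skill itself parses time ranges is not documented here.

```python
from __future__ import annotations


def timestamp_to_seconds(ts: str) -> int:
    """Convert 'SS', 'MM:SS' or 'HH:MM:SS' into whole seconds."""
    parts = [int(p) for p in ts.strip().split(":")]
    if not 1 <= len(parts) <= 3 or any(p < 0 for p in parts):
        raise ValueError(f"unsupported timestamp: {ts!r}")
    seconds = 0
    for part in parts:
        # Each step shifts the running total up by one base-60 position.
        seconds = seconds * 60 + part
    return seconds


def trim_args(start: str, end: str) -> list[str]:
    """Build ffmpeg arguments that restrict processing to [start, end].

    Hypothetical helper: the returned list is meant to be placed
    before the `-i <file>` argument of an ffmpeg command.
    """
    start_s = timestamp_to_seconds(start)
    end_s = timestamp_to_seconds(end)
    if end_s <= start_s:
        raise ValueError("end must be later than start")
    return ["-ss", str(start_s), "-to", str(end_s)]


if __name__ == "__main__":
    print(trim_args("2:00", "5:00"))
```

Working in plain seconds keeps the range easy to validate and to compare against the duration reported by ffprobe.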
Main Agent                        Sub-Agents (disposable context)
──────────                        ──────────────────────────────
1. ffprobe metadata          ───►
2. ffmpeg frame extraction   ───►
3. Split frames into batches ──►  4. Read images (vision)
                                     Write text descriptions
                                     to batch_N_analysis.md
5. Read text files only      ◄─── (context discarded)
6. Synthesise final output
Frame images are only ever read inside disposable sub-agent contexts. The main agent receives lightweight, text-only analysis files, so no images enter the main conversation. This cuts context usage by roughly 90% compared to reading frames directly.
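The batching step (step 3) can be sketched as follows. This is a minimal illustration, not the skill's actual code: the batch size of 10, the function names, and the output directory are assumptions. Only the `batch_N_analysis.md` naming comes from the workflow above.

```python
from __future__ import annotations

from pathlib import Path


def split_into_batches(frames: list[str], batch_size: int = 10) -> list[list[str]]:
    """Split an ordered list of frame paths into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [frames[i:i + batch_size] for i in range(0, len(frames), batch_size)]


def plan_batches(frames: list[str], out_dir: str, batch_size: int = 10):
    """Pair each batch with the text file its sub-agent would write.

    Returns a list of (analysis_path, frame_paths) tuples. The main agent
    would later read only the analysis files, never the frames.
    """
    plan = []
    for n, batch in enumerate(split_into_batches(frames, batch_size), start=1):
        plan.append((Path(out_dir) / f"batch_{n}_analysis.md", batch))
    return plan


if __name__ == "__main__":
    frames = [f"frame_{i:04d}.jpg" for i in range(1, 26)]  # 25 hypothetical frames
    for analysis_path, batch in plan_batches(frames, "analysis"):
        print(analysis_path, len(batch))
```

Keeping frames in their original order within each batch lets the per-batch descriptions be concatenated into a single timeline during synthesis.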
| Video Duration | Strategy | Expected Frames |
|---|---|---|
| 0-60s | Interval (1 frame/2s) | 1-30 |
| 1-10min | Scene detection | 15-60 |
| 10-30min | Keyframe extraction | 30-80 |
| 30min+ | Thumbnail filter | Capped at 60 |
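The table can be read as a simple dispatch on duration. The sketch below maps a duration to a strategy and to an ffmpeg video filter expression that would typically implement it. The filter names (`fps`, `select`, `thumbnail`) are real ffmpeg filters, but the scene threshold of 0.3, the handling of boundary values, and the function names are assumptions. The skill's exact commands may differ.

```python
import math


def choose_strategy(duration_s: float) -> str:
    """Pick an extraction strategy from the video duration in seconds.

    Boundaries are treated as inclusive upper limits (an assumption,
    since the table's ranges touch at 60s and 10min).
    """
    if duration_s <= 60:
        return "interval"
    if duration_s <= 10 * 60:
        return "scene"
    if duration_s <= 30 * 60:
        return "keyframe"
    return "thumbnail"


def frame_filter(duration_s: float, fps: float = 30.0, max_frames: int = 60) -> str:
    """Return a value for ffmpeg's -vf option for the chosen strategy.

    The select-based filters usually also need variable-frame-rate output
    (-vsync vfr or -fps_mode vfr, depending on the ffmpeg version).
    """
    strategy = choose_strategy(duration_s)
    if strategy == "interval":
        return "fps=1/2"  # one frame every two seconds
    if strategy == "scene":
        return "select='gt(scene,0.3)'"  # 0.3 is an assumed threshold
    if strategy == "keyframe":
        return "select='eq(pict_type,I)'"  # keep only I-frames
    # thumbnail: one representative frame per n input frames,
    # with n sized so that at most max_frames frames come out.
    n = max(1, math.ceil(duration_s * fps / max_frames))
    return f"thumbnail=n={n}"


if __name__ == "__main__":
    for seconds in (45, 300, 1200, 3600):
        print(seconds, choose_strategy(seconds), frame_filter(seconds))
```

In practice the duration and frame rate would come from the ffprobe metadata step rather than being passed in by hand.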
The skill produces a structured markdown report.
License: MIT