Server-side video workflows for agents: ingest, understand, search, edit, stream.
npx skills add https://github.com/video-db/skills --skill videodb
Install this skill with the CLI and start using the SKILL.md workflow in your workspace.
The only perception skill your agent needs.
Works with Claude Code, Cursor, Copilot, and other AI agents
This skill gives your agent one consistent interface to:
See: Capture the desktop screen, mic, and system audio in real time, connect RTSP streams, and ingest files, URLs, and YouTube videos.
Understand: Analyze visuals, transcribe speech, and index and search moments, returned as playable clips.
Act: Stream results, trigger alerts on live feeds, edit timelines, generate subtitles and overlays, export clips.
VideoDB Skills lets your AI coding agent run end-to-end, server-side video workflows, in real time and in batch, and return playable HLS links for anything you build.
Get started in two quick steps. Open your AI coding agent (requires Python 3.9+) and follow along.
npx skills add video-db/skills
Or install with Claude Code plugin:
/plugin marketplace add video-db/skills
/plugin install videodb@videodb-skills
/videodb setup
The agent will guide setup for your VideoDB API key ($20 free credits, no credit card required), install the SDK, and verify the connection.
For Cursor, Copilot, and other agents, ask your agent to "setup videodb"
Set your API key using either method:
# Recommended: Export in terminal
export VIDEO_DB_API_KEY=sk-xxx
# Or add to your project's .env file
VIDEO_DB_API_KEY=sk-xxx
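To check which of the two methods a script will actually pick up, a minimal sketch along these lines may help. It assumes the exported variable takes precedence over the `.env` file and that the file holds plain `KEY=VALUE` lines; `read_dotenv` and `resolve_api_key` are hypothetical helper names and are not part of the VideoDB SDK.

```python
import os
from pathlib import Path
from typing import Optional


def read_dotenv(path: Path) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(env_file: str = ".env") -> Optional[str]:
    """Prefer the exported variable (assumed precedence), then the .env file."""
    key = os.environ.get("VIDEO_DB_API_KEY")
    if key:
        return key
    return read_dotenv(Path(env_file)).get("VIDEO_DB_API_KEY")


if __name__ == "__main__":
    print("API key found" if resolve_api_key() else "VIDEO_DB_API_KEY is not set")
```

Running this from the project root before asking the agent to set up VideoDB shows whether the key is visible to Python processes started from that shell.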
Ask your agent to run instructions like these. The skill loads automatically.
VideoDB is the server-side video stack for agents and apps.
Run reliable, scalable, cost-efficient workflows across realtime streams and batch video, with built-in AI understanding, without wiring up ffmpeg glue.
Keep your client and agent stack light: send video in, get back structured context, searchable moments, and playable streams.
| Capability | What it unlocks |
|---|---|
| Capture | Capture desktop screen, mic, and system audio for realtime processing |
| Upload | Ingest video from YouTube, URLs, or local files |
| Context | Generate realtime structured context for any RTSP feed or desktop stream |
| Search | Find exact moments by speech, scenes, or metadata, return playable evidence |
| Transcripts | Generate clean, timestamped transcripts from any video |
| Subtitles | Auto-generate subtitles, then style and burn in or export |
| Edit | Trim, merge, clip, overlay text, images, audio, plus dubbing and translation |
| AI Generate | Create images, video, music, sound effects, and voiceovers from text |
| Transcode / Reframe | Change resolution, quality, aspect ratio, and social crops, all on the server |
| Stream | Get instant playable HLS links (built-in CDN) for anything you ingest or generate |
See → Understand → Act, as an API, for video and audio.
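For readers who want to see roughly what the agent does under the hood, here is a hedged sketch of the Upload, Transcripts, and Search rows as a single function. The method names (`get_collection`, `upload`, `index_spoken_words`, `search`, `get_shots`) and the `start`, `end`, and `text` attributes reflect my understanding of the VideoDB Python SDK and should be treated as assumptions to verify against the SDK reference. `find_moments` and `run_with_sdk` are hypothetical names introduced for illustration.

```python
import os
from typing import List, Tuple


def find_moments(conn, url: str, query: str) -> List[Tuple[float, float, str]]:
    """Upload a video, index its speech, and return (start, end, text) matches.

    `conn` is expected to behave like the object returned by videodb.connect();
    every method name used here is an assumption, not a confirmed signature.
    """
    collection = conn.get_collection()   # default collection (assumed call)
    video = collection.upload(url=url)   # server-side ingest from a URL
    video.index_spoken_words()           # build the spoken-word index
    results = video.search(query)        # search the indexed speech
    return [(shot.start, shot.end, shot.text) for shot in results.get_shots()]


def run_with_sdk(url: str, query: str) -> List[Tuple[float, float, str]]:
    """Run find_moments against the real service; needs `pip install videodb`."""
    import videodb  # imported lazily so the sketch loads without the SDK

    conn = videodb.connect(api_key=os.environ["VIDEO_DB_API_KEY"])
    return find_moments(conn, url, query)
```

Passing the connection in as a parameter keeps the workflow logic separate from credentials, so it can be exercised with a stub object before spending any credits.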
Supported Platforms: macOS, Linux, Windows (PowerShell)
Made with ❤️ by the VideoDB team