vLLM Skills

A collection of skills for deploying and benchmarking vLLM. This project follows the anthropics/skills template format and is installable as a Claude Code plugin marketplace.

Overview

This repository provides modular, reusable agent skills for operating and benchmarking vLLM, following the Anthropic SKILL.md specification. Each skill is a self-contained directory containing the automation scripts and metadata for a specific operational task.

Skills Index

vllm-deploy-docker: Deploy vLLM with Docker (pre-built images or build-from-source) with NVIDIA GPU support and run the OpenAI-compatible server.
vllm-deploy-k8s: Deploy vLLM to Kubernetes with GPU support, health probes, and an OpenAI-compatible API endpoint.
vllm-deploy-simple: Quickly install and deploy vLLM, start serving a simple LLM, and test the OpenAI-compatible API.
vllm-prefix-cache-bench: Benchmark the efficiency of vLLM's automatic prefix caching using fixed prompts, real datasets, or synthetic prefix/suffix patterns.
vllm-bench-random-synthetic: Run vLLM performance benchmarks on synthetic random data to measure throughput, TTFT, TPOT, and other key metrics without downloading external datasets.
vllm-bench-serve: Benchmark vLLM or other OpenAI-compatible serving endpoints using vllm bench serve.
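As a sketch of what the vllm-deploy-docker skill automates, a typical Docker launch of the OpenAI-compatible vLLM server looks like the following (the model, port, and cache mount are example values, not values fixed by the skill):

```shell
# Launch the OpenAI-compatible vLLM server in Docker with NVIDIA GPU support.
# The model, published port, and Hugging Face cache mount are example values.
docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:latest \
  --model Qwen/Qwen2.5-1.5B-Instruct
```

Mounting the Hugging Face cache avoids re-downloading model weights on every container restart.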

Installation

Install directly from the plugin marketplace in Claude Code:

/plugin marketplace add vllm-project/vllm-skills
/plugin install vllm-skills@vllm-skills

Manual Install

Clone the repository and copy skills to your Claude Code skills directory:

git clone https://github.com/vllm-project/vllm-skills.git
cd vllm-skills

Copy a skill to the global skills folder:

cp -r plugins/vllm-skills/skills/vllm-deploy-simple ~/.claude/skills/

Or copy to the project's skills folder:

cp -r plugins/vllm-skills/skills/vllm-deploy-simple .claude/skills/

Usage

Once installed, invoke a skill with a slash command:

/vllm-deploy-simple

or with natural language, for example:

Deploy vLLM with Qwen2.5-1.5B-Instruct on port 8000
Install and start a vLLM server using the vllm-deploy-simple skill
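Once a server is up, a quick smoke test of its OpenAI-compatible API might look like this (assuming the server listens on the default port 8000 and serves the example model above):

```shell
# Query the OpenAI-compatible chat completions endpoint of a running vLLM server.
# Assumes a server on localhost:8000 serving Qwen/Qwen2.5-1.5B-Instruct.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen2.5-1.5B-Instruct",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}]
      }'
```

Any OpenAI-compatible client (for example the official openai Python package with a custom base_url) can target the same endpoint.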

Supported Models

See vLLM documentation for the full list.

Contributing

This project follows the anthropics/skills template. When adding new skills:

  1. Create a new directory under plugins/vllm-skills/skills/ (e.g., plugins/vllm-skills/skills/your-skill/)
  2. Add a SKILL.md file with YAML frontmatter:
    ---
    name: your-skill
    description: Brief description of what this skill does
    ---
    
  3. Add optional scripts/, references/, and assets/ directories
  4. Update this README with your skill documentation
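The scaffolding steps above can be sketched from the shell; "your-skill" is a placeholder name:

```shell
# Scaffold a new skill directory with SKILL.md frontmatter and an optional scripts/ dir.
# "your-skill" is a placeholder; replace it with your skill's name.
mkdir -p plugins/vllm-skills/skills/your-skill/scripts
cat > plugins/vllm-skills/skills/your-skill/SKILL.md <<'EOF'
---
name: your-skill
description: Brief description of what this skill does
---
EOF
```

After scaffolding, fill in the SKILL.md body with the workflow the skill automates.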

License

Licensed under the Apache License 2.0. See LICENSE.

Resources