bkit Vibecoding Kit - PDCA methodology + Claude Code mastery for AI-native development
npx skills add https://github.com/popup-studio-ai/bkit-claude-code --skill btw
Use the CLI to install this skill and start using the SKILL.md workflow in your workspace.
PDCA methodology + CTO-Led Agent Teams + AI coding assistant mastery for AI-native development
bkit is a Claude Code plugin that transforms how you build software with AI. It provides structured development workflows, automatic documentation, and intelligent code assistance through the PDCA (Plan-Do-Check-Act) methodology.

Context Engineering is the systematic curation of context tokens for optimal LLM inference—going beyond simple prompt crafting to build entire systems that consistently guide AI behavior.
Traditional Prompt Engineering:
"The art of writing good prompts"
Context Engineering:
"The art of designing systems that integrate prompts, tools, and state
to provide LLMs with optimal context for inference"
bkit is a practical implementation of Context Engineering, providing a systematic context management system for Claude Code.

bkit implements Context Engineering through three interconnected layers:
| Layer | Components | Purpose |
|---|---|---|
| Domain Knowledge | 39 Skills | Structured expert knowledge (phases, levels, specialized domains) |
| Behavioral Rules | 36 Agents | Role-based constraints with model selection (opus/sonnet/haiku) |
| State Management | 128 Lib Modules (~27,085 LOC) across 15 subdirs | PDCA state machine, workflow engine, automation control, audit, quality gates, intent detection, team coordination, 3-Layer Orchestration (Sprint 7), Clean Architecture 4-Layer (Domain/Application/Infrastructure/Presentation) |
Context injection occurs at six distinct layers:
Layer 1: hooks.json (Global) → SessionStart, UserPromptSubmit, PreCompact, PostCompact, PreToolUse, PostToolUse, Stop, StopFailure + 13 more (21 events)
Layer 2: Skill Frontmatter → Domain-specific hooks (deprecated in v1.4.4, use hooks.json)
Layer 3: Agent Frontmatter → Task-specific hooks with constraints
Layer 4: Description Triggers → Semantic matching in 8 languages
Layer 5: Scripts (42 modules) → Actual Node.js execution logic with unified handlers
Layer 6: Plugin Data Backup → ${CLAUDE_PLUGIN_DATA} persistent state management
Learn more: See Context Engineering Principles for detailed implementation.
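To make Layers 1 and 5 more concrete, here is a rough sketch of the decision logic a PreToolUse hook script might run. It is not bkit's actual code: the pattern list and the names `evaluateBashCommand` and `handleHookInput` are hypothetical. It assumes the Claude Code hook convention that the event arrives as JSON with `tool_name` and `tool_input` fields, and that exit code 2 blocks the tool call and passes stderr back to the model.

```javascript
// Hypothetical PreToolUse guard; names and patterns are illustrative only.
const DANGEROUS_PATTERNS = [
  /\brm\s+-rf\s+\/(\s|$)/,      // recursive delete of the filesystem root
  /\bgit\s+push\b.*--force\b/,  // force push (also catches --force-with-lease)
];

// Pure decision function so the logic can be tested without stdin.
function evaluateBashCommand(command) {
  const hit = DANGEROUS_PATTERNS.find((re) => re.test(command));
  return hit
    ? { allow: false, reason: `matched ${hit}` }
    : { allow: true, reason: "no dangerous pattern matched" };
}

// Maps a raw hook event (JSON string) to an exit code and stderr message.
function handleHookInput(rawJson) {
  const event = JSON.parse(rawJson);
  // Only Bash tool calls are inspected in this sketch.
  if (event.tool_name !== "Bash") return { exitCode: 0, stderr: "" };
  const command = (event.tool_input && event.tool_input.command) || "";
  const verdict = evaluateBashCommand(command);
  return verdict.allow
    ? { exitCode: 0, stderr: "" }
    : { exitCode: 2, stderr: `Blocked by guard: ${verdict.reason}` };
}
```

A real hook script would read stdin to the end, call `handleHookInput`, write the `stderr` text to standard error, and exit with `exitCode`. Keeping the decision in a pure function makes it easy to cover with unit tests.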

lifecycle.reconcile() auto-release (expectedFix seed × 4). Invocation Contract L1~L5: 226 assertions (L1 + L4 CI-gated) + L2 hook attribution 13 TC + L3 MCP stdio runtime 42 TC (real spawn + tools/list) + L5 E2E shell smoke 5 scenarios. Defense-in-Depth 4-Layer formalized: Layer 1 (CC Built-in sandbox) → Layer 2 (bkit PreToolUse Hook: pre-write.js + unified-bash-pre.js + defense-coordinator) → Layer 3 (audit-logger OWASP A03/A08 sanitizer, 7-key PII redaction) → Layer 4 (Token Ledger .bkit/runtime/token-ledger.json NDJSON). Docs=Code CI (scripts/docs-code-sync.js, 0 drift, 8 counts + BKIT_VERSION 5-location invariant). Sprint 7 3-Layer Orchestration (lib/orchestrator/ 5 modules = intent-router + next-action-engine + team-protocol + workflow-state-machine + index, 19 exports): SKILL_TRIGGER_PATTERNS 4→15, matchRate SSoT 100→90, Enterprise teammates 5→6, Trust Score level auto-reflect restored, cto-lead body 5 Task spawn examples + Task(pm-lead) / Task(qa-lead) / Task(pdca-iterator) in frontmatter, 79 @version bulk refresh to 2.1.10. BKIT_VERSION centralization complete (ENH-167, bkit.config.json single SoT; 5-location invariant: plugin.json / hooks.json / session-start.js / README / CHANGELOG). ENH-202 context: fork 1→9 skills. Legacy 3 modules removed (421 LOC). audit-logger 682GB recursion root-fixed (createDualSink avoidance + Integration Runtime TC permanent defense). Critical Bug C1/C2 fix (startDate→date, PII 6-key blacklist + 500-char cap). Domain purity CI (check-domain-purity.js, 11 files / 0 forbidden imports: fs/child_process/net/http/https/os). PreCompact counter (ENH-247/257, 2-week measurement). Hook attribution 3 sites (Stop/SessionEnd/SubagentStop). 6 Validator CLIs: check-guards / docs-code-sync / check-deadcode / check-domain-purity / l3-mcp-runtime / test/e2e/run-all.sh. 
Architecture: 39 Skills, 36 Agents, 21 Hook Events (24 blocks), 16 MCP Tools, 2 MCP Servers, 128 Lib Modules (~27,085 LOC across 15 subdirs), 47 Scripts, 113 test files, 3,762 TC (3,760 PASS / 0 FAIL / 2 expected legacy). CC recommended v2.1.117+ (75 consecutive compatible releases). Invocation Contract 100% preserved + Starter/Dynamic/Enterprise zero-action update. MEMORY.md 302→79 lines decomposed (detail: cc_version_history_v21xx.md / enh_backlog.md / github_issues_monitor.md).
context: fork macOS verification (Issue #51165 non-reproduction on macOS, ENH-196/202 investment protected), ENH-254 Defense-in-Depth security architecture formalization (Layer 1 CC runtime sandbox × Layer 2 bkit DANGEROUS_PATTERNS hook, docs/03-analysis/security-architecture.md), ENH-259 Custom Skills data loss warning for Issue #51234 (CUSTOMIZATION-GUIDE.md backup/restore/plugin-path guidance; bkit plugin itself unaffected at ${CLAUDE_PLUGIN_ROOT}/skills/), ENH-263 Docs=Code 15-file architectural correction (@version 1.6.0 in lib/: 0 matches; ENH-270 acceptance). Positive drift: ENH-264 lib/core/io.js outputBlockWithContext infrastructure + 2 call sites in unified-bash-pre.js (deploy/QA phase paths), ENH-265 ENABLE_PROMPT_CACHING_1H SessionStart branch + operational guide (docs/03-analysis/prompt-caching-optimization.md, 30-40% token savings on long PDCA sessions). Architecture: 39 Skills, 36 Agents, 43 Scripts, 101 Lib modules (11 subdirs), 21 Hook Events, 18 Templates, 4 Output Styles, 2 MCP Servers. CC recommended v2.1.116+ (74 consecutive compatible releases, v2.1.115 skipped).
Shipping QA: Match Rate 100% / Coverage 90.3% / P0 Blocker 0 / Regression 0 (docs/05-qa/cc-v2114-v2116-shipping-readiness.report.md).
session-context.js guard restored (ENH-238, ENH-226 Docs=Code violation fixed: ui.contextInjection.enabled + sections[] 3-way toggle mirrors ui.dashboard pattern), P0 compaction SHA-256 fingerprint dedup lock (ENH-239, lib/core/session-ctx-fp.js: prevents PreCompact re-fire duplicate injection, 1h TTL, GC 30d/LRU 100), P1 PersistedOutputGuard (ENH-240, lib/core/context-budget.js: CC 10,000-char cap defense with 8,000-char hard cap + priority-preserved truncation), P3 hooks/hooks.json once: true ADR documented in docs/context-engineering.md (ENH-244), Iterate-discovered getUIConfig() bug fixed (3 new fields exposed in lib/core/config.js), CC v2.1.111+ recommended (72 consecutive compatible releases), 11 files +641 LOC changed, 25 new TCs + 43 regression TCs = 74 PASS / 0 FAIL, Match Rate 100%. Additional: 16 bug fixes consolidated from 10-agent QA Discovery methodology (11 core bugs B1~B11 + 5 Q10-review findings B12~B16), 24 new QA TCs, 239 total PASS / 1 FAIL (pre-existing), ENH-167 partial (BKIT_VERSION centralization in paths.js + MCP servers)
updatePdcaStatus argument order fix (skill-post.js:229), P0 full-auto chain completion (generateAutoTrigger report/completed phase added to automation.js), P1 phantom feature prevention (pre-write.js active feature guard), P2 gap-detector analysis document auto-generation (gap-detector-stop.js), pdca-skill-stop.js [PDCA-COMPLETE] directive for report phase, CC v2.1.110+ recommended (71 consecutive compatible releases), 5 files ~54 LOC changed
lib/pdca/session-title.js, phase-change-only emission (6→1 per message, ≈83% reduction), 3-way UI opt-out (ui.{sessionTitle,dashboard,contextInjection}.enabled in bkit.config.json), stale feature TTL (24h default) auto-cleanup; PreCompact decision:block on PDCA do/check/act phases; output-styles audit script (CC v2.1.107 regression #47482 defense);
BKIT_VERSION dynamic lookup (Docs=Code); 3268/3268 tests PASS (99.6%, 0 FAIL), 69 consecutive compatible releases (v2.1.34~v2.1.108)
if conditional field documentation (CC v2.1.85+), Enterprise org policy documentation, CC v2.1.86+ recommended, 52 consecutive compatible releases (v2.1.34~v2.1.86)
lib/context/ 7 modules), Self-Healing agent (opus) for automated error recovery, Deploy skill with 3-environment state machine, PDCA Handoff Loss Fix Phase 2+3 (PRD→Code context preservation 30-40% → 75-85%), 11 infrastructure templates (ArgoCD, Terraform, observability), 72 lib modules, 37 skills, 32 agents, 59 scripts
lib/core/paths.js), PDCA doc path registry, config cleanup (dead keys removed, missing keys added), state directory migration to .bkit/{state,runtime,snapshots}/, auto-migration with EXDEV fallback, 190 exports
.bkit/agent-state.json for Studio IPC
outputStyles in plugin.json + 4th style bkit-pdca-enterprise
/pdca skill with 8 actions (plan, design, do, analyze, iterate, report, status, next)
Claude Code introduced Skill Evals in Skills 2.0, a framework for measuring skill quality through automated testing. bkit extends this concept into a complete skill lifecycle management system that answers a question no other plugin addresses: "Are my skills still worth keeping?"
Skill Evals run automated quality checks against skills by sending test prompts and comparing outputs against expected results. Think of them as unit tests for AI skills—they catch regressions when models update and measure whether a skill still adds value.
bkit builds three layers on top of Claude Code's native Evals:
| Layer | Claude Code Native | bkit Enhancement |
|---|---|---|
| Eval Execution | Basic eval runner | evals/runner.js with benchmark mode, 29 pre-built eval definitions |
| A/B Testing | Not available | evals/ab-tester.js compares skill performance across models (e.g., Sonnet 4.6 vs Opus 4.6) |
| Skill Classification | Not available | All 39 skills classified as Workflow / Capability / Hybrid with deprecation-risk scoring (v2.1.10 scope) |
evals/
├── config.json # Global settings (thresholds, classifications)
├── runner.js # Eval execution engine (CLI + module)
├── reporter.js # Markdown/JSON result reporting
├── ab-tester.js # Model comparison + parity testing
├── workflow/{10 skills}/ # Eval definitions for permanent skills
├── capability/{18 skills}/ # Eval definitions for model-dependent skills
└── hybrid/{1 skill}/ # Eval definitions for single dual-purpose skill
Not all skills age the same way. bkit classifies each skill to manage its lifecycle:
| Classification | Count | Purpose | What Evals Measure |
|---|---|---|---|
| Workflow | 17 | Process automation (PDCA, pipelines) | Quality regression only—these skills are permanent |
| Capability | 18 | Model ability augmentation (mockups, APIs) | Parity testing—can the model match this skill's output without it? |
| Hybrid | 1 | Both process + capability | Both regression and parity |
When a model upgrade makes a Capability skill redundant, the Model Parity Test detects it:
# Does the model now produce equivalent results without this skill?
node evals/ab-tester.js --parity phase-3-mockup --model claude-opus-4-6
# Compare skill performance between two models
node evals/ab-tester.js --skill pdca --modelA claude-sonnet-4-6 --modelB claude-opus-4-6
# Run all 29 skill evaluations
node evals/runner.js --benchmark
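The parity idea behind these commands can be pictured as a comparison of two eval scores, one with the skill loaded and one without. The sketch below is illustrative only: `parityVerdict`, the 0-to-1 score scale, and the 0.05 tolerance are assumptions, not the logic or thresholds used by evals/ab-tester.js or evals/config.json.

```javascript
// Hypothetical parity check; scores are assumed to lie in [0, 1].
function parityVerdict(scoreWithSkill, scoreWithoutSkill, tolerance = 0.05) {
  const uplift = scoreWithSkill - scoreWithoutSkill;
  // If the bare model comes within `tolerance` of the skill-assisted run,
  // the skill no longer adds measurable value.
  const parityReached = uplift <= tolerance;
  return {
    uplift: Number(uplift.toFixed(3)), // rounded for readable reports
    parityReached,
    recommendation: parityReached ? "deprecation-candidate" : "keep",
  };
}

// Example: the model alone scores 0.90, with the skill 0.92.
console.log(parityVerdict(0.92, 0.9).recommendation);
```

Under these assumptions a Capability skill is flagged only when the model nearly matches it unaided. A Workflow skill would skip this check, since the classification table treats those skills as permanent.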
| Before (v1.5.9) | After (v1.6.1 with Evals) |
|---|---|
| 28 skills, no quality measurement | 37 skills, 29 with automated eval definitions |
| No way to know if a skill degraded after model update | Benchmark detects regression across all skills |
| Manual judgment on skill usefulness | Data-driven deprecation recommendations |
| Skills accumulate indefinitely | Skill lifecycle: create → eval → deprecate → remove |
| "Does this skill help?" is a guess | Parity test gives a quantified answer |
Skill Evals connect directly to bkit's PDCA workflow:
skill-creator/) generates new skills with eval templates pre-included
Philosophy: bkit's third principle is "No Guessing." Skill Evals replace intuition with measurement, so you never have to guess whether a skill is still earning its place in your workflow.
First time using Claude Code?
Start with bkit-starter!
- Beginner-friendly guide
- No programming experience required
- Build your first project hands-on
/plugin enable bkit-starter
bkit is the advanced extension designed for users who have mastered bkit-starter.
| Requirement | Minimum Version | Notes |
|---|---|---|
| Claude Code | v2.1.78+ | Required. bkit v2.1.10 uses agent frontmatter (effort/maxTurns/disallowedTools), 21 hook events (24 blocks), 2 MCP servers + 16 tools, Clean Architecture Domain layer, and ${CLAUDE_PLUGIN_DATA}. Recommended: v2.1.117+ (75 consecutive compatible releases, includes Defense-in-Depth 4-Layer + Invocation Contract 226 assertions). |
| Node.js | v18+ | For hook script execution |
| Agent Teams (optional) | Set CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 | Required only for CTO-Led Agent Teams feature |
Troubleshooting: If you see a "Failed to load hooks" error after installation, update Claude Code to the latest version: claude update
⚠️ CC v2.1.116 Users — Custom Skills Data Loss Warning
On CC v2.1.113+ first-run, the ~/.claude/skills/ directory may be silently deleted (#51234).
- ✅ bkit plugin itself is unaffected (uses ${CLAUDE_PLUGIN_ROOT}/skills/)
- ⚠️ If you keep user custom skills there, back up before upgrading:
cp -R ~/.claude/skills ~/.claude/skills.backup.$(date +%Y%m%d)
- Full guide: CUSTOMIZATION-GUIDE.md, section "Custom Skills Data Loss Warning"
Note: bkit is designed for Claude Code. For Gemini CLI, see bkit-gemini.
The easiest way to install bkit is through the Claude Code marketplace.
# Step 1: Add bkit marketplace
/plugin marketplace add popup-studio-ai/bkit-claude-code
# Step 2: Install bkit plugin
/plugin install bkit
Use /plugin command and navigate to Marketplaces tab to manage your plugin sources:

Navigate to Discover tab to browse and install available plugins:

| Plugin | Description | Best For |
|---|---|---|
| bkit | Full PDCA methodology + Claude Code mastery | Experienced developers |
| bkit-starter | Korean learning guide for beginners | First-time Claude Code users |
Keep your plugins up-to-date automatically by configuring auto-update in your settings:
// ~/.claude/settings.json
{
"plugins": {
"autoUpdate": true
}
}
Update Commands:
- Press u in the Marketplaces view to update all plugins
- Press r to remove a marketplace
- Press Space to toggle plugin selection in Discover view
bkit-claude-code/
├── .claude-plugin/
│ ├── plugin.json # Claude Code plugin manifest
│ └── marketplace.json # Marketplace registry
├── agents/ # Specialized AI agents
├── commands/ # CLI command definitions
├── skills/ # Domain knowledge
├── hooks/ # Event hooks (hooks.json)
├── evals/ # Skill eval definitions & runner
├── scripts/ # Hook execution scripts
├── servers/ # MCP servers (bkit-pdca, bkit-analysis)
├── lib/ # Shared utilities (128 modules across 15 subdirs — Clean Architecture 4-Layer)
├── output-styles/ # Level-based response formatting
├── templates/ # Document templates
└── bkit.config.json # Centralized configuration
After installing bkit via the marketplace, you can customize any component by copying it to your project's .claude/ folder.
Comprehensive Guide: See CUSTOMIZATION-GUIDE.md for detailed instructions on customizing bkit for your organization, including platform-specific paths, component examples, and license attribution requirements.
Claude Code searches for configuration files in this priority order:
- .claude/ (highest priority - your customizations)
- ~/.claude/
# Step 1: Find the plugin installation location
ls ~/.claude/plugins/bkit/
# Step 2: Copy only the files you want to customize
mkdir -p .claude/skills/starter
cp ~/.claude/plugins/bkit/skills/starter/SKILL.md .claude/skills/starter/
# Step 3: Edit the copied file in your project
# Your project's .claude/skills/starter/SKILL.md will override the plugin's version
# Step 4: Commit to version control (optional)
git add .claude/
git commit -m "feat: customize bkit starter skill"
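The override behaviour described above amounts to "first match wins" across an ordered list of directories. The sketch below only illustrates that idea; the real lookup is done by Claude Code itself, and `resolveComponent`, the root list, and the injected `exists` callback are hypothetical.

```javascript
const path = require("path");

// Hypothetical helper: return the first search root that contains relPath,
// or null when no root has it. `exists` is injected so the function can be
// exercised without touching the real filesystem.
function resolveComponent(relPath, searchRoots, exists) {
  for (const root of searchRoots) {
    const candidate = path.posix.join(root, relPath);
    if (exists(candidate)) return candidate;
  }
  return null;
}

// Example with an in-memory "filesystem": the project copy shadows the plugin copy.
const files = new Set([
  ".claude/skills/starter/SKILL.md",
  "/home/user/.claude/plugins/bkit/skills/starter/SKILL.md",
]);
const roots = [".claude", "/home/user/.claude/plugins/bkit"];
console.log(resolveComponent("skills/starter/SKILL.md", roots, (p) => files.has(p)));
```

With a real filesystem, `exists` could simply be `fs.existsSync`. If the project copy is deleted, the same call falls back to the plugin's version.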
| Component | Location | Description |
|---|---|---|
| Skills | ~/.claude/plugins/bkit/skills/ | Domain knowledge, context and slash commands (e.g., /pdca plan) |
| Agents | ~/.claude/plugins/bkit/agents/ | Specialized AI assistants |
| Templates | ~/.claude/plugins/bkit/templates/ | Document templates |
| Scripts | ~/.claude/plugins/bkit/scripts/ | Hook scripts |
| Config | ~/.claude/plugins/bkit/bkit.config.json | Central configuration |
/claude-code-learning
/starter # Static website (Starter level)
/dynamic # Fullstack with BaaS (Dynamic level)
/enterprise # Microservices with K8s (Enterprise level)
v2.0.3: Each phase now includes Interactive Checkpoints that pause for user confirmation before proceeding. Plan confirms requirements, Design presents 3 architecture options, Do confirms implementation scope, Check offers fix strategy choices.
/pdca pm {feature} # PM analysis & PRD generation (43 frameworks)
/pdca plan {feature} # Create plan document (Checkpoint 1-2)
/pdca design {feature} # Create design document (Checkpoint 3: 3 architecture options)
/pdca do {feature} # Implementation guide
/pdca analyze {feature} # Run gap analysis
/pdca iterate {feature} # Auto-fix with Evaluator-Optimizer pattern
/pdca report {feature} # Generate completion report
/pdca status # Check current PDCA status
/pdca next # Guide to next PDCA step
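The iterate step can be pictured as an evaluator-optimizer loop: measure how closely the implementation matches the design, fix the reported gaps, and repeat until the match rate reaches the threshold. Here is a minimal sketch, assuming the 90% match-rate threshold this README cites. The names `iterateUntilMatch`, `evaluate`, and `optimize` and the cap of 5 iterations are hypothetical, not bkit's implementation.

```javascript
// Hypothetical evaluator-optimizer loop.
// evaluate(state) -> { matchRate: number (0-100), gaps: any[] }
// optimize(state, gaps) -> next state
function iterateUntilMatch(initialState, evaluate, optimize, options = {}) {
  const { threshold = 90, maxIterations = 5 } = options;
  const history = [];
  let state = initialState;
  for (let i = 0; i <= maxIterations; i++) {
    const { matchRate, gaps } = evaluate(state);
    history.push(matchRate);
    if (matchRate >= threshold) {
      return { passed: true, iterations: i, history };
    }
    if (i === maxIterations) break; // budget exhausted, stop fixing
    state = optimize(state, gaps);
  }
  return { passed: false, iterations: maxIterations, history };
}

// Toy example: each fix round raises the match rate by 10 points.
const result = iterateUntilMatch(
  70,
  (rate) => ({ matchRate: rate, gaps: [] }),
  (rate) => rate + 10
);
console.log(result);
```

The iteration cap keeps the loop from running forever when fixes stop improving the match rate. In that case the sketch returns `passed: false` so a human can step in.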
CTO-Led Agent Teams enable parallel PDCA execution with multiple AI agents orchestrated by a CTO lead agent.
# Start CTO Team for a feature
/pdca team {feature}
# Monitor team progress
/pdca team status
# Cleanup team resources
/pdca team cleanup
How it works:
Requirements:
- CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
Available Team Agents:
| Team | Agent | Model | Role |
|---|---|---|---|
| CTO | cto-lead | opus | Team orchestration, PDCA workflow management |
| CTO | frontend-architect | sonnet | UI/UX design, component architecture |
| CTO | product-manager | sonnet | Requirements analysis, feature prioritization |
| CTO | qa-strategist | sonnet | Test strategy, quality metrics coordination |
| CTO | security-architect | opus | Vulnerability analysis, auth design review |
| PM | pm-lead | opus | PM Team orchestration, PRD synthesis |
| PM | pm-discovery | sonnet | Opportunity Solution Tree analysis |
| PM | pm-strategy | sonnet | Value Proposition, Lean Canvas |
| PM | pm-research | sonnet | Personas, competitors, market sizing |
| PM | pm-prd | sonnet | PRD document generation |
PM Agent Team runs before the Plan phase to produce a comprehensive PRD (Product Requirements Document) through automated product discovery.
# Run PM analysis before planning
/pdca pm user-authentication
# Then proceed with PDCA planning (PRD auto-referenced)
/pdca plan user-authentication
How it works:
docs/00-pm/{feature}.prd.md
Frameworks: Based on pm-skills by Pawel Huryn (MIT License)
| Level | Description | Stack |
|---|---|---|
| Starter | Static websites, portfolios | HTML, CSS, JS |
| Dynamic | Fullstack applications | Next.js, BaaS |
| Enterprise | Microservices architecture | K8s, Terraform, MSA |

bkit is primarily designed for software development. However, some components can inspire structured workflows beyond coding:
| Component | Beyond Development Uses |
|---|---|
| PDCA Methodology | Project management, process improvement |
| Document Templates | Planning any structured project |
| Gap Analysis | Comparing any plan vs. actual outcome |
Note: For general writing, research, or non-technical tasks, plain Claude Code (without bkit) is better suited.
The bkit-system/ documentation is optimized for Obsidian's Graph View:
- Open bkit-system/ as an Obsidian vault
- Press Ctrl/Cmd + G to open Graph View
See bkit-system/README.md for detailed instructions.
bkit automatically detects your language from trigger keywords:
| Language | Trigger Keywords |
|---|---|
| English | static website, beginner, API design |
| Korean | 정적 웹, 초보자, API 설계 |
| Japanese | 静的サイト, 初心者, API設計 |
| Chinese | 静态网站, 初学者, API设计 |
| Spanish | sitio web estático, principiante |
| French | site web statique, débutant |
| German | statische Webseite, Anfänger |
| Italian | sito web statico, principiante |
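One simple way to picture keyword-based detection is a lookup from trigger phrases to a language name. The sketch below uses a subset of the keywords from the table above. `detectLanguage` and the case-insensitive substring matching are assumptions; bkit describes its own matching as semantic, so the real behaviour may differ.

```javascript
// Hypothetical trigger table: a subset of the keywords listed in this README.
const TRIGGERS = {
  english: ["static website", "beginner", "API design"],
  korean: ["정적 웹", "초보자", "API 설계"],
  japanese: ["静的サイト", "初心者", "API設計"],
  german: ["statische Webseite", "Anfänger"],
};

// Returns the first language whose keyword appears in the text, else null.
function detectLanguage(text) {
  const haystack = text.toLowerCase();
  for (const [language, keywords] of Object.entries(TRIGGERS)) {
    if (keywords.some((kw) => haystack.includes(kw.toLowerCase()))) {
      return language;
    }
  }
  return null;
}

console.log(detectLanguage("초보자를 위한 가이드가 필요해요"));
```

Languages that share a keyword, such as "principiante" in Spanish and Italian, would need a tie-breaking rule. This sketch sidesteps that by leaving them out.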
Claude Code supports configuring your preferred response language through the language setting in your settings file.
| File | Scope | Git Tracked |
|---|---|---|
| .claude/settings.local.json | Project (personal) | No (gitignored) |
| .claude/settings.json | Project (shared) | Yes |
| ~/.claude/settings.json | User (global) | N/A |
Add the language key to any settings file:
{
"language": "korean"
}
| Language | Setting Value |
|---|---|
| English | "english" (default) |
| Korean | "korean" |
| Japanese | "japanese" |
| Chinese | "chinese" |
| Spanish | "spanish" |
| French | "french" |
| German | "german" |
| Italian | "italian" |
Note: Trigger keywords work in any language. The language setting only affects Claude's response language.
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
admin team members can merge to main
bkit is not a productivity hack. It is an attempt to bring engineering discipline to AI-native development.
The software industry has spent decades refining how humans write code—version control, code review, CI/CD, testing pyramids. But when AI enters the development loop, most of that discipline evaporates. Developers prompt, accept, and ship. The feedback loop disappears. Documentation becomes an afterthought. Quality becomes a matter of luck.
bkit exists because we believe AI-assisted development deserves the same rigor as traditional engineering.
Process over output. A single feature built through proper planning, design, implementation, and verification is worth more than ten features hacked together. The PDCA cycle is not overhead—it is the product.
Verification over trust. AI generates plausible code. Plausible is not correct. Every implementation goes through gap analysis against its design document. If the match rate falls below 90%, the system iterates automatically. We do not ship hope.
Context over prompts. A well-structured prompt helps once. A well-structured context system helps every time. bkit's 128 lib modules (15 subdirs, Clean Architecture 4-Layer), 39 skills, and 36 agents exist to ensure the AI receives the right context at the right moment—not through clever prompting, but through systematic engineering.
Constraints over features. We intentionally limit what bkit does. Three project levels, not infinite configuration. A fixed 9-stage pipeline, not a customizable workflow builder. Opinionated defaults, not a framework for frameworks. Constraints eliminate decision fatigue and make the system learnable.
When you use bkit, you will write a plan document before writing code. You will generate a design specification before implementation. You will run gap analysis after every feature. You will produce a completion report that captures what was built, what was verified, and what was learned.
This is slower than prompting and shipping. It is also how software that lasts gets built.
"We do not offer a hundred features. We engineer each one through proper design and verification. That is the difference between a tool and a discipline."
Copyright 2024-2026 POPUP STUDIO PTE. LTD.
Licensed under the Apache License, Version 2.0. See LICENSE for details.
You must include the NOTICE file in any redistribution.
Made with AI by POPUP STUDIO