Claude Code plugin for Elixir/Phoenix/LiveView — 20 specialist agents, Iron Laws enforcement, and Tidewave MCP integration. Plan features with parallel research agents, execute with automatic verification, review with 4-agent parallel audits, and capture learnings as reusable knowledge.
npx skills add https://github.com/oliver-kriska/claude-elixir-phoenix --skill skill-monitor
Install this skill via the CLI and start using the SKILL.md workflow in your workspace.
Claude Code is great. But it doesn't know that assign_new silently skips on reconnect, that :float will corrupt your money fields, or that your Oban job isn't idempotent.
This plugin does. It coordinates 20 specialist agents that plan, implement,
review, and verify your Elixir/Phoenix code in parallel -- each with domain
expertise, fresh context, and enforced Iron Laws
that catch the bugs your tests won't.
# You describe the feature. The plugin figures out the rest.
/phx:plan Add real-time comment notifications
# 4 research agents analyze your codebase in parallel.
# A structured plan lands in .claude/plans/comment-notifications/plan.md
# Then:
/phx:work .claude/plans/comment-notifications/plan.md
# Implements task by task. Compiles after each change.
# Stops cold if code violates an Iron Law.
/phx:review
# 4 specialist agents audit in parallel:
# idioms, security, tests, compilation.
# Deduplicates findings. Flags pre-existing issues separately.
No prompt engineering. No "please check for N+1 queries." The plugin auto-loads
the right domain knowledge based on what files you're editing and enforces rules
that prevent the mistakes Elixir developers actually make in production.
┌─────────────────────────────────────────────────────────────────────┐
│ ⚗ Elixir/Phoenix Plugin for Claude Code │
│ │
│ ┌──────────┬──────────┬──────────┬──────────┬──────────┐ │
│ │ 20 │ 40 │ 96 │ 18 │ 22 │ │
│ │ Agents │ Skills │ Refs │ Hooks │Iron Laws │ │
│ └──────────┴──────────┴──────────┴──────────┴──────────┘ │
│ │
│ AGENTS COMMANDS │
│ ───────────────────── ────────────────────────── │
│ Orchestrators (opus) Workflow │
│ workflow-orchestrator /phx:plan /phx:work │
│ planning-orchestrator /phx:review /phx:full │
│ parallel-reviewer /phx:compound /phx:quick │
│ context-supervisor /phx:brief /phx:triage │
│ │
│ Reviewers (sonnet) Investigation & Debug │
│ elixir-reviewer /phx:investigate /phx:trace │
│ testing-reviewer /ecto:n1-check /phx:perf │
│ security-analyzer /ecto:constraint-debug │
│ iron-law-judge /lv:assigns │
│ │
│ Architecture (sonnet) Analysis & Review │
│ liveview-architect /phx:audit /phx:verify │
│ ecto-schema-designer /phx:techdebt /phx:boundaries │
│ phoenix-patterns-analyst /phx:pr-review /phx:challenge │
│ otp-advisor /phx:research /phx:document │
│ │
│ Investigation (sonnet/haiku) Knowledge (auto-loaded) │
│ deep-bug-investigator liveview-patterns ecto-patterns │
│ call-tracer elixir-idioms security │
│ xref-analyzer phoenix-contexts oban │
│ verification-runner testing deploy tidewave │
│ │
│ Domain (sonnet) Hooks │
│ oban-specialist auto-format · auto-compile │
│ deployment-validator iron-law-verify · security-scan │
│ hex-library-researcher debug-stmt-detect · error-critic │
│ web-researcher progress-tracking · block-danger │
│ │
│ ─────────────────────────────────────────────────────────── │
│ 22 Iron Laws · Tidewave MCP · plan→work→verify→review→compound │
│ github.com/oliver-kriska/claude-elixir-phoenix │
└─────────────────────────────────────────────────────────────────────┘
v2.8.0 -- 41 skills, 20 agents, 8-dimension quality eval, autoresearch self-improvement loop. Feedback welcome via issues.
# In Claude Code, add the marketplace
/plugin marketplace add oliver-kriska/claude-elixir-phoenix
# Install the plugin
/plugin install elixir-phoenix
git clone https://github.com/oliver-kriska/claude-elixir-phoenix.git
# Option A: Add as local marketplace
/plugin marketplace add ./claude-elixir-phoenix
/plugin install elixir-phoenix
# Option B: Test plugin directly
claude --plugin-dir ./claude-elixir-phoenix/plugins/elixir-phoenix
New to the plugin? Run the interactive tutorial:
/phx:intro
It walks through the workflow, commands, and features in 6 short sections (~5 min).
Skip to any section with /phx:intro --section N.
# Just describe what you need — the plugin detects complexity and suggests the right approach
> Fix the N+1 query in the user dashboard
# Plan a feature with parallel research agents, then execute
/phx:plan Add email notifications for new comments
/phx:work .claude/plans/email-notifications/plan.md
# Full autonomous mode — plan, implement, review, capture learnings
/phx:full Add user profile avatars with S3 upload
# 4-agent parallel code review (idioms, security, tests, compilation)
/phx:review
# Quick implementation — skip ceremony, just code
/phx:quick Add pagination to the users list
# Structured bug investigation with 4 parallel tracks
/phx:investigate Timeout errors in the checkout LiveView
# Project health audit across 5 categories
/phx:audit
The plugin auto-loads domain knowledge based on what files you're editing
(LiveView patterns for *_live.ex, Ecto patterns for schemas, security rules for auth code)
and enforces Iron Laws that prevent common Elixir/Phoenix mistakes.
The plugin implements a Brainstorm, Plan, Work, Verify, Review, Compound lifecycle. Each phase produces artifacts in a namespaced directory:
/phx:brainstorm → /phx:plan       → /phx:work       → /phx:verify     → /phx:review     → /phx:compound
     │                 │                 │                 │                 │                 │
     ↓                 ↓                 ↓                 ↓                 ↓                 ↓
interview.md      plans/{slug}/     (in namespace)    (in namespace)    (in namespace)    solutions/
- .claude/plans/{slug}/ -- plan, research, reviews, progress, scratchpad.
- [x] = done, [ ] = pending. /phx:work finds the first unchecked task and continues.

Every plan gets its own directory with all related artifacts:
.claude/
├── plans/{slug}/ # Everything for ONE plan
│ ├── plan.md # The plan itself (checkboxes = state)
│ ├── research/ # Research agent output
│ ├── reviews/ # Review findings (individual tracks)
│ ├── summaries/ # Compressed multi-agent output
│ ├── progress.md # Session progress log
│ └── scratchpad.md # Auto-written decisions, dead-ends, handoffs
├── reviews/ # Ad-hoc reviews (no plan context)
└── solutions/ # Compound knowledge (reusable across plans)
No more scattered files across .claude/planning/, .claude/progress/, .claude/reviews/. One plan, one directory, everything together.
The plugin uses 20 agents organized into 3 tiers:
┌──────────────────────────────┐
│ Orchestrators (opus model) │
│ Coordinate phases, spawn │
│ specialists, manage flow │
└──────────┬───────────────────┘
│
┌──────────────────────┼──────────────────────┐
│ │ │
▼ ▼ ▼
┌───────────────┐ ┌───────────────────┐ ┌────────────────────┐
│ workflow- │ │ planning- │ │ parallel- │
│ orchestrator │ │ orchestrator │ │ reviewer │
│ (full cycle) │ │ (research phase) │ │ (review phase) │
└───────────────┘ └───────────────────┘ └────────────────────┘
│ │
┌──────────┼──────────┐ ┌──────┼──────┐
▼ ▼ ▼ ▼ ▼ ▼
┌──────────┐ ┌────────┐ ┌──────┐ ... 4 specialist
│ liveview │ │ ecto │ │ web │ review agents
│ architect│ │ schema │ │ rsch │
└──────────┘ └────────┘ └──────┘
│
┌──────────┴──────────┐
▼ ▼
┌────────────┐ ┌──────────────┐
│ context- │ │ Orchestrator │
│ supervisor │ ───► │ reads ONLY │
│ (haiku) │ │ the summary │
└────────────┘ └──────────────┘
Orchestrators (opus) -- Primary workflow coordinators, security-critical analysis.
Specialists (sonnet) -- Domain experts, secondary orchestrators, judgment-heavy tasks. Sonnet 4.6 achieves near-opus quality at sonnet pricing.
Lightweight (haiku) -- Mechanical tasks: verification, compression, dependency analysis.
When an orchestrator spawns 4-8 research agents, their combined output can exceed 50k tokens -- flooding the parent's context window. The context-supervisor solves this using an OTP-inspired pattern:
┌────────────────────────────────────────────────────┐
│ Orchestrator (thin coordinator, ~10k context) │
│ Only reads: summaries/consolidated.md │
└──────────────────┬─────────────────────────────────┘
│ spawns AFTER workers finish
┌──────────────────▼─────────────────────────────────┐
│ context-supervisor (haiku, fresh 200k context) │
│ Reads: all worker output files │
│ Applies: compression strategy based on size │
│ Validates: every input file represented │
│ Writes: summaries/consolidated.md │
└──────────────────┬─────────────────────────────────┘
│ reads from
┌─────────────┼─────────────┐
▼ ▼ ▼
worker 1 worker 2 worker N
research/ research/ research/
patterns.md security.md liveview.md
How compression works:
| Total Output | Strategy | Compression | What's Kept |
|---|---|---|---|
| Under 8k tokens | Index | ~100% | Full content with file list |
| 8k - 30k tokens | Compress | ~40% | Key findings, decisions, risks |
| Over 30k tokens | Aggressive | ~20% | Only critical items |
The supervisor also deduplicates -- if two agents flag the same issue
(e.g., both the security analyzer and code reviewer find a missing
authorization check), it merges them into one finding with both sources cited.
Used by: planning-orchestrator (research synthesis), parallel-reviewer (review deduplication), audit skill (cross-category analysis).
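
The size thresholds and the merge-on-duplicate behavior described above can be sketched in a few lines of Elixir. This is only an illustration under stated assumptions, not the plugin's implementation: the module name `CompressionSketch`, the function names, the `issue`/`source` keys, and the reading of the "Compression" column as a target ratio are all hypothetical.

```elixir
defmodule CompressionSketch do
  @moduledoc "Hypothetical sketch of the strategy table; not the plugin's code."

  # Thresholds taken from the table, in tokens.
  @index_limit 8_000
  @compress_limit 30_000

  # Returns {strategy, target_ratio} for the combined worker output size.
  # The ratio mirrors the table's "Compression" column (an assumption about its meaning).
  def strategy(total_tokens) when is_integer(total_tokens) and total_tokens >= 0 do
    cond do
      total_tokens < @index_limit -> {:index, 1.0}
      total_tokens <= @compress_limit -> {:compress, 0.4}
      true -> {:aggressive, 0.2}
    end
  end

  # Merges findings that describe the same issue and keeps every source that reported it.
  # Matching on an exact `issue` string is a simplification for the sketch.
  def dedupe(findings) do
    findings
    |> Enum.group_by(& &1.issue)
    |> Enum.map(fn {issue, group} ->
      sources = group |> Enum.map(& &1.source) |> Enum.uniq() |> Enum.sort()
      %{issue: issue, sources: sources}
    end)
    |> Enum.sort_by(& &1.issue)
  end
end
```

For example, `CompressionSketch.strategy(12_000)` falls in the 8k to 30k band and selects the `:compress` strategy, while two findings with the same `issue` from different agents collapse into one entry listing both sources.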
When you run /phx:plan Add real-time notifications:
1. planning-orchestrator analyzes your request
│
2. Spawns specialists IN PARALLEL based on feature needs:
├── phoenix-patterns-analyst (always -- scans your codebase)
├── liveview-architect (if UI/real-time feature)
├── ecto-schema-designer (if database changes needed)
├── security-analyzer (if auth/user data involved)
├── oban-specialist (if background jobs needed)
├── web-researcher (if unfamiliar technology)
└── ... up to 8 agents
│
3. Each agent writes to plans/{slug}/research/{topic}.md
│
4. context-supervisor compresses all research into one summary
│
5. Orchestrator reads the summary + synthesizes the plan
│
6. Output: plans/{slug}/plan.md with [P1-T1] checkboxes
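
The conditional fan-out in step 2 and the output path in step 3 can be pictured with a small Elixir sketch. Everything here is an assumption made for illustration: the module `PlanningSketch`, the need tags such as `:ui` and `:database`, and the helper names are hypothetical, and the real orchestrator is a Claude agent rather than Elixir code.

```elixir
defmodule PlanningSketch do
  @moduledoc "Hypothetical sketch of specialist selection; not the plugin's code."

  # Upper bound on spawned agents, as stated in the workflow above.
  @max_agents 8

  # {need, agent} pairs in the order listed above. The need tags are invented for this sketch.
  @rules [
    {:always, "phoenix-patterns-analyst"},
    {:ui, "liveview-architect"},
    {:database, "ecto-schema-designer"},
    {:auth, "security-analyzer"},
    {:background_jobs, "oban-specialist"},
    {:unfamiliar_tech, "web-researcher"}
  ]

  # Picks the agents whose need is present, always including the codebase analyst.
  def select_agents(needs) do
    needs = MapSet.new(needs)

    @rules
    |> Enum.filter(fn {need, _agent} -> need == :always or MapSet.member?(needs, need) end)
    |> Enum.map(fn {_need, agent} -> agent end)
    |> Enum.take(@max_agents)
  end

  # Builds the per-topic research file path used in step 3.
  def research_path(slug, topic) do
    Path.join(["plans", slug, "research", topic <> ".md"])
  end
end
```

With these assumptions, a feature tagged `[:ui, :background_jobs]` would select the pattern analyst, the LiveView architect, and the Oban specialist, each writing to its own file under `plans/{slug}/research/`.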
When you run /phx:review:
1. parallel-reviewer collects your git diff
│
2. Delegates to 4 EXISTING specialist agents:
├── elixir-reviewer → Idioms, patterns, error handling
├── security-analyzer → SQL injection, XSS, auth gaps
├── testing-reviewer → Test coverage, factory patterns
└── verification-runner → mix compile, format, credo, test
│
3. Each writes to plans/{slug}/reviews/{track}.md
│
4. context-supervisor deduplicates + consolidates
│
5. Output: plans/{slug}/summaries/review-consolidated.md
Just describe what you need. The plugin auto-detects complexity and suggests the right approach:
> Fix the N+1 query in the dashboard
Claude: This is a simple fix (score: 2). I'll handle it directly.
Or use /phx:quick to skip ceremony:
/phx:quick Add pagination to the users list
Use /phx:plan to create an implementation plan, then /phx:work to execute it:
/phx:plan Add email notifications for new comments
The plugin will:
When starting implementation, the plugin recommends a fresh session for plans with 5+ tasks. The plan file is self-contained, so no context from the planning session is needed:
# In a new Claude Code session:
/phx:work .claude/plans/email-notifications/plan.md
Use deep research planning:
/phx:plan Add OAuth login with Google and GitHub --depth deep
This spawns 4+ parallel research agents, then produces a detailed plan.
For security-sensitive features, the plugin will ask clarifying questions
before proceeding. Or use /phx:full for fully autonomous development.
After implementing, run a review:
/phx:review
Four parallel agents check your code (idioms, tests, security, compilation). If blockers are found, the plugin asks whether to replan or fix directly:
Review found 2 blockers:
1. Missing authorization in handle_event -- security risk
2. N+1 query in list_comments -- performance issue
Options:
- Replan fixes (/phx:plan --existing)
- Fix directly (/phx:work)
- Handle myself
Run a comprehensive audit with 5 parallel specialist agents:
/phx:audit # Full audit
/phx:audit --quick # 2-3 minute pulse check
/phx:audit --focus=security # Deep dive single area
/phx:audit --since HEAD~10 # Audit recent changes only
The audit scores your project across 5 categories (architecture, performance, security, tests, dependencies) and produces an actionable report.
For hands-off development:
/phx:full Add user profile avatars with S3 upload
Runs the complete cycle: plan (with research), work, verify, review. After review fixes, re-verifies before cycling back. Captures learnings on completion.
- /phx:plan creates a self-contained plan file with all implementation details
- Run /phx:work in a fresh session to maximize context space

Plan checkboxes are the state. If a session ends mid-work:
# Just run /phx:work on the same plan -- it finds the first [ ] and continues
/phx:work .claude/plans/my-feature/plan.md
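
Because the checkboxes carry the state, resuming amounts to finding the first unchecked task in the plan file. A minimal Elixir sketch of that idea follows. It assumes tasks are written as markdown list items like `- [ ] [P1-T2] ...`; the module `PlanStateSketch`, its functions, and the exact line layout are hypothetical and may differ from what the plugin does.

```elixir
defmodule PlanStateSketch do
  @moduledoc "Hypothetical sketch of checkbox-based resume; not the plugin's code."

  # Parses markdown checkbox lines into task maps, ignoring every other line.
  def tasks(markdown) do
    markdown
    |> String.split("\n")
    |> Enum.flat_map(fn line ->
      # Assumed layout: "- [ ] text" or "- [x] text" (also "*" bullets and "X").
      case Regex.run(~r/^\s*[-*]\s+\[([ xX])\]\s+(.*)$/, line) do
        [_full, mark, text] -> [%{done?: mark != " ", text: String.trim(text)}]
        nil -> []
      end
    end)
  end

  # Returns the first pending task, or nil when every task is checked.
  def next_task(markdown) do
    markdown
    |> tasks()
    |> Enum.find(fn task -> not task.done? end)
  end
end
```

Calling `PlanStateSketch.next_task(File.read!(".claude/plans/my-feature/plan.md"))` would return the first `[ ]` entry, which is the point where work continues.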
When a feature has 10+ tasks across different domains, the plugin offers to split into multiple plan files:
Created 3 plans (14 total tasks):
1. .claude/plans/auth/plan.md (5 tasks -- login, register, reset)
2. .claude/plans/profiles/plan.md (4 tasks -- avatar, bio, settings)
3. .claude/plans/admin/plan.md (5 tasks -- dashboard, roles)
Recommended order: 1 -> 2 -> 3
Execute each plan separately with /phx:work.
After fixing a bug or receiving a correction:
/phx:learn-from-fix Fixed N+1 query -- always preload associations in context functions
This updates the plugin's common-mistakes.md knowledge base so the same mistake is prevented in future sessions.
The plugin enforces critical rules and stops with an explanation if code would violate them:
LiveView: No database queries in disconnected mount. Use streams for lists >100 items. Check connected?/1 before PubSub subscribe.
Ecto: Never use :float for money. Always pin values with ^ in queries. Separate queries for has_many, JOIN for belongs_to.
Oban: Jobs must be idempotent. Args use string keys. Never store structs in args.
Security: No String.to_atom with user input. Authorize in every LiveView handle_event. Never use raw/1 with untrusted content.
OTP: No process without a runtime reason. Supervise all long-lived processes.
Elixir: Declare @external_resource for compile-time files. Wrap third-party library APIs behind project-owned modules. Never use assign_new for values refreshed every mount.
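
Two of these laws are easy to demonstrate in plain Elixir. The sketch below shows why binary floats are a poor fit for money and one possible way to turn user input into an atom without `String.to_atom`. The module `IronLawSketch`, its function names, and the allowlist contents are hypothetical, and the allowlist is just one common approach rather than the plugin's prescribed fix.

```elixir
defmodule IronLawSketch do
  @moduledoc "Hypothetical examples for two Iron Laws; not the plugin's code."

  # Money: integer cents add up exactly, unlike binary floats.
  # In an Ecto schema this would typically be an :integer or :decimal field.
  def total_cents(prices_in_cents), do: Enum.sum(prices_in_cents)

  # Shows the float rounding problem: 0.1 + 0.2 is not equal to 0.3.
  def float_sum_exact?, do: 0.1 + 0.2 == 0.3

  # Atoms from user input: look the string up in a fixed allowlist
  # instead of creating new atoms at runtime.
  @sort_fields %{"name" => :name, "inserted_at" => :inserted_at}

  def sort_field(param) when is_binary(param), do: Map.fetch(@sort_fields, param)
  def sort_field(_other), do: :error
end
```

Here `IronLawSketch.sort_field("name")` yields `{:ok, :name}`, and any string outside the allowlist yields `:error`, so untrusted input never creates a new atom.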
| Command | Description |
|---|---|
| `/phx:full <feature>` | Full autonomous cycle (plan, work, verify, review, compound) |
| `/phx:brainstorm <topic>` | Adaptive requirements gathering before planning |
| `/phx:plan <input>` | Create implementation plan with specialist agents |
| `/phx:plan --existing` | Enhance existing plan with deeper research |
| `/phx:work <plan-file>` | Execute plan tasks with verification |
| `/phx:review [focus]` | Multi-agent code review (4 parallel agents) |
| `/phx:compound` | Capture solved problem as reusable knowledge |
| `/phx:triage` | Interactive triage of review findings |
| `/phx:document` | Generate @moduledoc, @doc, README, ADRs |
| `/phx:learn-from-fix <lesson>` | Capture lessons learned |
| `/phx:brief <plan>` | Interactive plan walkthrough |
| `/phx:perf` | Performance analysis with specialist agents |
| `/phx:pr-review` | Address PR review comments |
| Command | Description |
|---|---|
| `/phx:intro` | Interactive plugin tutorial (6 sections, ~5 min) |
| `/phx:init` | Initialize plugin in a project (auto-activation rules) |
| `/phx:help` | Interactive command advisor -- recommends the right command |
| `/phx:quick <task>` | Fast implementation, skip ceremony |
| `/phx:investigate <bug>` | Systematic bug debugging (4 parallel investigation tracks) |
| `/phx:research <topic>` | Research Elixir topics on the web |
| `/phx:verify` | Run full verification (compile, format, credo, test) |
| `/phx:permissions` | Scan sessions, recommend safe Bash permissions |
| `/phx:trace <function>` | Build call trees to trace function flow |
| `/phx:boundaries` | Analyze Phoenix context boundaries with mix xref |
| `/phx:examples` | Practical examples and pattern walkthroughs |
| `/ecto:constraint-debug` | Debug Ecto constraint violations |
| Command | Description |
|---|---|
| `/ecto:n1-check` | Detect N+1 query patterns |
| `/lv:assigns <file>` | Audit LiveView assigns for memory issues |
| `/phx:techdebt` | Find technical debt and refactoring opportunities |
| `/phx:audit` | Full project health audit with 5 parallel agents |
| `/phx:challenge` | Rigorous review mode ("grill me") |
| Agent | Model | Memory | Role |
|---|---|---|---|
| workflow-orchestrator | opus | project | Full cycle coordination (plan, work, review) |
| planning-orchestrator | opus | project | Parallel research agent coordination |
| parallel-reviewer | opus | -- | 4-agent parallel code review |
| deep-bug-investigator | sonnet | -- | 4-track parallel bug investigation |
| call-tracer | sonnet | -- | Parallel call tree tracing |
| security-analyzer | opus | -- | OWASP vulnerability scanning |
| context-supervisor | haiku | -- | Multi-agent output compression |
| verification-runner | haiku | -- | mix compile, format, credo, test |
| iron-law-judge | sonnet | -- | Pattern-based Iron Law detection |
| xref-analyzer | haiku | -- | Module dependency analysis |
| hex-library-researcher | sonnet | -- | Hex.pm library evaluation |
| liveview-architect | sonnet | -- | Component structure, streams, async patterns |
| ecto-schema-designer | sonnet | -- | Migrations, data models, query patterns |
| phoenix-patterns-analyst | sonnet | project | Codebase pattern discovery |
| elixir-reviewer | sonnet | -- | Code idioms, patterns, conventions |
| testing-reviewer | sonnet | -- | ExUnit, Mox, LiveView test patterns |
| oban-specialist | sonnet | -- | Worker idempotency, error handling |
| otp-advisor | sonnet | -- | GenServer, Supervisor, process design |
| deployment-validator | sonnet | -- | Docker, Kubernetes, Fly.io config |
| web-researcher | sonnet | -- | ElixirForum, HexDocs, GitHub research |
Agents with project memory build up knowledge across sessions
in .claude/agent-memory/<agent-name>/. Orchestrators remember
architectural decisions; pattern analysts skip redundant discovery.
These load automatically based on file context -- no commands needed:
| Skill | Triggers On |
|---|---|
| `elixir-idioms` | OTP/BEAM code, GenServer, Supervisor, Task |
| `phoenix-contexts` | Context modules, router, plugs, controllers |
| `liveview-patterns` | *_live.ex, mount, handle_event, streams |
| `ecto-patterns` | Schemas, migrations, Repo calls, changesets |
| `testing` | *_test.exs, factories, test support |
| `oban` | Oban workers, perform/1, queue config |
| `security` | Auth, sessions, CSRF/CSP, input validation |
| `deploy` | Dockerfile, fly.toml, runtime.exs, releases |
| `tidewave-integration` | Runtime debugging, live process inspection |
| `intent-detection` | First message routing to /phx: commands |
| `compound-docs` | Solution documentation lookups |
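
How skills are matched internally is not described here, so the following is only a hypothetical sketch of file-based routing, using three of the file-name triggers from the table. The module `SkillRouterSketch` and its function are invented for illustration. Triggers based on file content, such as `mount` or `perform/1`, are left out.

```elixir
defmodule SkillRouterSketch do
  @moduledoc "Hypothetical sketch of file-name triggers; not the plugin's code."

  # File names that the table lists as triggers for the deploy skill.
  @deploy_files ["Dockerfile", "fly.toml", "runtime.exs"]

  # Maps a file path to the skills its name alone would trigger.
  def skills_for(path) do
    base = Path.basename(path)

    cond do
      String.ends_with?(base, "_live.ex") -> ["liveview-patterns"]
      String.ends_with?(base, "_test.exs") -> ["testing"]
      base in @deploy_files -> ["deploy"]
      true -> []
    end
  end
end
```

Under these assumptions, editing `lib/my_app_web/live/comment_live.ex` would load `liveview-patterns`, and editing `config/runtime.exs` would load `deploy`.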
When your Phoenix app runs with Tidewave, the plugin automatically detects it and uses runtime tools:
# Add to mix.exs
{:tidewave, "~> 0.1", only: :dev}
# Add to endpoint.ex (in dev block)
plug Tidewave
Available runtime tools: execute Elixir code, run SQL queries, get docs for your exact dependency versions, introspect Ecto schemas, read application logs.
PRs welcome! See CLAUDE.md for full conventions.
Every PR must pass the CI quality gate (lint + test + eval). Run locally before pushing:
make help # Show all available commands
make eval # Quick: lint + score changed skills/agents only
make eval-all # Full structural: all 41 skills + all 20 agents
make eval-fix # Auto-fix lint + show failures + suggest autoresearch
make test # 52 pytest tests for eval framework
make ci # Full CI: lint + test + eval (same as GitHub Actions)
The eval framework scores skills across 8 dimensions and agents across 5 dimensions.
Skills must score >= 0.95 to pass. Run make eval-all for details.
- references/ for details. Must include Iron Laws, "Use when..." in description.
- disallowedTools: Write, Edit, NotebookEdit for reviewers, permissionMode: bypassPermissions always.
- npm run lint
- npm run eval before merging
- npm run eval:fix to auto-detect and fix quality issues

The plugin includes an internal eval framework (lab/eval/) that scores all skills and agents. When quality drops, the autoresearch loop can fix it:
# Score everything, show failures, get auto-fix command
npm run eval:fix
# Or run the autoresearch loop directly (targets weakest skill, fixes one issue per iteration)
claude -p 'Run autoresearch...' --allowedTools 'Edit,Read,Write,Bash,Glob,Grep'
The eval framework uses 24 deterministic Python matchers + haiku-based behavioral trigger testing. See lab/eval/ for details.
The plugin includes session analysis tools that help identify improvement opportunities.
If you use this plugin (or work on Elixir/Phoenix projects with Claude Code),
you can analyze your own sessions to find patterns that the plugin should handle better.
Setup:
- git clone https://github.com/oliver-kriska/claude-elixir-phoenix.git
- claude mcp add ccrider -- npx @neilberkman/ccrider

Available tools (dev-only, not shipped with the plugin):
# Tier 1: Discover sessions and compute deterministic metrics
/session-scan
/session-scan --project myapp
# Tier 2: Qualitative analysis of high-signal sessions
/session-deep-dive
/session-deep-dive --date 2026-03-01
# Trends: Windowed aggregates (7d/30d/all) from metrics ledger
/session-trends
/session-trends --compare baseline
# Skill effectiveness monitoring (requires session-scan data)
/skill-monitor # Dashboard: all skills
/skill-monitor --improve # Generate improvement recommendations
Each analysis report includes a Plugin Improvement Opportunities section that identifies:
Share these findings in issues or PRs to help make the plugin better for everyone.
This plugin was built with insights from these articles, repositories, and tools:
MIT