AgentOps

Local operating layer for coding agents across Claude, Codex, Cursor, and OpenCode.

AgentOps gives agents a shared ao control plane, lifecycle hooks, validation gates, and a repo-owned .agents/ corpus so work survives chat windows and vendor boundaries.

Install · Quick Start · Cross-Vendor · Why DevOps? · Skills · CLI · Doctrine · Docs

Install

Pick the runtime you use.

Claude Code

claude plugin marketplace add boshu2/agentops
claude plugin install agentops@agentops-marketplace

Codex CLI on macOS, Linux, or WSL

curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-codex.sh | bash

Codex CLI on Windows PowerShell

irm https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-codex.ps1 | iex

OpenCode

curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-opencode.sh | bash

Other skills-compatible agents

npx skills@latest add boshu2/agentops --cursor -g

Restart your agent after install. Then type /quickstart in your agent chat.

The ao CLI is optional, but recommended. It unlocks repo-native bookkeeping, retrieval, health checks, and terminal workflows.

macOS

brew tap boshu2/agentops https://github.com/boshu2/homebrew-agentops
brew install agentops
ao version

Windows PowerShell

irm https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-ao.ps1 | iex
ao version

You can also install the CLI from release binaries or build from source.

Concern	Answer
What it touches	Installs skills globally and registers runtime hooks when requested; agent work writes local bookkeeping to `.agents/`
Source code changes	None during install
Network behavior	Install and update paths fetch from GitHub; repo artifacts stay local unless you choose external tools or remote model runtimes
Telemetry	None required
Permission surface	Skills can run shell commands and read or write repo files during agent work, so install where you want agents to operate
Reversible	Remove the installed skill directories, delete `.agents/`, and remove hook entries from your runtime settings

Troubleshooting: docs/troubleshooting.md · Configuration: docs/ENV-VARS.md

What AgentOps Gives You

AgentOps gives your coding agent four things it does not have by default:

Layer	What changes
Bookkeeping	Learnings, findings, handoffs, and reusable context land in local `.agents/` — the private corpus. `ao compile`, `ao maturity --evict`, decay, and lint keep it from rotting. Removes the toil of re-explaining context every session.
Validation	`/pre-mortem`, `/vibe`, and `/council` challenge plans and code before they ship. Removes the toil of catching the same mistake twice.
Primitives	Skills, hooks, and the `ao` CLI give agents reusable building blocks. Removes the toil of re-implementing the same flow per agent.
Flows	`/research`, `/implement`, `/validation`, and `/rpi` compose those primitives end to end. Removes the toil of running the same multi-step process by hand.

Session 1, your agent spends two hours debugging a timeout bug. Session 15, a new agent finds the lesson in seconds because the repo kept it.

flowchart LR
    S[Session work] --> B[Bookkeeping]
    S --> V[Validation]
    B --> C[The corpus]
    V --> C
    C --> N[Next session]
    N --> S

All agent runtime state lives in local .agents/ — auditable and yours, but git-ignored by policy because it can churn and may contain sensitive session context. Plain text you can grep, diff, and review locally. Zero telemetry. Zero cloud dependency.
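Because the corpus is plain text, ordinary shell tools are enough to inspect it. The sketch below is illustrative and not part of AgentOps: the function names `corpus_search` and `corpus_is_ignored` are hypothetical, and it assumes `.agents/` sits at the repo root and that the ignore rule is written literally in `.gitignore`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a local .agents/ corpus with plain
# shell tools. Assumes .agents/ lives at the repo root given as $1.

# List corpus files that mention a term (case-insensitive), sorted by path.
corpus_search() {
  local root="$1" term="$2"
  grep -ril -- "$term" "$root/.agents" 2>/dev/null | sort
}

# Succeed if .gitignore contains a literal rule for .agents/.
# This is a simplification: inside a real repository,
# `git check-ignore -q .agents/` is the authoritative check.
corpus_is_ignored() {
  local root="$1"
  grep -qxE '/?\.agents/?' "$root/.gitignore" 2>/dev/null
}
```

Running `corpus_search . timeout` from a repo root would list every corpus file that mentions "timeout", which is the kind of lookup the Session 15 example relies on.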

Those layers are runtime-neutral. Claude Code, Codex CLI, Cursor, and OpenCode can use the same corpus, validation packets, and operating discipline instead of trapping each workflow inside one vendor's chat.

Proof: the three-gap contract

AgentOps closes three failure modes most agent setups don't even name:

Gap	What fails without it	Closed by
Judgment	Plan looks coherent. Code passes tests. Both miss the edge case. No one challenged either.	`/pre-mortem` · `/vibe` · `/council`
Durable Learning	Auth bug fixed Monday. Same auth bug returns Wednesday. The lesson lived in a chat transcript.	`/retro` · `/forge` · `ao lookup`
Loop Closure	Code diff lands. No lesson extracted. No constraint hardened. Next session re-learns from scratch.	`/post-mortem` · finding compiler · `/evolve`

Each factor in the 12-factor doctrine closes one or more of these. Full contract: docs/context-lifecycle.md.

AgentOps Is the Cross-Vendor Operating Layer

The plugin is one entrypoint, not the product boundary. AgentOps is a local operating layer around coding agents: shared skills tell agents how to work, the ao CLI owns repo-native state and control-plane workflows, hooks keep lifecycle discipline active, and the daemon path moves that work toward always-on local operation.

Surface	What it does	Why it matters
Skills and plugins	Load AgentOps flows into Claude Code, Codex CLI, Cursor, and OpenCode	Agents get the same operating language across vendors
`ao` CLI	Searches, compiles, curates, and assembles repo context; runs RPI, factory, evolve, and daemon commands	The control plane lives outside any one chat window
Hooks	React to runtime events such as session start, user prompts, tool use, and stop	Lifecycle and validation discipline can fire automatically
`.agents/` corpus	Stores learnings, findings, handoffs, council reports, and run evidence locally	The durable asset belongs to the repo and team
Daemon path	Runs queued and scheduled local jobs through `ao daemon` surfaces as that layer matures	AgentOps can move from chat-invoked flows toward always-on operation

/council is the clearest proof of that system boundary. It is not just a review command; it is a way to make multiple agents and runtimes evaluate the same evidence and return one auditable verdict.

Command	What it demonstrates
`/council validate this PR`	The active runtime can spawn independent judges around one shared packet
`/council --mixed validate this PR`	Claude and Codex can receive the same evidence, apply the same perspectives, and hand their verdicts back to AgentOps for consolidation
`/council --preset=security-audit validate the auth system`	Expertise is configured by the operating layer, not left to a single model's default behavior
`/council --evidence --commit-ready validate the release plan`	The result becomes repo-local decision evidence, not just chat history

That is the deeper product shape: agents stay replaceable, vendors can cooperate, and the corpus plus control plane remain yours.

Why DevOps?

DevOps changed how we ship software by closing three loops: flow (work moves forward), feedback (work that breaks comes back fast), and continual learning (the system gets smarter with every cycle). These are the Three Ways. They are not metaphors; they are the architecture of every team that ships reliably under pressure.

Coding agents need the same architecture. They have prompts and weights — neither of which is an operations layer. Run an agent against a real codebase and you'll feel the gap immediately: no flow control between sessions, no feedback that survives compaction, no learning that compounds. Each session starts where every prior session started: zero.

AgentOps applies the Three Ways to coding agents:

DevOps Three Ways	AgentOps surface	What it means in practice
Flow	Primitives + Flows (`/research` → `/plan` → `/implement` → `/validation` → `/rpi`)	Work moves through scoped, auditable phases. No phase compresses into another.
Feedback	Validation gates that block, not advise (`/pre-mortem`, `/vibe`, `/council`)	Multi-model consensus catches errors before they propagate. Verdicts are recorded, not assumed.
Continual Learning	Bookkeeping + the knowledge flywheel (`/retro` → `/forge` → `ao inject` → next session)	Every session emits learnings. Learnings get scored, promoted, and decayed. Next session starts loaded.
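To make "scored, promoted, and decayed" concrete, here is one way a decay rule could look. This is a hypothetical sketch, not the AgentOps scoring formula: the function name `decayed_score`, the half-life model, and its parameters are all assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical decay rule, for illustration only: a learning's score halves
# every <half_life_days>. The real AgentOps scoring formula is not described
# in this README.

# Usage: decayed_score <score> <age_days> <half_life_days>
decayed_score() {
  LC_ALL=C awk -v score="$1" -v age="$2" -v half="$3" \
    'BEGIN { printf "%.2f\n", score * exp(-log(2) * age / half) }'
}
```

With a 30-day half-life, `decayed_score 8 30 30` prints `4.00`. A rule of this shape lets a learning that keeps getting reused stay near the top of retrieval while a stale one fades.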

The theoretical foundation lives in docs/the-science.md (Meadows' leverage points + DevOps Three Ways) and docs/brownian-ratchet.md (chaos + filter + one-way gate = net forward progress).

The lineage is direct: DevOps is what made software ship. AgentOps is what makes coding agents compound. Same shape, new substrate.

AgentOps, like every harness of its kind, will be absorbed into the model layer over time. Frontier vendors will ship memory primitives, learning loops, and even validation gates natively. What stays yours is the corpus. AgentOps is the bridge tool that helps you build the moat now, before the harness layer commoditizes. See PRODUCT.md for the full thesis.

The lineage

Software shipped because we codified the work. Iteration. Test discipline. Pipelines. Toil reduction. Flow and waste. Each generation gave teams an artifact: the wiki (Ward Cunningham, 1995, in the same circle as XP), the runbook, the postmortem, the toil budget.

AgentOps gives your agents the same kind of artifact: the corpus — a typed, versioned, agent-readable wiki maintained alongside the code. Same lineage. New substrate.

The pattern is broader than code; the product is focused on coding agents.

Quick Start

Inside a repo, use the path that matches what you are trying to do.

Path	Run	Done when
First repo setup	`ao quick-start`, then `/quickstart`	AgentOps reports repo readiness and a next action
First validated change	`/rpi "a small goal"`	Discovery, implementation, validation, and learning closeout leave evidence in `.agents/`
Review something now	`/council validate this PR` or `/vibe recent`	You get a consolidated verdict and an evidence record in `.agents/` before shipping

New project? Use the guided CLI seed first:

ao quick-start     # Canonical
ao quickstart      # Stable alias

That command applies the repeatable core seed: .agents/, GOALS.md,
AgentOps instructions, starter knowledge, and readiness guidance. Use
/bootstrap after that when you want the product/operations layer:
PRODUCT.md, README.md, PROGRAM.md/AUTODEV.md, and optional hooks.

Already installed? Ask your agent for the next action:

/quickstart

If you installed the CLI, check your local setup:

ao doctor
ao demo

Full catalog: docs/SKILLS.md · Unsure what to run? Skill Router

See It Work

One command: validate a PR across vendors

> /council --mixed validate this PR

[council] evidence packet sealed -> 6 judges across 2 runtimes
[claude/judge-1] WARN - rate limiting missing on /login endpoint
[claude/judge-2] PASS - Redis integration follows middleware pattern
[codex/judge-1]  WARN - token bucket refill lacks jitter under burst
[codex/judge-2]  PASS - backoff bounds match retry policy
Consensus: WARN - fix /login rate limit and add refill jitter before shipping
Recorded: .agents/council/<run-id>/verdict.md
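
The consolidation step in that transcript can be pictured as a "strictest verdict wins" rule. The sketch below is an assumption made for illustration: the README does not specify how `/council` consolidates verdicts, the function name `consolidate_verdicts` is hypothetical, and the `FAIL` level does not appear in the transcript above.

```shell
#!/usr/bin/env bash
# Hypothetical consolidation rule: the strictest judge verdict wins
# (FAIL > WARN > PASS). Illustration only; the real /council logic
# is not specified in this README.

# Reads judge lines such as "[claude/judge-1] WARN - ..." on stdin
# and prints one consolidated verdict.
consolidate_verdicts() {
  awk '
    # The verdict is the second whitespace-separated field.
    $2 == "FAIL" { fail++ }
    $2 == "WARN" { warn++ }
    $2 == "PASS" { pass++ }
    END {
      if (fail)      print "FAIL"
      else if (warn) print "WARN"
      else if (pass) print "PASS"
      else           print "NONE"
    }
  '
}
```

Fed the four judge lines from the example, this rule yields `WARN`, matching the consensus line shown.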

Full loop: research through post-mortem

> /rpi "add retry backoff to rate limiter"

[research]    Found 3 prior learnings on rate limiting
[plan]        2 issues, 1 wave
[pre-mortem]  Council validates the plan
[crank]       Executes the scoped work
[vibe]        Council validates the code
[post-mortem] Captures new learnings in .agents/
[flywheel]    Next session starts with better context

The point is not a bigger prompt. The point is a repo that remembers what worked.

Skills

Every skill works alone. Flows compose them when you want more structure.

Skill	Use it when
`/quickstart`	You want the fastest setup check and next action
`/council`	You want independent judges — optionally across Claude and Codex — to evaluate one evidence packet and return a consolidated verdict
`/research`	You need codebase context and prior learnings before changing code
`/pre-mortem`	You want to pressure-test a plan before implementation
`/implement`	You want one scoped task built and validated
`/rpi`	You want discovery, build, validation, and bookkeeping in one flow
`/vibe`	You want a code-quality and risk review before shipping
`/evolve`	You want a goal-driven improvement loop with regression gates
`/dream`	You want overnight knowledge compounding that never mutates source code

Full catalog: validation, flows, bookkeeping, session, product, and utility skills

Validation: /council · /vibe · /pre-mortem · /post-mortem

Flows: /research · /plan · /implement · /crank · /swarm · /rpi · /evolve

Bookkeeping: /retro · /forge · /flywheel · /compile

Session: /handoff · /recover · /status · /trace · /provenance · /dream

Product: /product · /goals · /release · /readme · /doc

Utility: /brainstorm · /bug-hunt · /complexity · /scaffold · /push

Full reference: docs/SKILLS.md

Cross-runtime orchestration - mix Claude, Codex, Cursor, and OpenCode

Multi-runtime, one workflow. The same validation, research, delivery, and bookkeeping flows run whether the active worker is Claude Code, Codex, Cursor, or OpenCode.

One runtime leads a session. Another reviews the result. A third handles focused implementation. Adapters are runtime-specific. The contract is constant: independent context, auditable files, validation before promotion.

The `ao` CLI

The ao CLI is the repo-native control plane behind the skills. It handles retrieval, health checks, compounding, goals, and terminal workflows.

ao quick-start                            # Set up AgentOps in a repo
ao quickstart                             # Alias for quick-start
ao doctor                                 # Check local health
ao demo                                   # See the value path in 5 minutes
ao search "query"                         # Search session history and local knowledge
ao lookup --query "topic"                 # Retrieve curated learnings and findings
ao context assemble                       # Build a task briefing
ao rpi phased "fix auth startup"          # Run the phased lifecycle from the terminal
ao evolve --max-cycles 1                  # Run one autonomous improvement cycle
ao overnight setup                        # Prepare private Dream runs
ao metrics health                         # Show flywheel health

Full reference: CLI Commands

Advanced: Day Loop And Night Loop

Use /evolve when you want code improvement. It reads GOALS.md, fixes the worst fitness gap, runs regression gates, and records the cycle.

> /evolve

[evolve] GOALS.md loaded
[cycle-1] Worst gap selected
[rpi]     Implements the fix
[gate]    Tests and quality checks pass
[learn]   Post-mortem feeds the flywheel

Use /dream when you want knowledge compounding. It runs offline-style bookkeeping work over .agents/, reports what changed, and never mutates source code, invokes /rpi, or performs git operations.

> /dream start

[overnight] INGEST  harvest new artifacts
[overnight] REDUCE  dedup, defrag, close loops
[overnight] MEASURE corpus quality
[halted]    plateau reached

Morning report: .agents/overnight/<run-id>/summary.md

Run Dream overnight, then run Evolve in the morning against a fresher corpus. The model may be the same; the environment is smarter.
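
The promise that Dream never mutates source code can be spot-checked from the outside. The following is a minimal sketch, not an AgentOps feature: the function name `source_fingerprint` is hypothetical, and it assumes file names without embedded newlines.

```shell
#!/usr/bin/env bash
# Hypothetical spot-check: fingerprint every file outside .agents/ and .git/
# so two snapshots can be compared before and after an overnight run.
# Assumes file names do not contain newlines.

# Usage: source_fingerprint <repo_root>
source_fingerprint() {
  local root="$1"
  ( cd "$root" &&
    find . \( -path ./.agents -o -path ./.git \) -prune -o -type f -print |
      LC_ALL=C sort |
      while IFS= read -r f; do cksum "$f"; done |
      cksum )
}
```

Capture the fingerprint before `/dream start` and compare it in the morning; a mismatch means something outside `.agents/` changed. Inside a Git repository, `git status --porcelain` gives a similar signal for tracked and untracked files.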

Competitive Positioning

Most tools optimize work within a single session. AgentOps compounds across sessions. The gap it fills is the bookkeeping and validation layer.

Tool	What it does well	What AgentOps adds
GSD	Fresh-context phased execution, recovery loops, runtime breadth	Cross-session bookkeeping, pre-build validation, the knowledge flywheel
Compound Engineer	Ideation, configurable reviewers, cross-runtime conversion	Automatic capture/scoring/injection, council validation, repo-native `ao` workflows
Spec Kit / Kiro	Spec-driven development and executable planning artifacts	Learning beyond specs: failures, decisions, retros, prevention rules
Superpowers	TDD discipline and autonomous work patterns	Memory, pre-mortems, validation across repeated sessions
Ruflo / Claude-Flow	High-scale swarm orchestration and MCP-heavy coordination	Local, auditable compounding around whatever executes the work

Detailed comparisons · Competitive radar

Docs

Topic	Where
Published site	boshu2.github.io/agentops
Start navigating	Docs index
New contributor orientation	Newcomer guide
Working with `.agents/`	Operator guide
Full skill catalog	Skills
CLI reference	CLI commands
Architecture	Architecture
Behavioral discipline	Behavior guide
FAQ	FAQ

Building docs locally. The site is built with MkDocs Material. Python 3.10+ is required; the dev toolchain is pinned in requirements-docs.txt.

scripts/docs-build.sh --serve    # live-reload dev server at http://127.0.0.1:8000
scripts/docs-build.sh --check    # strict build (mirrors what CI runs)
scripts/docs-build.sh            # build site to _site/

The first run creates .venv-docs/ and installs the toolchain via uv (preferred) or pip. The deploy workflow at .github/workflows/docs.yml runs the same mkdocs build --strict on every push to main and publishes to GitHub Pages.

The 12-Factor Doctrine

AgentOps is shaped by a set of public principles, the 12 factors of agent operations, grouped into four tiers: Foundation, Flow, Knowledge, and Scale. Read them at 12factoragentops.com.

Tier	Factors
Foundation (I-III)	Context Is Everything · Track Everything in Git · One Agent, One Job
Flow (IV-VI)	Research Before You Build · Validate Externally · Lock Progress Forward
Knowledge (VII-IX)	Extract Learnings · Compound Knowledge · Measure What Matters
Scale (X-XII)	Isolate Workers · Supervise Hierarchically · Harvest Failures as Wisdom

The AgentOps product implements these principles through skills, the ao CLI, and local bookkeeping in .agents/. See each factor page at 12factoragentops.com/factors for the doctrine behind the mechanism.

Ready?

# 1. Install (pick your runtime above)
# 2. Run in your repo
ao quick-start
#    or: ao quickstart
# 3. Validate from your agent chat
/council validate this PR

Then explore the skills catalog, the ao CLI reference, and the 12-factor doctrine.

Contributing

See docs/CONTRIBUTING.md. Agent contributors should also read AGENTS.md and use bd for issue tracking.

License

Apache-2.0 · Docs · CLI Reference
