npx skills add https://github.com/oimiragieo/agent-studio --skill slack-notifications

Install this skill via the CLI and start using the SKILL.md workflow in your workspace.
Portable multi-agent ecosystem for Claude Code.
Agent Studio packages agents, skills, rules, hooks, schemas, and validation tooling into a single repo that can run directly or be dropped into another project.
If you want a local-first, reproducible agent stack with strict validation and hybrid code search, this is it.
Getting Started · .claude/docs/GETTING_STARTED.md
Architecture · .claude/docs/ARCHITECTURE.md
Developer Workflow · .claude/docs/DEVELOPER_WORKFLOW.md
Hooks Reference · .claude/docs/HOOKS_REFERENCE.md
Memory System · .claude/docs/MEMORY_SYSTEM.md
Code Indexing · .claude/docs/CODE_INDEXING_DESIGN.md
Telegram Integration · .claude/docs/TELEGRAM_ARCHITECTURE.md
Agent Studio includes a background channel daemon that monitors Telegram and responds to messages using Claude. Inspired by clawhip and Claude Code's KAIROS assistant mode.
# 1. Configure (one-time)
# Add to your .env:
TELEGRAM_BOT_TOKEN=<token from @BotFather>
TELEGRAM_OWNER_ID=<your user ID from @userinfobot>
TELEGRAM_ALLOWED_USERS=<your user ID>
CHANNEL_AUTO_START=true
# 2. Verify config
/setup-telegram
# 3. Start monitoring
/enable-telegram
# 4. Stop monitoring
/disable-telegram
# 5. Restart daemon (without killing Claude session)
/restart-telegram
Available commands: /help, /status, /memory, /dream, /tasks, /code, /usage, /insights, /personality, /schedule, /export, /pair, and more.
/code provides mission-aware coding — routes coding tasks through skill classification (16 agent types), builds feature specs, injects TDD workflow, and grades results 0-100 against alignment rules ([RALPH] tag, max 5 iterations; [ULTRAWORK] tag).
A local HTTP API at http://127.0.0.1:3101 exposes /status, /send, /history, /memory, and /dream, plus a POST /webhook endpoint for GitHub, CI, and external event ingestion.
# Add TTS keys to .env:
ELEVENLABS_API_KEY=<key> # or OPENAI_API_KEY for fallback
# Verify: /setup-telegram-voice
# Enable: /enable-telegram-voice
Full docs: .claude/docs/TELEGRAM_ARCHITECTURE.md
Agent Studio v3.2.0 ships two tightly integrated capabilities: structured memory provenance and verified skill distribution.
CAT7 Memory extends the STM/MTM/LTM memory tiers with a 7-field schema that records concept, attributes, temporality, provenance, confidence, lineage, and embedding_refs on every record. The cat7-writer.cjs routes records automatically to the correct tier based on temporality. The MMP CLI (pnpm mmp:lineage, pnpm mmp:descendants) lets you walk and inspect the full derivation graph of any memory record, so agents can audit where a belief came from and which downstream records it influenced.
Skill Marketplace provides a verified distribution channel for skill packages. Packages are signed with HMAC-SHA256 and scored on a 4-tier trust ladder before installation. Path-traversal guards and a minimum-key-length policy prevent supply-chain abuse. Install a package with pnpm skill:install <package> — the installer verifies the signature, checks the trust score against SKILL_MARKETPLACE_MIN_TRUST, and unpacks only to the allowed skills directory.
Agent Studio v2.4.0 is the "production-grade" release. It addresses the two most-reported community pain points: opaque agent execution and unpredictable API spend.
Every agent spawn, skill invocation, and tool call now emits a structured OpenTelemetry GenAI event with parent_span_id and span_type. You can reconstruct the full call tree for any session.
# Inspect per-component token burn for a session
pnpm session:audit <session-id>
Output: a colored table showing token consumption broken down by agent, skill, and tool — no external observability service required.
Spend-guard auto-downgrade switches agents from sonnet to haiku when session cost approaches the configured ceiling:
# Set per-session spend ceiling (default: $5)
SPEND_GUARD_CEILING_USD=5
# Disable entirely
SPEND_GUARD=off
Before any agent spawn, the budget hook checks projected context size and warns before the session reaches the compression threshold:
# Warning threshold in tokens (default: 50000)
SPAWN_BUDGET_DEFAULT_CONTEXT=50000
# Hard-block spawns that exceed 1.6x threshold
SPAWN_BUDGET_HARD=on
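The warn/block decision described by these two variables can be sketched as below. This is a minimal illustration of the documented thresholds, not the budget hook's real API; the function name and return shape are assumptions.

```javascript
// Sketch of the spawn pre-flight: warn above the threshold, hard-block above 1.6x.
const WARN_THRESHOLD = Number(process.env.SPAWN_BUDGET_DEFAULT_CONTEXT || 50000);
const HARD_BLOCK = process.env.SPAWN_BUDGET_HARD === 'on';

function checkSpawnBudget(projectedTokens) {
  if (HARD_BLOCK && projectedTokens > WARN_THRESHOLD * 1.6) {
    return { decision: 'block', reason: `projected ${projectedTokens} exceeds 1.6x threshold` };
  }
  if (projectedTokens > WARN_THRESHOLD) {
    return { decision: 'warn', reason: `projected ${projectedTokens} exceeds ${WARN_THRESHOLD}` };
  }
  return { decision: 'allow' };
}
```

With the defaults (50,000-token threshold, hard blocking off), a 90,000-token spawn only warns; setting `SPAWN_BUDGET_HARD=on` would block it outright at 80,000 tokens and above.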
| Variable | Default | Purpose |
|---|---|---|
| SPAWN_BUDGET_DEFAULT_CONTEXT | 50000 | Token threshold for spawn pre-flight warning |
| SPAWN_BUDGET_HARD | off | Set on to hard-block over-budget spawns |
| SPEND_GUARD_CEILING_USD | 5 | Per-session cost ceiling before haiku downgrade |
| SPEND_GUARD | on | Set off to disable spend-guard entirely |
See CHANGELOG.md and .claude/docs/HOOKS_REFERENCE.md for full details.
v3.0.0 introduces four breaking changes. The migration script handles most of them automatically.
# 1. Pull latest and install
git pull && pnpm install
# 2. Preview changes (no files written)
pnpm migrate:2x-to-3 --dry-run
# 3. Apply changes (backfills agent manifests, flags SSE transport)
pnpm migrate:2x-to-3
# 4. Review backups created for modified agents
# .claude/context/tmp/agents-pre-v3-migration/
# 5. Update any mcp.transport: "sse" entries in config to "streamable-http"
# (BC-1 — the script flags locations but does not rewrite config files)
# 6. Regenerate agent registry in v3 schema format
pnpm agents:registry
# 7. Enable enforcement when ready (optional — off by default)
# Set V3_MANIFEST_REQUIRED=on in .env
# 8. Verify
pnpm test:framework
| # | Change | Fix |
|---|---|---|
| BC-1 | mcp.transport: "sse" rejected | Update to "streamable-http" in config |
| BC-2 | Agents without a manifest fail startup | Run pnpm migrate:2x-to-3 |
| BC-3 | Task() spawns require AIP token | Router auto-injects; set AIP_TOKENS=off for dev |
| BC-4 | agent-registry.json v2 not auto-loaded | Run pnpm agents:registry |
Full guide: docs/migration/v2-to-v3.md
- debug-agent intent aliases now resolve to advanced-debugging, and the overlap-prone fallback keywords were trimmed so routing validation stays green
- pnpm test completed cleanly at 3,063 top-level tests with 12,528 passing assertions and 0 failures
- MEMORY_INJECTION_MAX_CHARS env var (default 3600, raise to 8000+)
- supersedes + archived fields on pattern/gotcha entries; semantic matches create version chains instead of silent drops (arXiv:2603.19595)
- Routing priority (ROUTING_PRIORITY=semantic); keyword classification demoted to metadata/tiebreaker
- Hierarchical routing (HIERARCHICAL_ROUTING=on)
- Model router (MODEL_ROUTER_ENABLED=on)
- parseSections() line-based fallback prevents 572KB bloat recurrence
- Daily logs at logs/YYYY/MM/YYYY-MM-DD.md, 4-phase Dream-inspired consolidation (Orient/Gather/Consolidate/Prune), heuristic keyword extraction, idempotent processing with manifest tracking, session-end hook integration (48 tests)
- Guards for &, $(...), and backtick substitutions (76 tests)
- updatedInput for bash safety prefixes (set -euo pipefail injection on unsafe multi-line scripts), suppressOutput on security blocks to prevent context inflation, denial-based routing feedback with agent suggestions after repeated tool denials (10 tests)
- disallowedTools field (excludes tools from prompt assembly with conflict resolution), mcpServers scoping (limits MCP visibility per agent), fork_eligible boolean field in agent schema (29 tests)
- GitHub integration: gh wrapper, webhook simulator, mention parser, task dispatcher, CI status reporter (152 tests)
- Renames: droid → agent, .factory-plugin → .claude-plugin across the plugin system
- ~/.claude/knowledge/ (119 tests)

See CHANGELOG.md for full details.
Runtime: Node >=22.5.0, pnpm.
Windows Setup: Requires Python and the C++ Build Tools for compiling native AST add-ons during setup.
Indexing Acceleration: Natively supports automatic Multi-GPU distribution for semantic indexing (dynamically spreading LanceDB embeddings across all detected NVIDIA GPUs via ONNX). Defaults gracefully to fully parallelized CPU parsing if GPUs are unavailable or disabled.
Agent Studio runs seamlessly on Windows PowerShell, WSL, macOS, and Linux.
Initialize the entire ecosystem (installs deps, compiles registries, indexes code):
pnpm run setup
Search immediately after indexing:
pnpm search:code "authentication logic" # hybrid text + semantic search (~5ms cached)
pnpm search:compress "how routing works" # search + compress + dedup pipeline
pnpm search:structure # project structure + deps + Mermaid diagram
pnpm search:tokens .claude/lib # token budget analysis + refactor recommendations
pnpm search:file .claude/lib/code-indexing/hybrid-lazy-indexer.cjs 1 60
Text search (pnpm search:code) works instantly even without the full index build.
Running code:index:reindex adds semantic ranking for concept-level queries.
Repeated queries are auto-cached (~5ms hit vs ~800ms miss). BM25 index auto-updates on file edits.
search:compress combines search + adaptive compression + memory dedup into a single command — use it when a topic spans many files and you need a compressed summary.
search:tokens shows file/directory sizes, token estimates, and recommends splitting oversized source files (>15K tokens) into smaller modules for better AI agent readability.
Some skills require external API keys. All are optional — core functionality works without them.
# Copy the example env file and add your keys
cp .env.example .env
| Variable | Used by | Notes |
|---|---|---|
OPENAI_API_KEY |
tts-generation (OpenAI TTS), transcription (cloud backend) |
Optional — local alternatives available |
ELEVENLABS_API_KEY |
tts-generation (ElevenLabs voices) |
Optional — OpenAI TTS or gTTS (free) as fallback |
EXA_API_KEY |
deep-research (enhanced semantic search) |
Optional — web search works without it |
Skills that work without any API key: transcription (local via faster-whisper), tts-generation (gTTS, free), browser-automation, diagram-generator, all code/routing skills.
Agent Studio supports Git worktree isolation for dangerous or sweeping subagent tasks. The orchestrator spawns isolated-* agents (e.g., isolated-developer, isolated-architect) for high-risk work or large refactors. These agents use Claude Code's -w flag to sandbox their work in isolated branches, preventing race conditions during parallel execution.
Important for Worktrees: The ecosystem setup wizard automatically enables Git optimization (core.untrackedCache true and core.fsmonitor true). This prevents Git from hanging or triggering "too many active changes" warnings during massive parallel file generation or background vector indexing operations.
Agent Studio is designed to support Claude Code's Agent Teams feature for multi-session parallel coordination (Claude Code v2.1.32+, Opus 4.6 required). The Router-subordinate architecture allows the router to dispatch work to teammate agents running in parallel sessions. A WAL (Write-Ahead Log) memory synchronization protocol is planned to ensure safe concurrent writes to shared memory files during parallel execution. Enable via CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 and optionally set CLAUDE_CODE_SUBAGENT_MODEL for sub-agent cost optimization. Configure display mode via teammateMode in settings.json or the --teammate-mode CLI flag.
Agent Studio natively supports integrating with other headless LLM Code CLIs (Gemini, Codex, Cursor, and Claude Code). The multi-llm-consultant agent can dynamically detect which of these CLIs are authenticated on your system and distribute prompts in parallel. It also features a built-in llm-council skill that automatically runs a robust 3-stage deliberation protocol (independent completions -> anonymized peer review & ranking -> chairman synthesis) for complex architectural decisions.
- SKILL.md definitions
- *.schema.json
- .claude/commands/*.md

Agent Studio includes several integrated subsystems built across four development phases:
| System | Path | Purpose |
|---|---|---|
| Mission Orchestrator | .claude/lib/mission/ | Dispatch loop, handoff pipeline, milestone gates, state recovery |
| Plugin Marketplace | .claude/lib/plugins/ | Manifest validation, 3-scope resolution, git marketplace, runtime loading |
| Headless Execution | .claude/lib/exec/ | 5-tier autonomy enforcement, multi-format output (JSON/markdown/SARIF/JUnit) |
| Code Review Pipeline | .claude/lib/review/ | Diff parsing, P0-P3 severity classification, 8-criteria bug detection |
| Model Router | .claude/lib/routing/ | Cost-aware model selection, budget engine with auto-downgrade chain |
| Readiness CLI | .claude/lib/readiness/ | Project readiness scoring, configurable thresholds, 4-format reporting |
| Knowledge Graph | .claude/lib/memory/ | Cross-repo federated query, relationship inference, portable exports |
| Observability CLI | .claude/lib/monitoring/ | Unified log aggregation, alert management, cost tracking |
| Self-Evolving Skills | .claude/lib/evolution/ | Usage tracking, pattern detection, suggestion generation, evolution triggers |
| GitHub Integration | .claude/lib/github/ | gh CLI wrapper, webhook simulation, mention parsing, CI status reporting |
| Consensus Engine | .claude/lib/consensus/ | Mixture-of-agents fan-out, multi-model consensus synthesis |
| Skill Auto-Creator | .claude/lib/evolution/ | Session transcript analysis, autonomous skill generation, security scanning |
| Session FTS Index | .claude/lib/memory/ | SQLite FTS5 full-text search over session JSONL logs |
| Process Registry | .claude/lib/workers/ | Background process lifecycle, checkpoint/restore, stdout ring buffer |
Agent Studio's roadmap includes a structured multi-phase upgrade derived from analysing 8 external agent frameworks:
| Framework | Focus area |
|---|---|
| GSD (Get Shit Done) | Task discipline, atomic commits, deviation docs |
| BMAD-METHOD | Project constitution, workflow snapshots |
| CrewAI | Failure taxonomy, role-based routing |
| lossless-claw | Context compression, anomaly preservation |
| AgentRx | Agent fingerprinting, structured diagnostics |
| agency-agents | Review severity, code quality vocabulary |
| MetaClaw | Frontmatter parsing, skill metadata |
| awesome-llm-apps | Composable utility patterns |
The analysis produced 47 candidate features (12 P0, 25 P1, 10 P2). Phase 1 shipped 6 features:
| ID | Feature | Artifact |
|---|---|---|
| D8 | Configurable context thresholds | .env.example + spawn-token-guard.cjs |
| F1 | 10-category failure taxonomy | .claude/schemas/failure-taxonomy.schema.json |
| C4 | Review severity taxonomy | .claude/schemas/review-severity.schema.json |
| G1 | Agent fingerprinting | .claude/lib/utils/agent-fingerprint.cjs |
| D7 | Anomaly preservation | .claude/lib/utils/anomaly-detector.cjs |
| H1 | SKILL.md frontmatter parser | .claude/lib/utils/skill-frontmatter-parser.cjs |
Full implementation plan: .claude/context/plans/framework-upgrade-plan-2026-03-17.md
Agent Studio ships several features that enforce completion quality and reduce plan drift across agent pipelines.
A project constitution file (.claude/context/project-context.md) is auto-injected into spawn prompts. It carries operational constraints — scope boundaries, architecture conventions, and non-negotiables — so every spawned agent operates from the same baseline without needing them restated per-task.
A hook at .claude/hooks/session/analysis-paralysis-guard.cjs monitors consecutive read-only tool calls and fires a warning when an agent exceeds its tier threshold. Thresholds are agent-type-aware:
| Agent type | Read-only call limit |
|---|---|
| executor | 5 |
| analyst | 15 |
| orchestrator | 20 |
| hunter | 25 |
The must-haves schema (.claude/schemas/must-haves.schema.json) provides goal-backward verification. Planners declare truths (facts that must hold), artifacts (files that must exist), and key_links (cross-references) as acceptance criteria. The reflection-agent scores each task completion against the must_haves block.
When a developer agent needs to deviate from a plan, it documents the deviation — reason, scope change, impact — before making changes. This creates an audit trail and keeps planner state consistent with what was actually built.
The universal spawn template includes a criteria_met/criteria_failed block in TaskUpdate metadata. Every agent completion carries structured evidence of what passed and what did not, enabling downstream agents and the reflection pipeline to make data-driven decisions.
QA agents emit structured gap reports using the verification-gap schema (.claude/schemas/verification-gap.schema.json). Each gap has an ID (G1, G2...), severity (critical, high, medium, low), and a description. The planner ingests these reports and generates targeted fix tasks — closing the feedback loop between QA findings and implementation work.
Planners attach an estimated_tokens field to every task. Tasks projected to exceed 80K tokens are split before dispatch. This prevents agents from running into context overflow mid-task and avoids silent truncation.
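The pre-dispatch split can be sketched as below. The 80K ceiling and the estimated_tokens field come from the text; the function name, task shape, and even-split strategy are assumptions for illustration.

```javascript
// Illustrative sketch: split any task whose estimate exceeds the 80K-token ceiling.
const MAX_TASK_TOKENS = 80000;

function splitOversizedTask(task) {
  if (task.estimated_tokens <= MAX_TASK_TOKENS) return [task];
  const parts = Math.ceil(task.estimated_tokens / MAX_TASK_TOKENS);
  // Evenly divide the estimate across numbered subtasks (T1 -> T1.1, T1.2, ...).
  return Array.from({ length: parts }, (_, i) => ({
    ...task,
    id: `${task.id}.${i + 1}`,
    estimated_tokens: Math.ceil(task.estimated_tokens / parts),
  }));
}
```

A 200K-token task would dispatch as three subtasks of roughly 67K tokens each, all comfortably below the overflow point.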
A UserPromptSubmit hook (ccusage-statusline.cjs) parses Claude Code's JSONL session logs on every prompt and writes a live status file to .claude/context/runtime/ccusage-status.txt. The router reads this file and includes token usage in pipeline summaries.
The status tracks three layers of cost optimization:
[tokens] 57,685 today (in: 1,403 / out: 56,282) | Cost: $86.82
[cache] $316.97 saved | 66,701,262 reads, 7,961,389 writes
[compression] 18 events | 596.2KB freed (~152,627 tokens) | ~$0.76 saved
| Line | What it measures | Optimization layer |
|---|---|---|
| [tokens] | Actual API spend using real pricing tables | Raw cost |
| [cache] | Savings from Anthropic's prompt caching (90% discount on repeated context) | Server-side |
| [compression] | Tokens avoided by the framework's context compression pipeline | Client-side |
Pricing is calculated per-model using built-in rate tables (updated March 2026):
| Model | Input | Output | Cache Write | Cache Read |
|---|---|---|---|---|
| Opus 4.6 | $5.00/M | $25.00/M | $6.25/M | $0.50/M |
| Sonnet 4.6 | $3.00/M | $15.00/M | $3.75/M | $0.30/M |
| Haiku 4.5 | $1.00/M | $5.00/M | $1.25/M | $0.10/M |
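The arithmetic behind the [tokens] and [cache] lines follows directly from the rate table above. A minimal sketch, using the Sonnet 4.6 row (the function names and usage shape are illustrative, not the statusline's internals):

```javascript
// Rates from the table above, in USD per million tokens (Sonnet 4.6 row).
const SONNET = { input: 3.0, output: 15.0, cacheWrite: 3.75, cacheRead: 0.3 };

function sessionCostUSD(usage, rates) {
  const perM = (tokens, rate) => (tokens / 1_000_000) * rate;
  return (
    perM(usage.inputTokens, rates.input) +
    perM(usage.outputTokens, rates.output) +
    perM(usage.cacheWriteTokens, rates.cacheWrite) +
    perM(usage.cacheReadTokens, rates.cacheRead)
  );
}

// Cache savings: reads billed at the cacheRead rate instead of the full input rate.
function cacheSavingsUSD(cacheReadTokens, rates) {
  return (cacheReadTokens / 1_000_000) * (rates.input - rates.cacheRead);
}
```

Note the cacheRead column is one tenth of the input column in every row, which is the 90% caching discount the [cache] line measures.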
Set CCUSAGE_MODEL=sonnet or CCUSAGE_MODEL=haiku to match your model. Defaults to opus.
Set CCUSAGE_STATUSLINE=off to disable.
Execution context is persisted using the workflow-snapshot schema (.claude/schemas/workflow-snapshot.schema.json). When a session is interrupted, the snapshot carries enough state for a new session to resume without re-running completed phases.
Pipelines emit standardized checkpoints (.claude/schemas/checkpoint-taxonomy.schema.json) at wave_complete, phase_gate, and quality_gate boundaries. Orchestrators use these to verify forward progress before advancing.
.claude/ # agents, skills, rules, hooks, tools, schemas, docs
.cursor/ # Cursor-specific assets
scripts/ # validation and maintenance scripts
tests/ # project and framework tests
.tmp/ # local debug/temp artifacts (not release docs)
Use this path if you are proposing changes to the ecosystem itself.
pnpm run setup
pnpm validate
pnpm validate:full
pnpm validate:schemas
pnpm validate:commands
pnpm validate:routing
pnpm test
pnpm test:framework
pnpm test:tools
pnpm test:code-indexing
pnpm lint
pnpm format:check
pnpm mission:init # scaffold new mission bundle
pnpm mission:validate <mission-path> # validate features.json + schemas
pnpm mission:lint <mission-path> # lint features for circular deps
pnpm mission:grade <mission-path> # grade against 17 alignment rules (0-100)
pnpm mission:audit <mission-path> # query audit trail
pnpm mission:status <mission-path> # feature vs assertion progress
Notes:
- Treat package.json scripts as the source of truth for runnable workflows.

Use this path if you are running Agent Studio as an operational control plane.
pnpm agents:registry
pnpm skills:index
pnpm manifest:generate
pnpm routing:prototypes
pnpm mmp:lineage <record-id> # walk ancestry chain for a CAT7 memory record
pnpm mmp:descendants <record-id> # list all downstream records
pnpm skill:install <package> # install a verified skill package from the marketplace
pnpm memory:status
pnpm memory:health
pnpm worker:summary
pnpm integration:headless:json
pnpm validate:full
pnpm context:reset --scope soft --force
The memory path now supports two operating modes for spawned agents:
- MEMORY_MODE=hybrid (default): legacy memory injection (gotchas/patterns/decisions/...).
- MEMORY_MODE=observational: injects observations_summary.md + recent rows from observations.jsonl.
- OBSERVATIONAL_MEMORY_ENABLED=off: kill switch that forces hybrid mode.

Additional controls:
- MEMORY_SUMMARY_BLOCK_MAX_TOKENS (default 400)
- MEMORY_RECENT_OBSERVATIONS_MAX_TOKENS (default 400)
- MEMORY_TIER_B_MAX_TOKENS (default 400)
- OBSERVATIONS_COMPACT_ON_SESSION_END=on (default)
- OBSERVATIONS_COMPACT_MAX=50 (default)
- OBSERVATIONS_CONTRADICTION_ENABLED=off
- OBSERVATIONS_CONTRADICTION_MAX_AGE_DAYS=90

Primary reference:
- .claude/docs/MEMORY_SYSTEM.md

Operational gates:
- pnpm run test:memory:ci
- pnpm run metrics:memory:slo:ci
- pnpm run metrics:memory-cache:ci
- pnpm run test:framework

CI workflows:
- .github/workflows/memory-ci.yml
- .github/workflows/memory-mvp-gate.yml

Agent Studio uses a hybrid lazy search model:
Setup:
# Build the full index (BM25 text + semantic vectors)
pnpm code:index:reindex # ~12 min with GPU, ~17 min CPU-only
# Enable semantic search in .env
HYBRID_EMBEDDINGS=on # text + semantic ranking (default after setup)
EMBED_SUBPROCESS=on # ONNX memory leak workaround (default)
Without code:index:reindex, text search still works but semantic/concept queries
(e.g. "authentication flow for refresh tokens") will return poor results.
Guidance:
- pnpm search:code for broad discovery and ranked matches.
- pnpm search:structure for structure-oriented lookup.
- rg directly for strict literal/symbol matches and exact filters.

| Tool/Mode | What it does best | Latency profile | Determinism | Token/output profile |
|---|---|---|---|---|
| pnpm search:code "query" | Conceptual discovery and ranked candidates | Fast (~0.2-0.7s on this repo) | High | Compact ranked output (good for agents) |
| pnpm search:code "ast:pattern" | Structural intent with optional ast-grep refinement | Moderate (~0.18s warm daemon baseline, higher for explicit ast:) | High if pattern is explicit | Compact, structure-aware candidates |
| pnpm search:structure | Repo map, entrypoints, dependency orientation | Fast one-shot structure pass | High | Very low output volume |
| rg -F "literal" | Exact symbol/literal lookup | Fastest (~15-35ms measured) | Highest | Larger raw output unless scoped |
| rga "query" | Cross-file search (pdf/docs/archives) | Slower than rg | High | Can be noisy; scope early |
| rg → fzf | Human interactive narrowing/selection | Interactive | Operator-dependent | Great for manual triage, not default agent path |
Selection contract:
- pnpm search:code for discovery.
- rg -F for exact anchors before edits/refactors.
- ast: only when the question is structural (shape/pattern), not plain text intent.
- fzf optional and human-in-the-loop; do not make it a hard dependency of automated wrappers.

Use daemon mode for repeated searches in active sessions.
# Start/inspect daemon
pnpm search:daemon:start
pnpm search:daemon:status
# Prewarm rg + LanceDB + semantic path
pnpm search:daemon:prewarm
# Run searches (daemon on by default)
pnpm search:code "authentication logic"
# Stop daemon when done
pnpm search:daemon:stop
Repeated or semantically similar queries are served from a local cache, avoiding redundant embedding lookups. The cache uses cosine similarity to match queries, so slight rephrasings still hit the cache.
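The similarity-keyed lookup can be sketched as below. This is a minimal illustration of the matching rule, not the daemon's actual cache; the function names and entry shape are assumptions, and the 0.95 threshold mirrors SEARCH_CACHE_SIMILARITY.

```javascript
// Sketch: serve cached results for any query whose embedding is within the cosine threshold.
const SIMILARITY_THRESHOLD = 0.95;

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function lookup(cache, queryEmbedding) {
  for (const entry of cache) {
    if (cosine(entry.embedding, queryEmbedding) >= SIMILARITY_THRESHOLD) {
      return entry.results; // near-duplicate query: cache hit
    }
  }
  return null; // miss: run the full hybrid search and insert a new entry
}
```

Because matching is done on embeddings rather than raw strings, "auth token refresh" and "refreshing auth tokens" land close enough in vector space to share one cache entry.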
After file edits, the BM25 text index updates incrementally (no full reindex needed).
| Variable | Default | Purpose |
|---|---|---|
| SEARCH_CACHE_ENABLED | on | Semantic query cache (set to off to disable) |
| SEARCH_CACHE_TTL_MS | 300000 | Cache entry TTL (5 min) |
| SEARCH_CACHE_SIMILARITY | 0.95 | Cosine threshold for cache hit |
| BM25_INCREMENTAL_UPDATE | on | Post-edit BM25 fast update (set to off to disable) |
Disable daemon or semantic mode when you need deterministic baselines:
# Direct (no daemon transport)
HYBRID_SEARCH_DAEMON=off pnpm search:code "authentication logic"
# Text-only (skip semantic ranking)
HYBRID_EMBEDDINGS=off pnpm search:code "authentication logic"
# Force semantic ranking
HYBRID_EMBEDDINGS=on pnpm search:code "authentication logic"
Daemon tuning toggles:
# Auto-prewarm on daemon startup
HYBRID_DAEMON_PREWARM=true pnpm search:daemon:start
# Idle timeout (ms) before daemon auto-exit
HYBRID_DAEMON_IDLE_MS=600000 pnpm search:daemon:start
# Custom daemon port
HYBRID_DAEMON_PORT=47653 pnpm search:daemon:start
Expected latency profile on this repo (Windows, measured):
- search:daemon:prewarm: ~0.40s average
- Direct mode (HYBRID_SEARCH_DAEMON=off), repeated CLI calls: ~0.73s average

If you only remember one thing, remember this:
- pnpm search:code "your query" to find likely files/snippets quickly.
- Skill({ skill: 'token-saver-context-compression' }) to compress/summarize evidence.
- [mem:...] and [rag:...].

# 1) Discover
pnpm search:code "auth token refresh bug"
# 2) If context is too large, compress (inside agent flow via Skill)
# Skill({ skill: 'token-saver-context-compression' })
# 3) Persist useful outcomes (via MemoryRecord or write paths that trigger memory sync hooks)
# 4) Validate memory/search pipeline health
pnpm test:memory:ci
pnpm metrics:memory:slo:ci
This repo is still session/CI-driven, but now includes a background quality daemon you can run independently.
What it does:
- .claude/context/runtime/artifact-quality-daemon-state.json
- .claude/context/runtime/remediation-queue.jsonl

Commands:
# One cycle now
pnpm quality:daemon:run-once
# Continuous loop (foreground)
pnpm quality:daemon:start
# Inspect daemon heartbeat/state
pnpm quality:daemon:status
Key env var:
- ARTIFACT_QUALITY_DAEMON_INTERVAL_MS (default 300000)

Agent Studio includes a background heartbeat ecosystem that keeps the agent runtime healthy, indexed, informed, and reachable from your phone.
Heartbeat loops start automatically at each session. The router's Step 0.5 preflight reads
heartbeat-active.json and spawns heartbeat-orchestrator if any loops are missing or expired.
You can also start them manually with /heartbeat-start in your Claude Code session.
The 8 loops:
| Loop | Schedule | What it does |
|---|---|---|
| 0 — Auto-reschedule | Every 2 days | Re-registers any loops that expired (3-day Claude Code limit) |
| 1 — Continuous reflection | Every 2 hours | Extracts patterns from session transcripts; rotates memory when learnings.md exceeds 35KB |
| 2 — Agent evolution | Daily at 3am | Applies accumulated learnings to improve agent definitions |
| 3 — Morning briefing | 8am weekdays | Summarizes open issues, recent commits, and 2 priority tasks for the day |
| 4 — Codebase indexing | Every 4 hours | Keeps the hybrid BM25 + semantic search index fresh |
| 5 — Context drain | Every 15 minutes | Detects when the task pipeline is idle and prompts for /clear |
| 6 — Telegram polling | Every 2 minutes | Polls your Telegram bot for commands and routes them to agents |
| 7 — Research digest | Daily at 7am | Fetches ArXiv papers and Exa web results matching your configured topics |
All loops are session-scoped — they restart when you open a new terminal. Loop 0 prevents silent
expiry within a session by re-registering loops before the 3-day Claude Code limit is reached.
Full loop contracts and state files: .claude/docs/HEARTBEAT_STATE_CONTRACTS.md
Control Agent Studio from Telegram while the session is running.
Your bot: @Agent_studio_bot — already created and wired in.
Steps to connect your account:
Find your Telegram user ID: message @userinfobot on Telegram. It
replies with your numeric ID (e.g., 123456789).
Edit .env and set these three variables:
TELEGRAM_BOT_TOKEN=<your_token> # From @BotFather — already set if you followed setup
TELEGRAM_OWNER_ID=123456789 # Your numeric Telegram user ID
TELEGRAM_ALLOWED_USERS=123456789 # Comma-separated IDs allowed to use the bot
Start the heartbeat ecosystem (this activates Telegram polling as Loop 6):
/heartbeat-start
Open Telegram and message @Agent_studio_bot. Try /status
or /help to confirm the connection.
| Command | Who | What it does |
|---|---|---|
| /help | Anyone | List all commands |
| /status | Anyone | Active loops, pending tasks, last heartbeat time |
| /tasks | Anyone | Current task list with status |
| /loops | Anyone | Active heartbeat loops |
| /logs | Anyone | Last 20 session gap log entries |
| /memory QUERY | Anyone | Search recent learnings for a keyword |
| /ask QUESTION | Owner | Ask the AI a question and get a reply |
| /spawn TYPE DESC | Owner | Spawn an agent (general-assistant, researcher, technical-writer) |
| /approve TASK_ID | Owner | Two-step task approval (then /confirm TASK_ID within 60 seconds) |
| /deny TASK_ID | Owner | Cancel a pending task |
Owner-only commands require your Telegram user ID to match TELEGRAM_OWNER_ID.
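The two-step /approve then /confirm flow can be sketched as a small state machine. This is an illustrative sketch only; the function names and return shapes are assumptions, while the 60-second confirmation window comes from the command table above.

```javascript
// Sketch of two-step approval: /approve arms a task, /confirm must follow within 60s.
const CONFIRM_WINDOW_MS = 60_000;
const pendingApprovals = new Map(); // taskId -> timestamp of /approve

function approve(taskId, now = Date.now()) {
  pendingApprovals.set(taskId, now);
  return `Reply /confirm ${taskId} within 60 seconds to execute.`;
}

function confirm(taskId, now = Date.now()) {
  const approvedAt = pendingApprovals.get(taskId);
  pendingApprovals.delete(taskId); // single use, whatever the outcome
  if (approvedAt === undefined) return { ok: false, reason: 'no pending approval' };
  if (now - approvedAt > CONFIRM_WINDOW_MS) return { ok: false, reason: 'approval expired' };
  return { ok: true };
}
```

The expiry window keeps a stray /approve from silently arming a task indefinitely, which matters when the bot executes real work on your machine.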
Send any file in the Telegram chat to automatically convert it to Markdown and store it as agent memory:
| File Type | Extensions | Converted To |
|---|---|---|
| Documents | .pdf, .docx, .pptx, .xlsx | Structured Markdown |
| Web pages | .html, .htm | Clean Markdown |
| Data files | .csv, .json, .xml | Markdown tables |
| Images | .jpg, .png, .gif, .webp | Alt-text description |
| Audio | .mp3, .wav, .m4a, .ogg | Transcription (if supported) |
How it works:
Requirements: Python with markitdown installed (pip install 'markitdown[all]')
File size limit: 20MB (Telegram Bot API limit)
For the full list, see .env.example and .claude/docs/@ENVIRONMENT_CONFIG.md.
| Variable | Required | Description |
|---|---|---|
| TELEGRAM_BOT_TOKEN | For Telegram | Bot API token from @BotFather |
| TELEGRAM_OWNER_ID | For Telegram | Your Telegram numeric user ID (privileged commands) |
| TELEGRAM_ALLOWED_USERS | For Telegram | Comma-separated user IDs allowed to use the bot |
| TELEGRAM_OWNER_USERNAME | Optional | Your @username (no @ prefix, display only) |
| TELEGRAM_OWNER_CHAT_ID | Recommended | Numeric user ID (get from @userinfobot) |
| ARXIV_KEYWORDS | For research loop | Comma-separated ArXiv search topics |
| EXA_MONITOR_TOPICS | For research loop | JSON array of web monitoring topics |
Copy .claude/ into the target repository, then:

pnpm memory:init
pnpm agents:registry
pnpm routing:prototypes
cp .env.example .env
Common controls:
- AGENT_STUDIO_ENV
- REFLECTION_ENABLED
- DEBUG_HOOKS
- HYBRID_EMBEDDINGS

See .env.example and .claude/docs/@ENVIRONMENT_CONFIG.md.
If you want fast local terminal search tooling on Windows (non-admin), install rga and fzf via Scoop.
Install Scoop (non-admin PowerShell):
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
Invoke-RestMethod -Uri https://get.scoop.sh | Invoke-Expression
Install ripgrep-all + fuzzy finder + ast-grep:
# Install rga (ripgrep-all)
scoop install rga
# Install fzf
scoop install fzf
# Install ast-grep (includes `sg` shim)
scoop install ast-grep
Verify install:
rga --version
fzf --version
sg --version
Runtime discovery behavior:
- Binaries are discovered via node_modules/.bin, Scoop shims, and PATH.
- Paths can be overridden per tool (RG_BIN, AST_GREP_BIN, RGA_BIN, FZF_BIN).

fzf is most useful as an interactive selector on top of rg/rga output.
It improves usability and reduces noise, but does not replace search engines.
For AI/automation, keep fzf optional; interactive prompts are non-deterministic for unattended runs.
Quick file+line picker with preview:
rg --line-number --no-heading --color=always "auth|token|session" . `
| fzf --ansi --delimiter ":" `
--preview "bat --color=always --style=numbers --highlight-line {2} {1}"
Search inside office/pdf/archive content (via rga) and narrow interactively:
rga --line-number --no-heading --color=always "invoice|receipt|policy" . `
| fzf --ansi --delimiter ":" `
--preview "bat --color=always --style=numbers --line-range=:300 {1}"
Advanced interactive ripgrep launcher (fzf reload pattern):
: | rg_prefix='rg --column --line-number --no-heading --color=always --smart-case' \
fzf --ansi --disabled \
--bind 'start:reload:$rg_prefix ""' \
--bind 'change:reload:$rg_prefix {q} || true'
AST + RG + fzf (structural triage workflow):

```powershell
# 1) Structural file candidates
ast-grep -p "function `$NAME(`$$$) { `$$$ }" --lang javascript --files-with-matches . `
  | fzf --ansi --delimiter ":" `
    --preview "bat --color=always --style=numbers --line-range=:220 {}"

# 2) Then run exact literal checks inside chosen files
rg -F "function " <chosen-file>
```
Wrapper policy:

- Keep `pnpm search:code` non-interactive and deterministic for agents.
- Treat `fzf` as an optional terminal UX layer for humans doing investigative triage.
- Use `pnpm search:structure` or `pnpm search:code "ast:..."` for agent structural queries; use `sg` directly for manual structural audits.

Sources:
- https://scoop.sh/
- https://github.com/phiresky/ripgrep-all?tab=readme-ov-file#scoop
- https://github.com/junegunn/fzf (interactive ripgrep + reload)
- https://junegunn.github.io/fzf/tips/ripgrep-integration/ (official rg+fzf pattern)
- https://github.com/phiresky/ripgrep-all (rga + fzf integration notes)
- https://github.com/phiresky/ripgrep-all/wiki/fzf-Integration (rga-fzf notes)
- https://ast-grep.github.io/guide/pattern-syntax.html (ast-grep pattern language)
- https://ast-grep.github.io/reference/cli.html (ast-grep CLI options)
- https://github.com/sharkdp/bat (fzf preview examples)

When a session produces unexpected behavior, reduce the raw Claude Code debug log to signal-only lines:
```shell
pnpm debug:reduce  # Auto-find the most recent ~/.claude/debug/*.txt, copy to .tmp/, and reduce to signal-only lines
```
The reduced file lands at `.tmp/<session-id>-reduced.txt`. Kept lines include `[ERROR]`, `[WARN]`, failed/blocked/timeout messages, and stack traces. Repeated identical lines are collapsed.
You can also pass an explicit path:
```shell
pnpm reduce-debug-log -- .tmp/session-abc.txt
pnpm reduce-debug-log -- .tmp/session-abc.txt --output .tmp/session-abc.cleaned.txt
```
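If the pnpm wrapper is unavailable, a rough approximation of the keep-list can be run directly with ripgrep (the pattern mirrors the kept-line description above, not the script's exact implementation, and the log path is hypothetical):

```shell
# Keep only likely-signal lines; -i catches FAILED/Timeout casing variants, -n keeps line numbers
rg -in '\[ERROR\]|\[WARN\]|failed|blocked|timeout' .tmp/session-abc.txt
```

This skips the duplicate-collapsing step that the real reducer performs, so expect more output than `pnpm debug:reduce` produces.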
The `debug-log-analysis` skill (`Skill({ skill: 'debug-log-analysis' })`) documents the full structured workflow for working with these reduced logs.
476 active skills across 21 categories. Full details: `.claude/context/artifacts/catalogs/skill-catalog.md`

Invoke any skill with `Skill({ skill: 'name' })`.
| Skill | Description |
|---|---|
| tdd | TDD with RED/GREEN/REFACTOR cycle |
| debugging | Systematic 4-phase root cause investigation |
| smart-debug | AI-assisted hypothesis ranking and structured instrumentation |
| debug-log-analysis | Structured debug log analysis for Claude Code sessions |
| ripgrep | Enhanced code search with ES module support |
| code-quality-expert | Clean code principles and refactoring |
| code-analyzer | Static analysis and complexity metrics |
| code-semantic-search | Semantic code search with vector index |
| code-structural-search | AST-based structural pattern matching |
| verification-before-completion | Evidence-based completion gate function |
| subagent-driven-development | Implementation via autonomous subagents with two-stage review |
| requesting-code-review | Dispatch structured two-stage code review |
| receiving-code-review | Process and act on code review feedback |
| best-practices-guidelines | Cross-cutting development best practices |
| Skill | Description |
|---|---|
| brainstorming | Structured ideation with convergence |
| plan-generator | Implementation plan generation |
| prd-generator | Product requirements document creation |
| architecture-review | System architecture analysis |
| complexity-assessment | Task complexity classification |
| diagram-generator | Mermaid diagram generation |
| wave-executor | EPIC-tier batch pipeline orchestration via fresh Bun processes |
| sparc-methodology | SPARC methodology workflow |
| spec-critique | Specification review and gap analysis |
| spec-gathering | Requirements elicitation |
| spec-init | Specification bootstrapping |
| dispatching-parallel-agents | Parallel agent dispatch patterns |
| ralph-loop | Autonomous iteration via Stop hook loop with verification gate |
| Skill | Description |
|---|---|
| security-architect | OWASP/STRIDE/AI threat modeling |
| auth-security-expert | OAuth 2.1 and JWT security patterns |
| static-analysis | Semgrep and CodeQL pipelines |
| variant-analysis | Vulnerability variant discovery |
| semgrep-rule-creator | Custom Semgrep rule authoring |
| binary-analysis-patterns | Binary analysis and reverse engineering |
| memory-forensics | Memory forensics workflows |
| differential-review | Security-focused diff review |
| insecure-defaults | Insecure default detection |
| content-security-scan | Content security scanning |
| audit-context-building | Security audit context assembly |
| fix-review | Security fix regression verification |
| yara-authoring | YARA rule authoring for threat detection |
| medusa-security | Medusa security patterns |
| Skill | Description |
|---|---|
| terraform-infra | Terraform IaC with safety controls |
| docker-compose | Docker Compose workflows |
| k8s-manifest-generator | Kubernetes manifest generation |
| sentry-monitoring | Sentry error monitoring setup |
| kafka-development-practices | Kafka patterns and best practices |
| monorepo-and-tooling | Monorepo setup and tooling |
| cloud-devops-expert | Cloud DevOps workflows |
| container-expert | Container orchestration patterns |
| Skill | Description |
|---|---|
| typescript-expert | TypeScript type systems and patterns |
| python-backend-expert | Python backend development |
| go-expert | Go idioms and patterns |
| nodejs-expert | Node.js patterns and tooling |
| java-expert | Java development |
| rust-expert | Rust safety patterns |
| php-expert | PHP development |
| elixir-expert | Elixir/OTP patterns |
| cpp | C++ development |
| poetry-rye-dependency-management | Python dependency management (Poetry/Rye) |
| modern-python | Modern Python with uv/ruff/ty |
| Skill | Description |
|---|---|
| react-expert | React patterns and hooks |
| nextjs-expert | Next.js App Router and RSC |
| svelte-expert | SvelteKit patterns |
| vue-expert | Vue 3 Composition API and Pinia |
| angular-expert | Angular patterns |
| astro-expert | Astro framework |
| qwik-expert | Qwik resumability patterns |
| solidjs-expert | SolidJS fine-grained reactivity |
| graphql-expert | GraphQL schema and resolvers |
| htmx-expert | HTMX hypermedia patterns |
| webmcp-browser-tools | WebMCP browser-side tool exposure to AI agents |
| starknet-react-rules | StarkNet React blockchain integration |
| drizzle-orm-rules | Drizzle ORM patterns |
| convex-development-general | Convex backend development |
| Skill | Description |
|---|---|
| vercel-deploy | Zero-auth Vercel deployment for 20+ frameworks |
| vercel-ai-sdk-best-practices | Vercel AI SDK streaming patterns |
| web-perf | 5-phase Core Web Vitals audit workflow |
| next-upgrade | Next.js upgrade migration |
| next-cache-components | Next.js caching strategies |
| shadcn-ui | shadcn/ui component integration |
| enhance-prompt | AI prompt enhancement patterns |
| Skill | Description |
|---|---|
| ios-expert | iOS SwiftUI development |
| android-expert | Android Compose development |
| flutter-expert | Flutter cross-platform development |
| expo-framework-rule | Expo framework patterns |
| tauri-native-api-integration | Tauri native API integration |
| mobile-first-design-rules | Mobile-first design patterns |
| nativewind-and-tailwind-css-compatibility | NativeWind Tailwind compatibility |
| nativescript | NativeScript patterns |
| Skill | Description |
|---|---|
| database-architect | Database schema design |
| database-expert | Database query optimization |
| data-expert | Data engineering patterns |
| text-to-sql | Natural language to SQL conversion |
| large-data-with-dask | Large dataset processing with Dask |
| Skill | Description |
|---|---|
| doc-generator | Technical documentation generation |
| writing-skills | TDD applied to skill authoring |
| readme | README generation patterns |
| summarize-changes | Change summary generation |
| markitdown-converter | Convert files to Markdown (PDF, DOCX, XLSX, images, audio) |
| Skill | Description |
|---|---|
| commit-validator | Conventional commit validation |
| git-expert | Advanced Git workflows |
| github-ops | GitHub operations and PR workflows |
| finishing-a-development-branch | Branch completion checklist |
| using-git-worktrees | Isolated development workspaces |
| smart-revert | Safe revert with impact analysis |
| Skill | Description |
|---|---|
| research-synthesis | Multi-source research and synthesis |
| skill-creator | Create new skills |
| skill-updater | Update existing skills to production-ready status |
| agent-creator | Create new agents |
| agent-updater | Update existing agents |
| workflow-creator | Create new workflows |
| workflow-updater | Update existing workflows |
| hook-creator | Create new hooks |
| template-creator | Create new templates |
| schema-creator | Create new schemas |
| rule-creator | Create new rules |
| command-creator | Create new commands |
| tool-creator | Create new framework tools |
| artifact-integrator | Integrate artifacts into framework |
| artifact-updater | Update existing artifacts |
| Skill | Description |
|---|---|
| context-compressor | Context window compression |
| token-saver-context-compression | Search-aware context compression with MemoryRecord |
| memory-quality-auditor | Memory file quality audit |
| session-handoff | Cross-session handoff artifacts |
| task-management-protocol | Task tracking and structured handoff |
| track-management | Work unit lifecycle management |
| context-degradation | Context degradation detection |
| framework-context | Framework context loading |
| recommend-evolution | Framework evolution recommendations |
| assimilate | External repository assimilation |
| creation-feasibility-gate | Pre-creation feasibility check |
| compliance-policy-check | Policy compliance validation |
| troubleshooting-regression | Regression diagnosis and fix verification |
| memory-search | Semantic memory search |
| insight-extraction | Knowledge extraction from context |
| Skill | Description |
|---|---|
| checklist-generator | Quality checklist generation |
| proactive-audit | Proactive framework audit after pipeline changes |
| response-rater | Agent response quality rating |
| test-generator | Automated test code generation |
| accessibility | Accessibility audit and fixes |
| eval-harness-updater | Evaluation harness maintenance |
| qa-workflow | Systematic QA validation with fix loops |
| agent-evaluation | Agent capability evaluation |
| strict-user-requirements-adherence | Requirements traceability |
| property-based-testing | Property-based test generation |
| behavioral-loop-detection | Detect agent behavioral loops via Jaccard similarity scoring |
| judge-verification | Independent LLM judge evaluation with 4-dimension scoring |
| error-recovery-escalation | 5-level structured error recovery: retry → nudge → replan → fallback → force-done |
| Skill | Description |
|---|---|
| thinking-tools | Structured self-reflection checkpoints |
| sequential-thinking | Dynamic step-by-step hypothesis reasoning |
| consensus-voting | Multi-perspective decision voting |
| swarm-coordination | Multi-agent swarm patterns |
| interactive-requirements-gathering | Guided requirements elicitation |
| planning-with-files | File-based planning patterns |
| context-driven-development | Context-aware development workflow |
| pipeline-reflection-ux | Pipeline reflection UX patterns |
| Skill | Description |
|---|---|
| jira-pm | Jira project management |
| linear-pm | Linear project management |
| medusa | Medusa e-commerce platform |
| dynamic-api-integration | Dynamic API integration patterns |
| project-onboarding | Project onboarding workflow |
| github-mcp | GitHub MCP integration |
| arxiv-mcp | arXiv paper retrieval |
| slack-notifications | Slack notification patterns |
| gemini-cli-security | Gemini CLI security audit patterns |
| Skill | Description |
|---|---|
| incident-runbook-templates | Incident runbook templates |
| on-call-handoff-patterns | On-call handoff protocols |
| postmortem-writing | Blameless postmortem writing |
| Skill | Description |
|---|---|
| scientific-skills | Scientific computing (parent with 139 sub-skills) |
| Skill | Description |
|---|---|
| advanced-elicitation | Advanced prompt elicitation techniques |
| ai-ml-expert | AI/ML patterns and best practices |
| agent-tool-design | Agent tool API design |
| api-development-expert | REST API development patterns |
| ask-questions-if-underspecified | Requirements clarification |
| sharp-edges | Known codebase hazard patterns |
| webapp-testing | Playwright browser automation testing |
| stale-module-pruner | Stale module detection and pruning |
| skill-discovery | Skill discovery and selection |
| code-style-validator | Programmatic AST-based style validation |
| dry-principle | DRY enforcement patterns |
| async-operations | Async/await patterns and anti-patterns |
If Claude Code subagents crash immediately on spawn with an API size-limit error (e.g., "Prompt is too long", or the 200,000-token context limit is saturated before the subagent executes anything), make sure a `.claudeignore` file is present in the repository root.

By default, the Claude Code CLI eagerly loads large Markdown files in the repository root (CHANGELOG.md, README.md, GETTING_STARTED.md) and root-level data directories into its hidden system context payload. The `.claudeignore` file blocks this eager loading, freeing an estimated 65,000+ tokens and preventing the instant crashes.
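A starting point, assuming `.claudeignore` uses gitignore-style patterns (entries are illustrative; keep whatever your repo actually needs in context):

```
# Illustrative .claudeignore -- block eager-loading of bulky root files and data dirs
CHANGELOG.md
README.md
GETTING_STARTED.md
data/
.tmp/
```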
- `.claude/context/` stores runtime artifacts and persistent operational memory.
- `.tmp/` contains temporary/debug outputs and should not be treated as product documentation.