npx skills add https://github.com/bdambrosio/Cognitive_workbench --skill mc-observe-entities

Install this skill with the skills CLI to start using the SKILL.md workflow in your workspace.
A research framework for autonomous agents with incremental planning, persistent memory, and tool use.
The chat-mode subproject (src/chat/) is becoming the primary interface. It now provides:
- process_text, web search, fetch_text (full-page extraction), and respond tools in a ReAct loop. A per-iteration trace is written before any post-turn LLM work, plus a live CLI status line ("thinking…" → "using search…") that overwrites in place so the user sees progress during long LLM calls.
- A memories collection with categorized recall (fact / preference / commitment), auto-RAG injection at turn start, and post-turn reflection that suppresses writes from hypothetical / roleplay / counterfactual frames. Discourse update and reflection run in a background single-worker executor so the response publishes without waiting on slow LLM-bound side effects.
- A concerns collection separate from memories. Three categories (one_shot / durable / derived) with independent per-concern firing parameters generated by reflection: cadence_days (firing rhythm), lifetime_days (decay tau), and instruction (the action to take). The lifecycle is lazy: weight decays toward satisfied; recurrence detection at write time promotes one_shot → durable on re-emission and revives satisfied concerns when the user returns to the topic. Concerns are surfaced in the system prompt as an "Active concerns" block, distinct from memories, formatted as concern: <text> — impulse: <instruction>. Engagement clears the firing cycle. Phase C (impulse as parallel input to the ReAct loop) is deferred until we see whether reflection's auto-generated impulses are sensible in real sessions.
- Unified cloud LLM configuration: an api_key field naming an env var triggers a Bearer-auth POST to any OpenAI-compatible endpoint (MIMO, OpenRouter, OpenAI, hosted vLLM, …); legacy server shortcuts still work.

The full executive-node architecture described below — continuous OODA planner, incremental planner, cognitive graph, sensors — remains operational, but future development is shifting to chat mode. Sensors are the main piece still to be ported before executive-mode components can be retired; concerns landed in chat as of this update (Phase A storage + Phase B display firing).
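The firing parameters can be pictured with a minimal sketch. This is a toy model assuming exponential decay and a fixed satisfaction threshold; the field names cadence_days, lifetime_days, and instruction and the active/satisfied statuses come from the description above, while everything else (function names, thresholds, data layout) is hypothetical:

```python
import math
from dataclasses import dataclass

@dataclass
class Concern:
    """Illustrative stand-in for a stored concern (field names from the text above)."""
    text: str
    instruction: str        # the "impulse" surfaced in the Active concerns block
    cadence_days: float     # firing rhythm
    lifetime_days: float    # decay tau
    weight: float = 1.0
    last_fired_day: float = 0.0
    status: str = "active"

def tick(concern: Concern, day: float) -> bool:
    """Lazy lifecycle step: decay weight toward satisfied, fire when cadence elapses."""
    concern.weight = math.exp(-day / concern.lifetime_days)  # assumed exponential decay
    if concern.weight < 0.05:                                # assumed satisfaction threshold
        concern.status = "satisfied"
        return False
    if day - concern.last_fired_day >= concern.cadence_days:
        concern.last_fired_day = day
        return True                                          # surface the impulse this turn
    return False

c = Concern("user's job search", "ask how applications are going",
            cadence_days=3, lifetime_days=30)
fired = [d for d in range(31) if tick(c, d)]   # days on which the concern fires
```

With these parameters the concern fires every third day and, left unengaged, would decay to satisfied long after its 30-day tau.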
Cognitive-graph integration of the chat trace and entity modeling are likely later moves; the world model and tool model are less relevant given chat's small fixed tool set. This is a stated intention, not a present-day rewrite: the executive path is fully usable today and is what runs by default for jill-infospace*.yaml scenarios. Chat mode runs from jill-chat*.yaml scenarios via the same launcher.
Cognitive Workbench is experimental research software for studying LLM-based cognitive architectures. It prioritizes inspectable agent behavior and fast iteration over stability.
Two coupled loops sit at the core. A continuous OODA planner maintains strategic context across cycles — choosing at each turn whether to submit a goal, update a concern, ask, say, reflect, or sleep — rather than resetting reasoning every tick. When it launches a goal, an incremental planner takes over and interleaves LLM reasoning with tool execution, generating one step at a time and adapting to real results. Every OODA event, decision, action, and outcome is recorded as a typed node in a persistent cognitive graph that serves as both long-term memory and a reflective computational trace.
User: "goal: Find recent papers on multi-agent coordination"
│
┌──────────▼──────────┐
│ OODA Planner │ Continuous strategic loop — context persists
│ (ooda_planner.py) │ across cycles. Actions: submit-goal,
│ │ update-concern, say, ask, reflect, sleep, ...
│ ┌───────────────┐ │ Event-action history with progressive rollup
│ │ Observe/Orient│ │
│ │ Decide → Act │ │ Writes every stage into the cognitive graph
│ └───────────────┘ │
└──────────┬──────────┘
│ submit-goal
┌──────────▼──────────┐
│ Incremental Planner │ Stage 0: Retrieve context (FAISS + graph)
│ │ Stage 1: Analyze + select tools
│ ┌───────────────┐ │ Stage 2: Generate code → Execute → Evaluate
│ │ Reason → Act │──│──────► repeat until done
│ │ ← Observe │ │
│ └───────────────┘ │ Reflect: learn from execution trace
└──────────┬──────────┘
│
┌──────────▼──────────┐
│ Infospace Executor │ Primitives + Tools
│ │ Notes + Collections + Relations
│ search-web, say, │ FAISS semantic search
│ create-note, ... │ Persistent memory
└─────────────────────┘
┌─────────────────────┐
│ Cognitive Graph │ Typed nodes (event, assessment, decision,
│ (cognitive_graph.py)│ goal_launch/outcome, concern_change, ...)
│ │ FAISS-backed semantic search + BFS subgraph
│ │ expansion; idle-time consolidation
└─────────────────────┘
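The two coupled loops in the diagram can be condensed into a runnable Python skeleton. This is a sketch only: the function names, action tuples, and control flow are illustrative stand-ins, not the real APIs of ooda_planner.py or incremental_planner.py.

```python
def incremental_plan(goal, act, memory, max_steps=5):
    """Inner loop: generate one step at a time, adapting to real tool results."""
    trace = []
    for _ in range(max_steps):
        # A real planner asks the LLM for the next step; this fakes a trivial policy.
        step = ("search-web", goal) if not trace else ("create-note", trace[-1])
        observation = act(step)             # execute the tool, observe the outcome
        trace.append(observation)
        memory.append((step, observation))  # stand-in for a cognitive-graph write
        if observation == "done":
            break
    return trace

def ooda_cycle(observe, decide, act, memory, max_cycles=3):
    """Outer strategic loop: context persists across cycles instead of resetting."""
    context = []                            # event-action history, carried forward
    for _ in range(max_cycles):
        event = observe()
        action = decide(event, context)     # submit-goal, say, reflect, sleep, ...
        context.append((event, action))
        if action[0] == "submit-goal":
            result = incremental_plan(action[1], act, memory)
            context.append(("goal-outcome", result))
        elif action[0] == "sleep":
            break
    return context
```

The property mirrored here is the one the text emphasizes: `context` survives across OODA cycles, while each launched goal gets its own fresh `trace` that is also written into shared memory.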
Idle-time consolidation produces consolidation nodes in the graph (see the explorer guide).

Concerns follow the lifecycle active → satisfied (with a per-concern revisit timer) → back to active when the timer expires; abandoned is the only terminal state. Homeostatic time-pressure keeps seeded concerns alive; the OODA planner's update-concern action adjusts weight/status/notes directly, and an LLM-activation path lets derived-concern reasoning nominate without waiting for an activation threshold.

Goals are submitted with the goal: prefix and can be scheduled for manual, automatic, recurring, or daily-at-time execution.

Installation:

git clone https://github.com/bdambrosio/Cognitive_workbench.git
cd Cognitive_workbench
python3 -m venv zenoh_venv
source zenoh_venv/bin/activate
pip install -r requirements.txt
Option A — Local GPU (SGLang):
Edit scenarios/jill-infospace.yaml and set sgl_model_path to your preferred model. SGLang can be finicky, but its @function support makes the reasoning loop much faster. For vLLM, edit scenarios/jill-infospace-vllm.yaml and set vllm_model_path instead.

Option B — Cloud API (no GPU needed):
export OPENROUTER_API_KEY="sk-or-v1-..." # from openrouter.ai
Alt model for semantic processing:
Some tools (refine, extract-struct, filter-semantic, assess) perform complex semantic processing of text, e.g. extracting a field from JSON. If your base LLM isn't up to the task, you can provide a heavier-weight model for them to use:
alt_llm_config:
openrouter_model_path: "qwen/qwen3-235b-a22b-2507"
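The routing this config implies can be sketched in a few lines. The tool names are the ones listed above; the function and the exact routing rule are hypothetical, not the real config plumbing:

```python
# Tools that do heavy semantic processing and prefer the alt model when one is set.
SEMANTIC_TOOLS = {"refine", "extract-struct", "filter-semantic", "assess"}

def model_for(tool: str, base_model: str, alt_model: str = "") -> str:
    """Route a tool call: semantic tools use alt_llm_config's model if configured."""
    if tool in SEMANTIC_TOOLS and alt_model:
        return alt_model
    return base_model
```

So search-web keeps the base model while extract-struct upgrades to the heavier one, and everything falls back to the base model when no alt model is configured.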
source zenoh_venv/bin/activate
cd src
python3 launcher.py ../scenarios/jill-infospace.yaml --cli --resource-browser
# Or with the web UI:
python3 launcher.py ../scenarios/jill-infospace.yaml --ui --resource-browser
# Or for OpenRouter:
python3 launcher.py ../scenarios/jill-infospace-openrouter.yaml --ui --resource-browser
The browse tool requires the agent-browser CLI (Rust binary, not a Python package):
cargo install agent-browser # if you have Rust/cargo
# or download a prebuilt binary from https://github.com/vercel-labs/agent-browser/releases
Skip this if you don't need browser automation — all other tools work without it.
Open http://localhost:3000 and submit a goal via the + Goal button:
Find and summarize recent papers on transformer architectures
See Getting Started for full setup details, environment variables, and troubleshooting.
The system provides three web-facing components plus an optional browser extension. See the UI Guide for full details.
The default view is an interactive D3 force-directed graph centered on the agent. Nodes represent the agent, its goals, concerns, notes, and variable bindings — sized and colored by activation level. Click any node to inspect it in the side panel.
The bottom dock bar provides controls for chat, goal entry, execution control (stop, continuous, LLM toggle), and links to the other UI components.
An OODA pulse overlay shows the agent's cognitive cycle in real time — expanding colored rings indicate Observe (blue), Orient (yellow), Decide (orange), and Act (green) phases.
A text-oriented alternative with a scrollable action log, character sidebar with tabs (Plan, Bindings, Goals, Plans, State, Schedule, Tasks), and direct text input for goals and chat.
Browse, view, edit, and delete Notes, Collections, and Concerns — the agent's working memory. Two-panel layout with a resource list and content viewer. The Concerns tab shows user and derived concerns with activation, weight, revisit interval, and status/delete actions (replaces the previous standalone Task Manager).
A Chrome extension that captures page visits and feeds them to the agent via the browser-visits sensor. Install by loading the browser_extension/ directory as an unpacked extension.
Each OODA cycle, the planner assembles live context — concerns, goals, sensors, cognitive-graph slices, character/capabilities — and emits one JSON action. Chat and alerts take a fast path but still receive a summary of ongoing OODA activity for awareness. On submit-goal, the Executive Node hands off to the Incremental Planner, which retrieves context (FAISS + entity-augmented + cognitive-graph subgraph), then loops:
Reason → Act → Observe, generating code that calls tools (search-web, stock-price, create-note, etc.) and repeating until done. The entity index adds mentions edges that improve retrieval over time, and the planner can nominate concerns directly via update-concern rather than waiting for activation triage.

Chat supports session commands (/done, /next, /bye). ToM covers every peer (trust, competence, goals, emotional state); the Companion Model runs only for the user and captures the "how are they right now" picture that shapes engagement style.

| Scenario | Mode | World | Backend |
|---|---|---|---|
| jill-chat.yaml | Chat (primary) | Chat-only world | OpenAI-compatible local server |
| jill-chat-vllm.yaml | Chat | Chat-only world | vLLM (local GPU) |
| jill-chat-mimo.yaml | Chat | Chat-only world | MIMO cloud (unified api_key form) |
| jill-infospace.yaml | Executive (legacy) | Core infospace | SGLang (local GPU) |
| jill-infospace-openrouter.yaml | Executive (legacy) | Core infospace | OpenRouter (cloud) |
| jill-infospace-anthropic.yaml | Executive (legacy) | Core infospace | Anthropic Claude |
| jill-infospace-openai.yaml | Executive (legacy) | Core infospace | OpenAI |
| jill-infospace-vllm.yaml | Executive (legacy) | Core infospace | vLLM (local GPU) |
| jill-fs.yaml | Executive | File system | SGLang |
| jill-fs-openrouter.yaml | Executive | File system | OpenRouter (cloud) |
| jill-minecraft.yaml | Executive | Minecraft 3D world | SGLang |
| jill-osworld.yaml | Executive | Desktop automation | SGLang |
| jill-scienceworld.yaml | Executive | Science simulation | SGLang |
| jack-and-jill.yaml | Executive | Multi-agent | SGLang |
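Since launcher.py dispatches by the scenario file's mode, selection across these scenarios can be sketched roughly as follows. The mode values and module routing here are hypothetical; only the field name `mode` and the two entry modules come from this README:

```python
def dispatch(scenario: dict) -> str:
    """Map a scenario's `mode` to an entry module (illustrative, not launcher.py)."""
    routes = {
        "chat": "chat/chat_loop.py",       # jill-chat*.yaml scenarios
        "executive": "executive_node.py",  # jill-infospace*.yaml and other legacy scenarios
    }
    mode = scenario.get("mode", "executive")  # assumed default: the executive path
    if mode not in routes:
        raise ValueError(f"unknown scenario mode: {mode!r}")
    return routes[mode]
```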
See Configuration for details on each.
Cognitive_workbench/
├── README.md # This file
├── BACKGROUND.md # Research philosophy
├── requirements.txt # Python dependencies
├── docs/ # Detailed documentation
├── scenarios/ # Scenario YAML files + runtime data
├── browser_extension/ # Chrome extension for page visit tracking
└── src/
├── launcher.py # Entry point — dispatches by scenario `mode`
├── chat/ # Chat-mode subproject (becoming primary)
│ └── chat_loop.py # ReAct loop + status line, memories + concerns collections,
│ # reflection (frame-aware), recurrence promotion, concern firing,
│ # fetch_text, unified cloud LLM, background post-turn executor
├── executive_node.py # Main tick coordinator, fast-path chat, goal lifecycle (legacy)
├── ooda_planner.py # Continuous OODA planner (legacy)
├── incremental_planner.py # Inner goal planner (legacy)
├── infospace_executor.py # Primitives + tool execution
├── infospace_resource_manager.py # Notes/Collections/Relations + FAISS (shared by chat and executive)
├── entity_index.py # NER extraction, entity index, graph integration
├── cognitive_graph.py # Typed event graph — reflective computational trace
├── conversation_store.py # Dialog lifecycle, archival, session backfill
├── discourse.py # Theory of Mind + Companion Model templates
├── world_model.py # Bayesian recency-weighted knowledge
├── fastapi_action_display.py # Web UI (Activation Field + Classic)
├── resource_browser.py # Resource Browser UI
├── goal_scheduler.py # Autonomous goal scheduling
├── concern_triage.py # Concern nomination paths (activation, orient, LLM)
├── user_concern_model.py # User concerns with recurring lifecycle
├── derived_concern_model.py # Agent-derived concerns + revisit timers
├── sensor_runner.py # Sensor scheduling and execution
├── sensors/ # Sensor implementations
│ ├── browser-visits/ # Browser page visit sensor
│ └── rss-watcher/ # RSS feed monitor
├── tools/ # Core tools (search-web, run-script, etc.)
├── world-tools/ # World-specific tools (minecraft, fs, etc.)
├── static/ui/ # Activation Field frontend (HTML/JS/CSS)
├── scripts/ # Shell scripts for run-script tool
└── utils/ # Shared utilities
| Document | Description |
|---|---|
| Getting Started | Installation, credentials, LLM backend setup, first run |
| Architecture | Core cognitive architecture — incremental planner, OODA loop, infospace memory |
| OODA as Incremental Planner | Continuous strategic planner: action schemas, context assembly, event-action history (decisions log) |
| Cognitive Graph Spec | Typed nodes/edges, FAISS semantic index, consolidation, reflective trace (explorer) |
| Concerns Architecture | User + derived concerns, revisit lifecycle, homeostatic pressure, triage paths (user concern model) |
| UI Guide | Activation Field, Classic UI, Resource Browser (with Concerns tab), sensors |
| Goals & Scheduling | Goal submission (goal: prefix), scheduled goals, daily-at-time, autonomous execution |
| Envisioning & QC | Conversational envisioning, reflection, failure recovery, missing affordance monitoring |
| Tools & Primitives | Infospace primitives, tool catalog, run-script, plan tools |
| Configuration | Scenario YAML reference, available scenarios, directory structure |
| Tool Development | Creating new tools (Skill.md + tool.py) |
| Background | Research motivation and philosophy |
| Contributor Guidelines | Code style, testing, commit conventions |
See src/AGENTS.md for repository guidelines, code style, and commit conventions.
MIT License — see LICENSE.