A long-running AI agent loop. Ralph automates software development tasks by iteratively working through a task list until completion.
npx skills add https://github.com/pageai-pro/ralph-loop --skill prd-creator使用 CLI 安装这个技能,并在你的工作区中直接复用对应的 SKILL.md 工作流。
📚 Full docs: ralphloop.sh
Ralph is a long-running AI agent loop. Ralph automates software development tasks by iteratively working through a task list until completion. This allows for long running agent loops, effectively enabling AI to code for days at a time.
This is an implementation that actually works, containing a hackable script so you can configure it to your env and favorite agentic AI CLI. It's set up by default to use Claude Code in Docker Sandboxes, and supports multiple Docker Sandboxes agents.
I recommend using a CLI to bootstrap your project with the necessary tools and dependencies, e.g.:
npx @tanstack/cli create lib --add-ons eslint,form,tanstack-query,nitro --no-git
If you must start from a blank slate, which is not recommended, see Starting from scratch. You can also go for a more barebone start by running
npx create-vite@latest src --template react-ts
Run this in your project's directory to install Ralph.
npx @pageai/ralph-loop
Docs: ralphloop.sh
Use the prd-creator skill to generate a PRD from your requirements.
Open up Claude Code or Cursor etc., switch to plan mode, and prompt it with your requirements. Like so:
Use the prd-creator skill to help me create a PRD and task list for the below requirements.
An app is already set up with React, Tailwind CSS and TypeScript.
Requirements:
- A SaaS product that helps users manage their finances.
- Target audience: Small business owners and freelancers.
- Core features:
- Track income and expenses.
- Create and send invoices.
- Track payments and receipts.
- Generate reports and insights.
- Connect to bank accounts and credit cards.
- Connect to accounting software.
- Connect to payment processors.
- Use the shadcn/ui library for components.
- Integrate with Stripe for payments.
- Use Supabase for database.
- You can find env variables in the .env.example file: SUPABASE_URL, SUPABASE_PUBLISHABLE_KEY, STRIPE_SECRET_KEY, etc. are available in the runtime.
// etc.
Check out the video for a more realistic example on how to write requirements.
.env and add it to .gitignore/docs and mention them in the requirementsThen, follow the Skill's instructions and verify the PRD and then tasks.
It is highly recommended that you review individual task requirements before starting the loop. Review EACH TASK INDIVIDUALLY.
Install and sign in to Docker Sandboxes first. See Docker's AI overview and Docker Sandboxes docs for the official setup flow.
Authenticate your chosen Docker Sandboxes agent before running Ralph. Claude is the default:
./ralph.sh --login
Ralph prints all supported login commands, then opens the selected agent inside the correctly named sandbox. To log in to a different supported agent, pass --agent:
./ralph.sh --login --agent codex
Follow the instructions to log in to that agent. Ralph names sandboxes as ralph-<agent>-<current-dir>-<hash8>, for example ralph-claude-my-app-a1b2c3d4. To print the exact sandbox name without starting the loop, run:
./ralph.sh --print-name
./ralph.sh --print-name --agent cursor
Once the sandbox exists (after the first --login or after Ralph runs at least once), publish your dev server port to the host with ./ralph.sh --ports. The port is captured from the dev-server URL you set during install and stored in RALPH_DEFAULT_PORTS in ralph.sh:
./ralph.sh --ports
./ralph.sh --ports --agent codex
👉 Answer "Yes" to Bypass Permissions mode, that's the exact reason why you are using the Docker sandbox.
Alternative: API keys — you can configure API keys for supported agents instead of interactive login, but this is usually more expensive and not recommended. See Docker's supported agents docs for per-agent setup details.
If you want to use a different agentic CLI, see Running with a different agentic CLI.
./ralph.sh -n 50 # Run Ralph Loop with 50 iterations
✍️ Note: the first iteration will be spent on ensuring the named sandbox environment is set up correctly. Expect 5 minutes to complete.
# Run the agent loop (default: 10 iterations)
./ralph.sh
# Run with custom iteration limit
./ralph.sh 5
./ralph.sh -n 5
./ralph.sh --max-iterations 5
# Run exactly one iteration
./ralph.sh --once
# Run with a different supported agent
./ralph.sh --agent codex
./ralph.sh -a cursor -n 5
# Log in to an agent inside Ralph's deterministic sandbox
./ralph.sh --login
./ralph.sh --login --agent cursor
# Publish the configured dev port to the selected agent sandbox
./ralph.sh --ports
./ralph.sh --ports --agent codex
# Print the deterministic sandbox name
./ralph.sh --print-name
./ralph.sh --print-name --agent codex
# Pass extra options to the selected agent
./ralph.sh --agent codex -- --model gpt-5.5
./ralph.sh -a gemini -- --model pro
# Show help
./ralph.sh --help
NB: you might need to run
chmod +x ralph.shto make the script executable.
The default "mode" is "implementation". Depending on your use case, you might want to change
.agent/PROMPT.mdto a different mode, e.g. "refactor", "review", "test" etc.
⚠️ If you want to use a different language or testing framework, see below.
This script assumes the following are installed:
If you'd like to use a different language, testing framework etc. please adjust .agent/PROMPT.md to reflect your setup, server ports and startup commands etc.
👉 The loop is controlled by this prompt, which will be sent to the agent each iteration.
Each iteration, Ralph will:
.agent/tasks.json.agent/tasks/TASK-{ID}.jsonThis was kept hackable so you can make it your own.
The script follows the original concepts of the Ralph Wiggum Loop, working with fresh contexts and providing clear verifiable feedback.
It also works generically with any task set.
In some cases, you might notice the agent is having trouble, slowed down or struggling to overcome a blocker.
While the loop is running, you can edit the .agent/STEERING.md file to add critical work that needs to be done before the loop can continue.
The agent will check this file each iteration and if it finds any critical work, it will skip tasks and complete the critical work first.
The ralph.sh script is designed to be hackable.
It is configured to use Claude Code in Docker Sandboxes by default, and you can select another supported agent with --agent.
Each run uses a deterministic sandbox name in the form ralph-<agent>-<current-dir>-<hash8>, so different agents get separate sandboxes for the same project. On exit or double Ctrl+C, Ralph stops only the named sandbox for the current invocation.
NB: skills are supported by all major agentic AI CLIs via symlinks.
Ralph uses semantic tags to communicate status:
<promise>COMPLETE</promise> - All tasks finished successfully<promise>BLOCKED:reason</promise> - Agent needs human help<promise>DECIDE:question</promise> - Agent needs a decision| Code | Meaning |
|---|---|
| 0 | COMPLETE - All tasks finished |
| 1 | MAX_ITERATIONS - Reached limit |
| 2 | BLOCKED - Needs human help |
| 3 | DECIDE - Needs human decision |
.agent/
├── PROMPT.md # Prompt sent to Agent each iteration
├── tasks.json # Task lookup table (required)
├── tasks/ # Individual task specs (TASK-{ID}.json)
├── prd/
│ ├── PRD.md # Product requirements document
│ └── SUMMARY.md # Short project overview sent to Agent each iteration
├── logs/
│ └── LOG.md # Progress log (auto-created)
├── history/ # Iteration output logs
└── skills/ # Shared skills (source of truth)
As Ralph implements tasks, you might notice that you want some tweaks, features or even bug fixes.
To do so, you need to continue using the prd-creator skill to update the PRD and task list.
For example:
I would like to expand the PRD. Use the prd-creator skill to create these tasks:
- there is a bug in X that should be fixed by doing Y
- create a new feature that implement Z
etc.
It's fine to add multiple tasks at once.
Skills are reusable agent capabilities that provide specialized knowledge and workflows. The canonical source is .agent/skills/, which is symlinked to multiple agent tool directories for compatibility.
| Skill | Description |
|---|---|
component-refactoring |
Patterns for splitting and refactoring React components |
e2e-tester |
End-to-end testing workflows |
frontend-code-review |
Code quality and performance review guidelines |
frontend-testing |
Unit and integration testing patterns |
prd-creator |
Create PRDs and task breakdowns for Ralph |
skill-creator |
Create new skills |
vercel-react-best-practices |
React/Next.js performance patterns |
mysql |
MySQL/InnoDB schema, indexing, query tuning, and ops |
postgres |
PostgreSQL best practices and query optimization |
web-design-guidelines |
UI/UX design principles |
Skills are symlinked from .agent/skills/ to multiple locations for cross-tool compatibility:
# Source of truth
.agent/skills/
├── component-refactoring/
├── e2e-tester/
├── postgres/
├── ...
# Symlinks -> .agent/skills/*
.agents/skills/*
.claude/skills/*
.codex/skills/*
.cursor/skills/*
Docker Sandboxes runs coding agents in isolated microVMs. Ralph uses the standalone sbx CLI and starts agents with a deterministic --name so the sandbox can be reused and stopped cleanly.
Use ./ralph.sh --login to authenticate inside the correct named sandbox. Use ./ralph.sh --print-name if you only need the deterministic sandbox name for debugging.
Useful Docker references:
If you are using Playwright, here is a recommended configuration:
import { defineConfig, devices } from '@playwright/test';
/**
* See https://playwright.dev/docs/test-configuration.
*/
export default defineConfig({
testDir: './tests',
fullyParallel: true,
globalTimeout: 30 * 60 * 1000,
forbidOnly: !!process.env.CI,
retries: process.env.CI ? 2 : 1,
workers: process.env.CI ? 3 : 6,
reporter: 'html',
use: {
baseURL: 'http://localhost:3000',
trace: 'on-first-retry',
},
// NB: only chromium will run in Docker (arm64).
projects: [
{
name: 'chromium',
use: { ...devices['Desktop Chrome'] },
}
],
});
If you are using Vitest, here is a recommended configuration:
import { defineConfig } from "vitest/config";
import react from "@vitejs/plugin-react";
import path from "path";
export default defineConfig({
plugins: [react()],
test: {
environment: "node",
globals: true,
include: ["lib/**/*.test.ts", "lib/**/*.test.tsx"],
// setupFiles: ['./vitest.setup.ts'], // Include this if using Next.js
},
resolve: {
alias: {
"@": path.resolve(__dirname),
},
},
});
If you are using Next.js, you'll also need a vitest.setup.ts file to mock the next/image and next/link components.
import '@testing-library/jest-dom/vitest'
import { vi } from 'vitest'
import React from 'react'
// If using Next.js, mock next/image
vi.mock('next/image', () => ({
default: ({ src, alt, ...props }: { src: string; alt: string }) => {
return React.createElement('img', { src, alt, ...props })
},
}))
// If using Next.js, mock next/link
vi.mock('next/link', () => ({
default: ({
children,
href,
...props
}: {
children: React.ReactNode
href: string
}) => {
return React.createElement('a', { href, ...props }, children)
},
}))
Ralph can run several Docker Sandboxes agents without editing the script:
./ralph.sh --agent claude # default
./ralph.sh --agent codex
./ralph.sh --agent copilot
./ralph.sh --agent cursor
./ralph.sh --agent gemini
./ralph.sh --agent opencode
You can also pass agent-specific options after Ralph's -- separator:
./ralph.sh --agent codex -- --model gpt-5.3-codex
./ralph.sh --agent gemini -- --model pro
Ralph currently supports: claude, codex, copilot, cursor, gemini, and opencode.
See Docker's full list of supported agentic AI CLIs in Docker's docs.
Because the agent name is part of Ralph's sandbox name, switching agents creates or reuses that agent's own sandbox for the project.
For AI to actually verify its implementation and for the loop to work, you need a way to verify it.
To that end, at the minimum you'll need an end-to-end test framework and a unit test framework.
For example, you can use the following commands to install Playwright and Vitest:
npm i @playwright/test vitest jsdom typescript eslint prettier -D
# If using React, also recommend installing:
npm i @vitejs/plugin-react @testing-library/dom @testing-library/jest-dom @testing-library/react @testing-library/user-event -D
It is recommended that you add skills for your specific language and framework. See skills.sh to discover existing skills.
You might be wondering... if this is not a Docker container, how can you see what's going on inside Docker!?
How to debug/install things?
That's quite straightforward.
You first need to run:
sbx ls
Note down the name of the sandbox for the agent you ran, e.g. ralph-claude-my-app-a1b2c3d4. Ralph also prints this name when it starts.
And then you can run bash into any of the sandboxes like so:
sbx exec -it <sandbox-name> bash # e.g. sbx exec -it ralph-claude-my-app-a1b2c3d4 bash
And you have full control over the sandbox, just like a regular container. You can install packages, run commands, etc.
You can also run the agent CLI inside the sandbox (make sure to navigate to the project directory first).
sbx exec -it <sandbox-name> bash
cd /path/to/your/project # this is the same path as the path in the root machine, e.g. /Users/your-username/Documents/your-project
claude # or codex, copilot, cursor, gemini, opencode
Re-running ./ralph.sh, ./ralph.sh --login, or ./ralph.sh --ports is safe at any point — Ralph probes sbx ls and emits sbx run --name ... <agent> . only when the sandbox doesn't exist yet, and sbx run <sandbox-name> to reattach afterwards. The first form is create-only and would otherwise error with --name can only be used when creating a new sandbox. See RALPH.md for the full lifecycle.
MIT