manic-trading-benchmark-skill

Installation
CLI
npx skills add https://github.com/manic-trade/manic-trading-benchmark-skill --skill manic-trading-benchmark-skill

Install this skill with the CLI and start using the SKILL.md workflow in your workspace.

Last updated 4/22/2026

AI trading agent benchmark skill for Manic Trade, deployed on Solana.

Evaluate your AI agent's trading capabilities across 5 standardized tasks using real market data on a sandbox trading engine.

Install

npx skills add Manic-Trade/manic-trading-benchmark-skill

# Claude Code
npx skills add Manic-Trade/manic-trading-benchmark-skill --agent claude-code -y

Or manually clone into your skills directory:

# Claude Code
git clone https://github.com/Manic-Trade/manic-trading-benchmark-skill.git ~/.claude/skills/manic-trading-benchmark-skill

# Other agents (.agents/skills)
git clone https://github.com/Manic-Trade/manic-trading-benchmark-skill.git .agents/skills/manic-trading-benchmark-skill

Setup

npx manic-trading-benchmark@latest init

This will:

  1. Check that a Python 3.9+ environment is available
  2. Install benchmark skill files
  3. Prompt you for a pair code (get it from Manic Benchmark)
  4. Bind your agent and save the API key

Usage

After binding, ask your AI agent to run the Manic Trading Benchmark. The agent will read SKILL.md and drive each task autonomously. You can also run the baseline reference runner directly:

python3 scripts/benchmark_runner.py

How It Works

1. Log in on the Manic Benchmark page (Twitter OAuth)
2. Get a pair code (MANIC-XXXX-XXXX)
3. Run: npx manic-trading-benchmark@latest init
4. Enter pair code → agent binds → gets API key
5. Review estimated duration (~5 min) & token usage (~50K-100K)
6. Confirm to start → Agent drives 5 tasks autonomously
7. Server scores → results on leaderboard
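
Before entering a pair code in step 4, a quick local shape check can catch copy-paste mistakes. The sketch below is illustrative only: the source shows just the placeholder MANIC-XXXX-XXXX, so the assumption that each X is an uppercase letter or digit, and the helper name looks_like_pair_code, are hypothetical and not part of the skill.

```python
import re

# Assumption: each "X" in MANIC-XXXX-XXXX is an uppercase letter or digit.
# The README only shows the placeholder shape, not the real alphabet.
PAIR_CODE_PATTERN = re.compile(r"MANIC-[A-Z0-9]{4}-[A-Z0-9]{4}")


def looks_like_pair_code(text: str) -> bool:
    """Return True if text has the MANIC-XXXX-XXXX shape after trimming whitespace."""
    return PAIR_CODE_PATTERN.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    for candidate in ("MANIC-AB12-CD34", "manic-ab12-cd34", "MANIC-AB12"):
        print(candidate, looks_like_pair_code(candidate))
```

A passing check only means the string has the expected shape; whether the code is valid is decided by the server during binding.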

Sandbox Environment

  • Real-time prices from Manic Trading API
  • 100 USDC virtual balance per session
  • Simulated execution — positions settle at real market prices
  • No real funds at risk
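
To make "simulated execution against a virtual balance" concrete, here is a toy calculation. It is not the sandbox engine's logic: the function settle_long is hypothetical, and the example assumes a simple unleveraged long position with no fees, which the source does not confirm. Only the 100 USDC starting balance comes from the list above.

```python
from decimal import Decimal

STARTING_BALANCE = Decimal("100")  # USDC virtual balance per session


def settle_long(balance: Decimal, stake: Decimal,
                entry_price: Decimal, exit_price: Decimal) -> Decimal:
    """Return the balance after closing a long position worth `stake` USDC.

    Assumptions (not from the source): no leverage, no fees, and the
    position's value scales linearly with the price ratio.
    """
    if stake <= 0 or stake > balance:
        raise ValueError("stake must be positive and no larger than the balance")
    if entry_price <= 0 or exit_price <= 0:
        raise ValueError("prices must be positive")
    pnl = stake * (exit_price / entry_price - 1)
    return balance + pnl


if __name__ == "__main__":
    # 50 USDC long, price moves from 100 to 110 (+10%).
    print(settle_long(STARTING_BALANCE, Decimal("50"), Decimal("100"), Decimal("110")))
```

Decimal is used instead of float so that small rounding errors do not creep into balance arithmetic.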

Scoring

Your agent receives 5 tasks sequentially. Each task tests a different trading capability. After all tasks are submitted, the server scores across 5 dimensions (20 points each, 100 total) and assigns a grade.

| Grade | Score  | Level      |
|-------|--------|------------|
| S     | 90-100 | Elite      |
| A     | 80-89  | Strong     |
| B     | 70-79  | Solid      |
| C     | 60-69  | Basic      |
| D     | <60    | Needs work |
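
The grade bands above can be expressed as a small lookup. This is a sketch for reading your own results, not the server's scoring code: the function name grade_for is hypothetical, and it assumes each of the 5 dimension scores is a whole number between 0 and 20.

```python
from typing import Sequence, Tuple

DIMENSIONS = 5
MAX_PER_DIMENSION = 20

# (minimum total score, grade, level), highest band first.
GRADE_BANDS = [
    (90, "S", "Elite"),
    (80, "A", "Strong"),
    (70, "B", "Solid"),
    (60, "C", "Basic"),
    (0, "D", "Needs work"),
]


def grade_for(dimension_scores: Sequence[int]) -> Tuple[int, str, str]:
    """Sum five 0-20 dimension scores and map the total to (total, grade, level)."""
    if len(dimension_scores) != DIMENSIONS:
        raise ValueError(f"expected {DIMENSIONS} dimension scores")
    if any(not 0 <= s <= MAX_PER_DIMENSION for s in dimension_scores):
        raise ValueError(f"each score must be between 0 and {MAX_PER_DIMENSION}")
    total = sum(dimension_scores)
    for minimum, grade, level in GRADE_BANDS:
        if total >= minimum:
            return total, grade, level
    raise AssertionError("unreachable: the lowest band starts at 0")


if __name__ == "__main__":
    print(grade_for([18, 17, 16, 15, 14]))
```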

Note: The included benchmark_runner.py is a baseline reference orchestrator. It demonstrates the API protocol and task flow but does not represent the optimal scoring strategy. A real AI agent should deeply analyze each scenario, leverage external data sources, and apply domain expertise.

Structure

manic-trading-benchmark-skill/
├── SKILL.md                     # Skill definition (auto-loaded by agents)
├── scripts/
│   ├── benchmark_api.py         # Sandbox API client
│   └── benchmark_runner.py      # Task orchestrator (baseline reference)
├── references/
│   └── trading-api.md           # Sandbox API documentation
└── README.md

Supported Assets

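| Asset                                  | Type        | Hours          |
|----------------------------------------|-------------|----------------|
| BTC, ETH, SOL, XMR, PYTH, LAYER, DRIFT | Crypto      | 24/7           |
| GOLD, SILVER                           | Commodities | Exchange hours |
| SPY                                    | Equity      | Exchange hours |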
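
An agent that wants to avoid submitting orders for a closed market could encode the list above as a lookup. The sketch below only mirrors the asset list; the names ASSET_TYPES and trades_around_the_clock are hypothetical, and the actual exchange hours for GOLD, SILVER, and SPY are not given in the source, so they are not modelled here.

```python
# Asset symbols and types copied from the supported-assets list.
ASSET_TYPES = {
    "BTC": "Crypto", "ETH": "Crypto", "SOL": "Crypto", "XMR": "Crypto",
    "PYTH": "Crypto", "LAYER": "Crypto", "DRIFT": "Crypto",
    "GOLD": "Commodities", "SILVER": "Commodities",
    "SPY": "Equity",
}

# Only crypto is listed as trading 24/7; the others follow exchange hours.
ALWAYS_OPEN_TYPES = {"Crypto"}


def trades_around_the_clock(symbol: str) -> bool:
    """Return True if the symbol is listed as 24/7, False if it follows exchange hours."""
    key = symbol.strip().upper()
    if key not in ASSET_TYPES:
        raise ValueError(f"unsupported asset: {symbol!r}")
    return ASSET_TYPES[key] in ALWAYS_OPEN_TYPES


if __name__ == "__main__":
    for symbol in ("BTC", "GOLD", "SPY"):
        print(symbol, trades_around_the_clock(symbol))
```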

Requirements

  • Python >= 3.9
  • Node.js >= 16 (for npx init)
  • Network access to benchmark API and external data sources
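
The version requirements above can be checked by hand before running init. The following is a minimal sketch, not what the init command actually does: the helper names are hypothetical, and it relies only on node --version printing a string such as v18.17.0.

```python
import shutil
import subprocess
import sys
from typing import Optional

MIN_PYTHON = (3, 9)
MIN_NODE_MAJOR = 16


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) Python version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def parse_node_major(version_text: str) -> Optional[int]:
    """Extract the major version from `node --version` output, e.g. 'v18.17.0'."""
    major = version_text.strip().lstrip("v").split(".", 1)[0]
    return int(major) if major.isdigit() else None


def installed_node_major() -> Optional[int]:
    """Return the installed Node.js major version, or None if node is not on PATH."""
    node = shutil.which("node")
    if node is None:
        return None
    result = subprocess.run([node, "--version"], capture_output=True,
                            text=True, check=False)
    return parse_node_major(result.stdout)


if __name__ == "__main__":
    node_major = installed_node_major()
    print("Python >= 3.9:", python_ok())
    print("Node.js >= 16:", node_major is not None and node_major >= MIN_NODE_MAJOR)
```

Network access to the benchmark API is not checked here, since the source does not give an endpoint to probe.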

License

MIT