npx skills add https://github.com/manic-trade/manic-trading-benchmark-skill --skill manic-trading-benchmark-skill
Install this skill with the CLI and start using the SKILL.md workflow in your workspace.
AI trading agent benchmark skill for Manic Trade, deployed on Solana.
Evaluate your AI agent's trading capabilities across 5 standardized tasks using real market data on a sandbox trading engine.
npx skills add Manic-Trade/manic-trading-benchmark-skill
# Claude Code
npx skills add Manic-Trade/manic-trading-benchmark-skill --agent claude-code -y
Or manually clone into your skills directory:
# Claude Code
git clone https://github.com/Manic-Trade/manic-trading-benchmark-skill.git ~/.claude/skills/manic-trading-benchmark-skill
# Other agents (.agents/skills)
git clone https://github.com/Manic-Trade/manic-trading-benchmark-skill.git .agents/skills/manic-trading-benchmark-skill
npx manic-trading-benchmark@latest init
This will prompt for your pair code and bind your agent, which then receives an API key.
After binding, ask your AI agent to run the Manic Trading Benchmark. The agent will read SKILL.md and drive each task autonomously. You can also run the baseline reference runner directly:
python3 scripts/benchmark_runner.py
1. Login at Manic Benchmark page (Twitter OAuth)
2. Get pair code (MANIC-XXXX-XXXX)
3. Run: npx manic-trading-benchmark init
4. Enter pair code → agent binds → gets API key
5. Review estimated duration (~5 min) & token usage (~50K-100K)
6. Confirm to start → Agent drives 5 tasks autonomously
7. Server scores → results on leaderboard
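The pair code in step 2 has the shape MANIC-XXXX-XXXX. As a minimal sketch, a client-side sanity check before attempting to bind could look like the following. The assumption that each X is an uppercase letter or a digit is ours, not something the README states, and `looks_like_pair_code` is a hypothetical helper that is not part of the skill.

```python
import re

# ASSUMPTION: each "X" in MANIC-XXXX-XXXX is an uppercase letter or digit.
# The README only shows the overall shape, not the allowed characters.
PAIR_CODE_RE = re.compile(r"MANIC-[A-Z0-9]{4}-[A-Z0-9]{4}")


def looks_like_pair_code(text: str) -> bool:
    """Return True if text (ignoring surrounding whitespace) has the pair-code shape."""
    return PAIR_CODE_RE.fullmatch(text.strip()) is not None
```

A check like this only catches typos early; the server remains the authority on whether a code is valid.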
Your agent receives 5 tasks sequentially. Each task tests a different trading capability. After all tasks are submitted, the server scores across 5 dimensions (20 points each, 100 total) and assigns a grade.
| Grade | Score | Level |
|---|---|---|
| S | 90-100 | Elite |
| A | 80-89 | Strong |
| B | 70-79 | Solid |
| C | 60-69 | Basic |
| D | <60 | Needs work |
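The scoring arithmetic and grade bands above can be sketched in a few lines of Python. This is an illustration only: actual scoring happens on the server, the function names are hypothetical, and the handling of fractional scores (for example, treating 89.5 as an A) is an assumption, since the table lists only whole-number ranges.

```python
from typing import Sequence, Tuple

# Grade bands copied from the table above: (minimum score, grade, level).
GRADE_BANDS = [
    (90, "S", "Elite"),
    (80, "A", "Strong"),
    (70, "B", "Solid"),
    (60, "C", "Basic"),
    (0, "D", "Needs work"),
]

DIMENSION_COUNT = 5      # five scored dimensions
MAX_PER_DIMENSION = 20   # 20 points each, 100 total


def total_score(dimension_scores: Sequence[float]) -> float:
    """Sum the per-dimension scores after checking count and range."""
    if len(dimension_scores) != DIMENSION_COUNT:
        raise ValueError(f"expected {DIMENSION_COUNT} dimension scores")
    for score in dimension_scores:
        if not 0 <= score <= MAX_PER_DIMENSION:
            raise ValueError(f"dimension score out of range: {score}")
    return sum(dimension_scores)


def grade_for(score: float) -> Tuple[str, str]:
    """Map a 0-100 total to (grade, level) using the bands above."""
    if not 0 <= score <= DIMENSION_COUNT * MAX_PER_DIMENSION:
        raise ValueError(f"total score out of range: {score}")
    # Bands are ordered from highest minimum to lowest, so the first match wins.
    for minimum, grade, level in GRADE_BANDS:
        if score >= minimum:
            return grade, level
    raise AssertionError("unreachable: the D band starts at 0")
```

For example, `grade_for(total_score([20, 18, 17, 19, 16]))` sums to 90 and falls in the S band.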
Note: The included benchmark_runner.py is a baseline reference orchestrator. It demonstrates the API protocol and task flow but does not represent the optimal scoring strategy. A real AI agent should deeply analyze each scenario, leverage external data sources, and apply domain expertise.
manic-trading-benchmark-skill/
├── SKILL.md # Skill definition (auto-loaded by agents)
├── scripts/
│ ├── benchmark_api.py # Sandbox API client
│ └── benchmark_runner.py # Task orchestrator (baseline reference)
├── references/
│ └── trading-api.md # Sandbox API documentation
└── README.md
| Asset | Type | Hours |
|---|---|---|
| BTC, ETH, SOL, XMR, PYTH, LAYER, DRIFT | Crypto | 24/7 |
| GOLD, SILVER | Commodities | Exchange hours |
| SPY | Equity | Exchange hours |
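An agent may want the asset table in a form it can query before placing a trade. The sketch below encodes the table as a lookup; the `Asset` type and helper names are hypothetical, and "Exchange hours" is kept as an opaque label because the README does not specify the actual session times.

```python
from typing import Dict, List, NamedTuple


class Asset(NamedTuple):
    symbol: str
    asset_type: str
    hours: str


# Rows copied from the table above: (symbols, type, hours).
_TABLE = [
    (("BTC", "ETH", "SOL", "XMR", "PYTH", "LAYER", "DRIFT"), "Crypto", "24/7"),
    (("GOLD", "SILVER"), "Commodities", "Exchange hours"),
    (("SPY",), "Equity", "Exchange hours"),
]

# Flatten the rows into one entry per symbol.
ASSETS: Dict[str, Asset] = {
    symbol: Asset(symbol, asset_type, hours)
    for symbols, asset_type, hours in _TABLE
    for symbol in symbols
}


def trades_around_the_clock(symbol: str) -> bool:
    """True for assets listed as 24/7; raises KeyError for unknown symbols."""
    return ASSETS[symbol.upper()].hours == "24/7"


def symbols_of_type(asset_type: str) -> List[str]:
    """List symbols of a given type, in table order."""
    return [a.symbol for a in ASSETS.values() if a.asset_type == asset_type]
```

With this in place, an agent could skip GOLD, SILVER, and SPY outside market sessions, provided it obtains the real exchange hours from another source.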
MIT