seek

Personal hybrid search engine — BM25 + vector (qwen3-vl-embedding) for markdown notes, Claude Code & Codex conversations

التثبيت
CLI
npx skills add https://github.com/ethan-huo/seek --skill seek

قم بتثبيت هذه المهارة باستخدام واجهة سطر الأوامر (CLI) وابدأ في استخدام سير عمل SKILL.md في مساحة عملك.

آخر تحديث 4/22/2026

seek

Personal hybrid search engine for markdown notes, Claude Code conversations, and Codex conversations. BM25 full-text + vector semantic search with multimodal embedding.

Why

AI agents lose context between sessions. seek indexes everything — your notes, every Claude Code conversation, every Codex session, including screenshots — so agents can recall what you discussed weeks ago.

Install

# requires: go 1.24+, CGO
make build
ln -sf $(pwd)/seek /usr/local/bin/seek

As a skill (Claude Code, Codex, Cursor, etc.):

bunx skills add ethan-huo/seek

Setup

# Configure embedding API (DashScope / OpenAI / custom)
seek auth login

# Add your collections
seek add /path/to/notes --name mynotes    # markdown
seek add --claude                          # Claude Code conversations
seek add --codex                           # Codex conversations
seek add --images /path/to/images -n pics  # image files

# Generate embeddings
seek embed

Usage

# Hybrid search (BM25 + vector, recommended)
seek search "how to deploy the gateway"

# BM25 keyword search (fast, no API call)
seek search "ECONNREFUSED port 3000" --lex

# Vector semantic search (meaning-based)
seek search "functional programming architecture" --vec

# Incremental sync + embed new content
seek sync && seek embed

Automation

Background service — periodic sync + embed via launchd (macOS):

seek service start              # every 1 hour (default)
seek service start -i 1800      # every 30 minutes
seek service stop
seek service status

AI tool hooks — auto-sync after every conversation:

seek hooks install              # adds Stop hook to Claude Code
seek hooks uninstall

This writes a Stop hook into ~/.claude/settings.json so seek sync runs automatically when Claude finishes a conversation. Combined with the background service (which handles embed), your index stays current without manual intervention.

How It Works

Indexingseek sync scans collections incrementally. Markdown files are tracked by content hash. Claude/Codex JSONL files are append-only, tracked by line count. Base64 images in conversations are extracted to ~/.cache/seek/images/.

Embeddingseek embed generates vectors via qwen3-vl-embedding (multimodal). Text and images share the same vector space. Supports DashScope Batch API (50% cheaper) for bulk indexing.

Search — Three modes:

  • --lex: SQLite FTS5 BM25 ranking
  • --vec: Cosine similarity against stored embeddings
  • Default (hybrid): RRF fusion combining both

Storage — SQLite database at ~/.cache/seek/index.db. Config at ~/.config/seek/config.yaml.

Collections

Type Source What's indexed
markdown Any directory .md files, FTS + chunks + embeddings
claude ~/.claude/projects/ All Claude Code conversations + screenshots
codex ~/.codex/ All Codex sessions + screenshots
images Any directory Image files (png/jpg/webp) with VL embedding

Built With

License

MIT