OpenViking is an open-source context database designed specifically for AI Agents (such as OpenClaw). OpenViking unifies the management of the context (memory, resources, and skills) that Agents need through a file system paradigm, enabling hierarchical context delivery and self-evolution.
npx skills add https://github.com/volcengine/openviking --skill openviking-memory

Install this skill via the CLI and start using the SKILL.md workflow in your workspace.
Website · GitHub · Issues · Docs
👋 Join our Community
📱 Lark Group · WeChat · Discord · X
In the AI era, data is abundant, but high-quality context is hard to come by, and developers building AI Agents face a recurring set of challenges.
OpenViking is an open-source Context Database designed specifically for AI Agents.
We aim to define a minimalist context interaction paradigm for Agents, allowing developers to completely say goodbye to the hassle of context management. OpenViking abandons the fragmented vector storage model of traditional RAG and innovatively adopts a "file system paradigm" to unify the structured organization of memories, resources, and skills needed by Agents.
With OpenViking, developers can build an Agent's brain just like managing local files:
Before starting with OpenViking, please ensure your environment meets the following requirements:
pip install openviking --upgrade --force-reinstall
curl -fsSL https://raw.githubusercontent.com/volcengine/OpenViking/main/crates/ov_cli/install.sh | bash
Or build from source:
cargo install --git https://github.com/volcengine/OpenViking ov_cli
OpenViking requires the following model capabilities:
OpenViking supports multiple VLM providers:
| Provider | Description | Setup |
|---|---|---|
| `volcengine` | Volcengine Doubao Models | Volcengine Console |
| `openai` | OpenAI Official API | OpenAI Platform |
| `openai-codex` | Codex VLM | Use `openviking-server init` |
| `kimi` | Kimi Code Membership | Use `openviking-server init` |
| `glm` | GLM Coding Plan | Use `openviking-server init` |
Volcengine supports both model names and endpoint IDs. Using model names is recommended for simplicity:
{
"vlm": {
"provider": "volcengine",
"model": "doubao-seed-2-0-pro-260215",
"api_key": "your-api-key",
"api_base": "https://ark.cn-beijing.volces.com/api/v3"
}
}
You can also use endpoint IDs (found in the Volcengine ARK Console):
{
"vlm": {
"provider": "volcengine",
"model": "ep-20241220174930-xxxxx",
"api_key": "your-api-key",
"api_base": "https://ark.cn-beijing.volces.com/api/v3"
}
}
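Since Volcengine accepts either a plain model name or an ARK endpoint ID, a tiny helper can tell the two apart by the `ep-` prefix shown above. This is an illustrative sketch, not part of OpenViking:

```python
def describe_vlm_model(conf: dict) -> str:
    """Distinguish a Volcengine ARK endpoint ID ('ep-...') from a plain
    model name in the "vlm" config block. Illustrative helper only."""
    model = conf["vlm"]["model"]
    return "endpoint-id" if model.startswith("ep-") else "model-name"

by_name = {"vlm": {"provider": "volcengine", "model": "doubao-seed-2-0-pro-260215"}}
by_ep = {"vlm": {"provider": "volcengine", "model": "ep-20241220174930-xxxxx"}}
print(describe_vlm_model(by_name))  # model-name
print(describe_vlm_model(by_ep))    # endpoint-id
```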
Use OpenAI's official API:
{
"vlm": {
"provider": "openai",
"model": "gpt-4o",
"api_key": "your-api-key",
"api_base": "https://api.openai.com/v1"
}
}
You can also use a custom OpenAI-compatible endpoint:
{
"vlm": {
"provider": "openai",
"model": "gpt-4o",
"api_key": "your-api-key",
"api_base": "https://your-custom-endpoint.com/v1"
}
}
Use this provider when you want OpenViking to call Codex VLM through your ChatGPT/Codex OAuth session instead of a standard OpenAI API key:
openviking-server init
# choose OpenAI Codex when prompted
openviking-server doctor
{
"vlm": {
"provider": "openai-codex",
"model": "gpt-5.3-codex",
"api_base": "https://chatgpt.com/backend-api/codex",
"temperature": 0.0,
"max_retries": 2
}
}
💡 Tip:
- `openai-codex` does not require `vlm.api_key` when Codex OAuth is available
- OpenViking stores its own Codex auth state at `~/.openviking/codex_auth.json`
- `openviking-server doctor` validates that the current Codex auth is usable
Use this provider when you want OpenViking to call the dedicated Kimi Coding subscription endpoint directly:
openviking-server init
# choose Kimi Coding when prompted
openviking-server doctor
{
"vlm": {
"provider": "kimi",
"model": "kimi-code",
"api_key": "your-kimi-subscription-api-key",
"api_base": "https://api.kimi.com/coding",
"temperature": 0.0,
"max_retries": 2
}
}
💡 Tip:
- `kimi` applies the recommended Kimi Coding defaults automatically, including the default Kimi Coding user agent
- `kimi-code` and `kimi-coding` are accepted aliases for the provider name
- `kimi-code` is normalized to Kimi's upstream coding model automatically
Use this provider when you want OpenViking to call Z.AI's OpenAI-compatible Coding Plan endpoint directly:
openviking-server init
# choose GLM Coding Plan when prompted
openviking-server doctor
{
"vlm": {
"provider": "glm",
"model": "glm-4.6v",
"api_key": "your-zai-api-key",
"api_base": "https://api.z.ai/api/coding/paas/v4",
"temperature": 0.0,
"max_retries": 2
}
}
💡 Tip:
- `glm`, `zhipu`, `zai`, `z-ai`, and `z.ai` all resolve to the same first-class GLM provider
- The default endpoint is the Coding Plan endpoint, not the general Z.AI endpoint
- Use a vision-capable model such as `glm-4.6v` or `glm-5v-turbo` for multimodal parsing
If you want to run OpenViking with local models via Ollama, the interactive setup wizard handles everything automatically:
openviking-server init
The wizard walks you through provider setup and writes the `ov.conf` configuration file for you.

To validate your setup at any time:
openviking-server doctor
doctor checks local prerequisites (config file, Python version, embedding/VLM provider connectivity, disk space) without requiring a running server.
For cloud API providers (Volcengine, OpenAI, Gemini, etc.), continue with the manual configuration below.
The recommended first-time flow is:
openviking-server init
openviking-server doctor
If you choose OpenAI Codex inside openviking-server init, the wizard can import existing Codex auth or start the Codex sign-in flow for you.
If you prefer manual configuration, create `~/.openviking/ov.conf` (remove the comments before copying, since JSON does not support them):
{
"storage": {
"workspace": "/home/your-name/openviking_workspace"
},
"log": {
"level": "INFO",
"output": "stdout" // Log output: "stdout" or "file"
},
"embedding": {
"dense": {
"api_base" : "<api-endpoint>", // API endpoint address
"api_key" : "<your-api-key>", // Model service API Key
"provider" : "<provider-type>", // Provider type: "volcengine" or "openai" (currently supported)
"dimension": 1024, // Vector dimension
"model" : "<model-name>" // Embedding model name (e.g., doubao-embedding-vision-251215 or text-embedding-3-large)
},
"max_concurrent": 10 // Max concurrent embedding requests (default: 10)
},
"vlm": {
"api_base" : "<api-endpoint>", // API endpoint address
"api_key" : "<your-api-key>", // Model service API Key (optional for openai-codex)
"provider" : "<provider-type>", // Provider type (volcengine, openai, openai-codex, kimi, glm, etc.)
"model" : "<model-name>", // VLM model name (e.g., doubao-seed-2-0-pro-260215 or gpt-4-vision-preview)
"max_concurrent": 100 // Max concurrent LLM calls for semantic processing (default: 100)
}
}
Note: For embedding models, supported providers are `volcengine` (Doubao), `openai`, `jina`, `voyage`, `minimax`, `vikingdb`, and `gemini` (requires `pip install "google-genai>=1.0.0"`). For VLM models, common providers include `volcengine`, `openai`, `openai-codex`, `kimi`, and `glm`.
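Standard JSON parsers reject the `//` comments used in the template above, and a naive strip would also mangle the `//` inside URL values such as `api_base`. Below is a minimal comment stripper that tracks string state; it is an illustrative sketch, not an OpenViking utility:

```python
import json

def strip_json_comments(text):
    """Remove //-style line comments while leaving '//' inside quoted
    strings (e.g. https:// URLs in api_base) untouched. Illustrative
    helper; OpenViking itself expects plain JSON with no comments."""
    out, in_str, i = [], False, 0
    while i < len(text):
        ch = text[i]
        if in_str:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):   # copy escaped char verbatim
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_str = False
        elif ch == '"':
            in_str = True
            out.append(ch)
        elif text[i:i + 2] == "//":
            while i < len(text) and text[i] != "\n":  # drop rest of the line
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)

raw = '{ "api_base": "https://ark.cn-beijing.volces.com/api/v3" // endpoint\n}'
print(json.loads(strip_json_comments(raw)))  # {'api_base': 'https://ark.cn-beijing.volces.com/api/v3'}
```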
👇 Expand to see the configuration example for your model service:
{
"storage": {
"workspace": "/home/your-name/openviking_workspace"
},
"log": {
"level": "INFO",
"output": "stdout" // Log output: "stdout" or "file"
},
"embedding": {
"dense": {
"api_base" : "https://ark.cn-beijing.volces.com/api/v3",
"api_key" : "your-volcengine-api-key",
"provider" : "volcengine",
"dimension": 1024,
"model" : "doubao-embedding-vision-251215"
},
"max_concurrent": 10
},
"vlm": {
"api_base" : "https://ark.cn-beijing.volces.com/api/v3",
"api_key" : "your-volcengine-api-key",
"provider" : "volcengine",
"model" : "doubao-seed-2-0-pro-260215",
"max_concurrent": 100
}
}
{
"storage": {
"workspace": "/home/your-name/openviking_workspace"
},
"log": {
"level": "INFO",
"output": "stdout" // Log output: "stdout" or "file"
},
"embedding": {
"dense": {
"api_base" : "https://api.openai.com/v1",
"api_key" : "your-openai-api-key",
"provider" : "openai",
"dimension": 3072,
"model" : "text-embedding-3-large"
},
"max_concurrent": 10
},
"vlm": {
"api_base" : "https://api.openai.com/v1",
"api_key" : "your-openai-api-key",
"provider" : "openai",
"model" : "gpt-4-vision-preview",
"max_concurrent": 100
}
}
Install the required package first:
pip install "google-genai>=1.0.0"
{
"storage": {
"workspace": "/home/your-name/openviking_workspace"
},
"embedding": {
"dense": {
"provider": "gemini",
"api_key": "your-google-api-key",
"model": "gemini-embedding-2-preview",
"dimension": 3072
},
"max_concurrent": 10
},
"vlm": {
"api_base" : "https://api.openai.com/v1",
"api_key" : "your-openai-api-key",
"provider" : "openai",
"model" : "gpt-4o",
"max_concurrent": 100
}
}
Get your Google API key at https://aistudio.google.com/apikey
Use openviking-server init and choose OpenAI Codex, then run openviking-server doctor.
{
"storage": {
"workspace": "/home/your-name/openviking_workspace"
},
"embedding": {
"dense": {
"api_base" : "https://ark.cn-beijing.volces.com/api/v3",
"api_key" : "your-volcengine-api-key",
"provider" : "volcengine",
"dimension": 1024,
"model" : "doubao-embedding-vision-251215"
}
},
"vlm": {
"api_base" : "https://chatgpt.com/backend-api/codex",
"provider" : "openai-codex",
"model" : "gpt-5.3-codex",
"max_concurrent": 100
}
}
After creating the configuration file, set the environment variable to point to it (Linux/macOS):
export OPENVIKING_CONFIG_FILE=~/.openviking/ov.conf # by default
On Windows, use one of the following:
PowerShell:
$env:OPENVIKING_CONFIG_FILE = "$HOME/.openviking/ov.conf"
Command Prompt (cmd.exe):
set "OPENVIKING_CONFIG_FILE=%USERPROFILE%\.openviking\ov.conf"
💡 Tip: You can also place the configuration file in other locations, just specify the correct path in the environment variable.
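For reference, a process consuming this variable might resolve the config path as below; the `~/.openviking/ov.conf` fallback is the default mentioned above, and this is an illustrative sketch rather than OpenViking's actual lookup logic:

```python
import os
from pathlib import Path

def resolve_config_path() -> Path:
    """Resolve the OpenViking config location: the OPENVIKING_CONFIG_FILE
    environment variable wins; otherwise fall back to the default
    ~/.openviking/ov.conf. Illustrative sketch only."""
    env = os.environ.get("OPENVIKING_CONFIG_FILE")
    return Path(env).expanduser() if env else Path.home() / ".openviking" / "ov.conf"

os.environ["OPENVIKING_CONFIG_FILE"] = "~/configs/ov.conf"
print(resolve_config_path())  # <home>/configs/ov.conf
del os.environ["OPENVIKING_CONFIG_FILE"]
print(resolve_config_path())  # <home>/.openviking/ov.conf
```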
👇 Expand to see the configuration example for your CLI/Client:
Example: `ovcli.conf` for connecting to a server on localhost
{
"url": "http://localhost:1933",
"timeout": 60.0,
"output": "table"
}
After creating the configuration file, set the environment variable to point to it (Linux/macOS):
export OPENVIKING_CLI_CONFIG_FILE=~/.openviking/ovcli.conf # by default
On Windows, use one of the following:
PowerShell:
$env:OPENVIKING_CLI_CONFIG_FILE = "$HOME/.openviking/ovcli.conf"
Command Prompt (cmd.exe):
set "OPENVIKING_CLI_CONFIG_FILE=%USERPROFILE%\.openviking\ovcli.conf"
📝 Prerequisite: Ensure you have completed the configuration (ov.conf and ovcli.conf) in the previous step.
Now let's run a complete example to experience the core features of OpenViking.
openviking-server doctor
openviking-server
If you configured provider=openai-codex, openviking-server doctor already validates Codex auth.
Or run it in the background:
nohup openviking-server > /data/log/openviking.log 2>&1 &
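When the server is launched in the background, it may take a moment to begin listening. A small sketch that polls the port before you run `ov status`; port 1933 is taken from the `ovcli.conf` example shown earlier and may differ in your setup:

```python
import socket
import time

def wait_for_port(host, port, timeout=30.0):
    """Poll until a TCP port accepts connections or the timeout expires.
    Port 1933 matches the ovcli.conf example in this guide; adjust it
    if your server listens elsewhere. Illustrative helper."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True   # something is accepting connections
        except OSError:
            time.sleep(0.5)   # not up yet; retry until the deadline
    return False

if wait_for_port("localhost", 1933, timeout=1.0):
    print("server is up; safe to run `ov status`")
else:
    print("server not reachable yet")
```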
ov status
ov add-resource https://github.com/volcengine/OpenViking # --wait
ov ls viking://resources/
ov tree viking://resources/volcengine -L 2
# wait some time for semantic processing if not --wait
ov find "what is openviking"
ov grep "openviking" --uri viking://resources/volcengine/OpenViking/docs/zh
Congratulations! You have successfully run OpenViking 🎉
VikingBot is an AI agent framework built on top of OpenViking. Here's how to get started:
# Option 1: Install VikingBot from PyPI (recommended for most users)
pip install "openviking[bot]"
# Option 2: Install VikingBot from source (for development)
uv pip install -e ".[bot]"
# Start OpenViking server with Bot enabled
openviking-server --with-bot
# In another terminal, start interactive chat
ov chat
If you use the official Docker image, VikingBot is already bundled in the image and starts by default together with the OpenViking server and console UI. You can disable it at runtime with either `--without-bot` or `-e OPENVIKING_WITH_BOT=0`.
For production environments, we recommend running OpenViking as a standalone HTTP service to provide persistent, high-performance context support for your AI Agents.
🚀 Deploy OpenViking on Cloud:
To ensure optimal storage performance and data security, we recommend deploying on Volcengine Elastic Compute Service (ECS) using the veLinux operating system. We have prepared a detailed step-by-step guide to get you started quickly.
👉 View: Server Deployment & ECS Setup Guide
| Experimental Group | Task Completion Rate | Cost: Input Tokens (Total) |
|---|---|---|
| OpenClaw(memory-core) | 35.65% | 24,611,530 |
| OpenClaw + LanceDB (-memory-core) | 44.55% | 51,574,530 |
| OpenClaw + OpenViking Plugin (-memory-core) | 52.08% | 4,264,396 |
| OpenClaw + OpenViking Plugin (+memory-core) | 51.23% | 2,099,622 |
👉 View: OpenClaw Context Plugin
👉 View: OpenCode Memory Plugin Example
👉 View: Claude Code Memory Plugin Example
---
After running the first example, let's dive into the design philosophy of OpenViking. These five core concepts correspond one-to-one with the solutions mentioned earlier, together building a complete context management system:
We no longer view context as flat text slices; instead, we unify it into an abstract virtual filesystem. Whether it's memories, resources, or capabilities, everything is mapped to virtual directories under the viking:// protocol, each with a unique URI.
This paradigm gives Agents unprecedented context manipulation capabilities, enabling them to locate, browse, and manipulate information precisely and deterministically through standard commands like ls and find, just like a developer. This transforms context management from vague semantic matching into intuitive, traceable "file operations". Learn more: Viking URI | Context Types
viking://
├── resources/ # Resources: project docs, repos, web pages, etc.
│ ├── my_project/
│ │ ├── docs/
│ │ │ ├── api/
│ │ │ └── tutorials/
│ │ └── src/
│ └── ...
├── user/ # User: personal preferences, habits, etc.
│ └── memories/
│ ├── preferences/
│ │ ├── writing_style
│ │ └── coding_habits
│ └── ...
└── agent/ # Agent: skills, instructions, task memories, etc.
├── skills/
│ ├── search_code
│ ├── analyze_data
│ └── ...
├── memories/
└── instructions/
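Every entry in the tree above is addressed by a URI such as `viking://resources/my_project/docs/api`. A minimal parser for such URIs, as an illustrative sketch based on this layout rather than OpenViking's own implementation:

```python
def parse_viking_uri(uri):
    """Split a viking:// URI into its top-level space ('resources',
    'user', or 'agent') and the remaining path segments. Illustrative
    parser based on the tree above, not OpenViking's own code."""
    prefix = "viking://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a viking URI: {uri}")
    parts = [p for p in uri[len(prefix):].split("/") if p]
    return parts[0], parts[1:]

space, path = parse_viking_uri("viking://resources/my_project/docs/api")
print(space, path)  # resources ['my_project', 'docs', 'api']
```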
Stuffing massive amounts of context into a prompt all at once is not only expensive but also prone to exceeding model windows and introducing noise. OpenViking automatically processes context into three levels upon writing:
Learn more: Context Layers
viking://resources/my_project/
├── .abstract # L0 Layer: Abstract (~100 tokens) - Quick relevance check
├── .overview # L1 Layer: Overview (~2k tokens) - Understand structure and key points
├── docs/
│ ├── .abstract # Each directory has corresponding L0/L1 layers
│ ├── .overview
│ ├── api/
│ │ ├── .abstract
│ │ ├── .overview
│ │ ├── auth.md # L2 Layer: Full content - Load on demand
│ │ └── endpoints.md
│ └── ...
└── src/
└── ...
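The layering above lends itself to a progressive-disclosure read loop: gate on the cheap `.abstract`, escalate to `.overview` only when the node looks relevant, and leave full L2 content to on-demand reads. A sketch using a plain local directory as a stand-in (in practice the relevance check would be an LLM or embedding call):

```python
import tempfile
from pathlib import Path

def load_context(node: Path, relevant):
    """Progressive disclosure over the L0/L1/L2 layout sketched above:
    read the ~100-token .abstract as a cheap gate, escalate to the
    ~2k-token .overview only if the node looks relevant, and leave the
    full (L2) files to on-demand reads. `relevant` stands in for a real
    relevance check. Illustrative sketch, not OpenViking's code."""
    abstract = (node / ".abstract").read_text()   # L0: quick relevance check
    if not relevant(abstract):
        return None                               # prune the whole subtree
    return (node / ".overview").read_text()       # L1: structure and key points

# Demo: a temporary directory stands in for viking://resources/my_project/
root = Path(tempfile.mkdtemp())
(root / ".abstract").write_text("OpenViking project docs")
(root / ".overview").write_text("contains docs/ and src/")
print(load_context(root, lambda a: "OpenViking" in a))  # contains docs/ and src/
```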
Single vector retrieval struggles with complex query intents. OpenViking has designed an innovative Directory Recursive Retrieval Strategy that deeply integrates multiple retrieval methods:
This "lock high-score directory first, then refine content exploration" strategy not only finds the semantically best-matching fragments but also understands the full context where the information resides, thereby improving the globality and accuracy of retrieval. Learn more: Retrieval Mechanism
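The "lock high-score directory first, then refine" idea can be sketched as recursive descent over a scored tree: rank the children against the query, descend only into the top-scoring directories, and collect leaf fragments. The scoring function below is a word-overlap stand-in for real vector similarity, and the whole sketch is illustrative rather than OpenViking's actual retrieval code:

```python
def recursive_retrieve(tree, query, score, top_k=1):
    """Sketch of directory-recursive retrieval: score each child against
    the query, descend only into the top_k highest-scoring directories,
    and return the leaf fragments found there. `score` is a stand-in
    for real embedding similarity."""
    if isinstance(tree, str):                 # leaf node = content fragment
        return [tree]
    ranked = sorted(tree.items(), key=lambda kv: score(query, kv[0]), reverse=True)
    hits = []
    for name, child in ranked[:top_k]:        # lock high-score directories first
        hits += recursive_retrieve(child, query, score, top_k)
    return hits

tree = {"docs": {"api": "auth endpoints reference", "tutorials": "getting started"},
        "src": {"main": "entry point code"}}
overlap = lambda q, name: len(set(q.split()) & set(name.split()))
print(recursive_retrieve(tree, "api docs", overlap))  # ['auth endpoints reference']
```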
OpenViking's organization uses a hierarchical virtual filesystem structure. All context is integrated in a unified format, and each entry corresponds to a unique URI (a viking:// path), replacing the traditional flat, black-box management model with a clear hierarchy that is easy to understand.
The retrieval process adopts a directory recursive strategy. The trajectory of directory browsing and file positioning for each retrieval is fully preserved, allowing users to clearly observe the root cause of problems and guide the optimization of retrieval logic. Learn more: Retrieval Mechanism
OpenViking has a built-in memory self-iteration loop. At the end of each session, developers can actively trigger the memory extraction mechanism. The system will asynchronously analyze task execution results and user feedback, and automatically update them to the User and Agent memory directories.
This allows the Agent to get "smarter with use" through interactions with the world, achieving self-evolution. Learn more: Session Management
For more details, please visit our Full Documentation.
For more details, please see: About Us
OpenViking is still in its early stages, and there are many areas for improvement and exploration. We sincerely invite every developer passionate about AI Agent technology:
Let's work together to define and build the future of AI Agent context management. The journey has begun, and we look forward to your participation!
The OpenViking project uses different licenses for different components: