Memory library for building stateful agents
npx skills add https://github.com/plastic-labs/honcho --skill honcho-integration

Install this skill with the CLI and start using the SKILL.md workflow in your workspace.
Honcho is an open source memory library with a managed service for building stateful
agents. Use it with any model, framework, or architecture. It enables agents to build
and maintain state about any entity: users, agents, groups, ideas, and more. And because
it's a continual learning system, it understands entities that change over time. Using
Honcho as your memory system earns your agents higher retention and more trust, and
helps you build data moats to out-compete incumbents.
Honcho has defined the Pareto Frontier of Agent Memory. Watch the video, check out our evals page, and read the blog post for more detail.
With Honcho you can easily set up your application's workflow, save your
interaction history, and leverage the reasoning it does to inform the behavior of
your agents.
TypeScript examples are available in our docs.
# Python
pip install honcho-ai
# or
uv add honcho-ai
# or
poetry add honcho-ai
Create a Workspace and Peers, start a Session, and send Messages:

from honcho import Honcho
# 1. Initialize your Honcho client
honcho = Honcho(workspace_id="my-app-testing")
# 2. Initialize peers
alice = honcho.peer("alice")
tutor = honcho.peer("tutor")
# 3. Create a session and add messages
session = honcho.session("session-1")
# Adding messages from a peer will automatically add them to the session
session.add_messages(
    [
        alice.message("Hey there — can you help me with my math homework?"),
        tutor.message("Absolutely. Send me your first problem!"),
    ]
)
### 1. Use the chat endpoint to ask questions about your users in natural language
response = alice.chat("What learning styles does the user respond to best?")
### 2. Use session context to continue a conversation with an LLM
context = session.context(summary=True, tokens=10_000)
# Convert to a format to send to OpenAI and get the next message
openai_messages = context.to_openai(assistant=tutor)
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
model="gpt-4",
messages=openai_messages
)
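After the model responds, you can write the assistant's reply back into the session so Honcho keeps reasoning over the full conversation. A minimal sketch reusing the objects above; the response shape is the standard OpenAI chat completion:

# Persist the model's reply so it becomes part of the session history
reply_text = response.choices[0].message.content
session.add_messages([tutor.message(reply_text)])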
### 3. Search for similar messages
results = alice.search("Math Homework")
### 4. Get a session-scoped representation of a peer
alice_representation = session.representation(alice)
This is a simple example of how you can use Honcho to build a chatbot and
leverage insights to personalize the agent's behavior.
Sign up at app.honcho.dev to get started with a managed version of Honcho.
Learn more ways to use Honcho on our developer docs.
Read about the design philosophy and history of the project on our blog.
The Honcho project is split across several repositories, with this one hosting
the core service logic. It is implemented as a FastAPI server/API that stores
data about an application's state.
There are also client SDKs implemented in the sdks/ directory with support
for Python and TypeScript.
Examples of how to use the SDKs are located within each SDK folder and in the
SDK Reference.
There are also documented examples of how to use the core SDKs in the
API Reference section of
the documentation.
Sign up for an account at
https://app.honcho.dev and get started with $100 in free credits. When you sign up you'll be prompted to
join an organization, which will have a dedicated instance of Honcho.
Provision API keys and change your base URL to point to
https://api.honcho.dev
Additionally, Honcho can be self-hosted for testing and evaluation purposes. See
the Local Development section below for details on how to set up a local
version of Honcho.
Below is a guide on setting up a local environment for running the Honcho
Server.
This guide was made using an M3 MacBook Pro. For any compatibility issues
on different platforms, please raise an Issue.
Honcho is developed using Python and uv.
The minimum Python version is 3.10 and the minimum uv version is 0.5.0.
Once the dependencies are installed on the system, run the following steps to get
the local project set up.
git clone https://github.com/plastic-labs/honcho.git
We recommend using a virtual environment to isolate the dependencies for Honcho
from other projects on the same system. uv will create a virtual environment
when you sync your dependencies in the project.
cd honcho
uv sync
This will create a virtual environment and install the dependencies for Honcho.
The default virtual environment will be located at honcho/.venv. From inside the
honcho directory, activate it via:
source .venv/bin/activate
Honcho utilizes Postgres for its database with
pgvector. An easy way to get started with a Postgres database is to create a project
with Supabase.
Alternatively, a docker-compose template is available with a sample database configuration.
To use Docker:
cp docker-compose.yml.example docker-compose.yml
docker compose up -d database
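You can verify the database container is up before continuing:

docker compose ps database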
Honcho uses a .env file for managing runtime environment variables. A
.env.template file is included for convenience. Several of the configurations
are not required and are only necessary for additional logging, monitoring, and
security.
Below are the required configurations:
DB_CONNECTION_URI= # Connection uri for a postgres database (with postgresql+psycopg prefix)
# LLM Provider API Keys
LLM_GEMINI_API_KEY= # API Key for Google Gemini (used for deriver, summary, and dialectic minimal/low by default)
LLM_ANTHROPIC_API_KEY= # API Key for Anthropic (used for dialectic medium/high/max and dream by default)
LLM_OPENAI_API_KEY= # API Key for OpenAI (used for embeddings when EMBED_MESSAGES=true)
Note that the DB_CONNECTION_URI must have the prefix postgresql+psycopg to
function properly. This is a requirement brought by SQLAlchemy.
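For example, a local connection string for the docker-compose database above might look like the following (the credentials and database name are illustrative; match whatever your docker-compose.yml defines):

DB_CONNECTION_URI=postgresql+psycopg://postgres:postgres@localhost:5432/honcho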
The template has the additional functionality disabled by default. To ensure
it stays disabled, verify that the following environment variables are
set to false:
AUTH_USE_AUTH=false
SENTRY_ENABLED=false
If you set AUTH_USE_AUTH to true you will need to generate a JWT secret. You can
do this with the following command:
python scripts/generate_jwt_secret.py
This will generate a JWT secret and print it to the console. You can then set
the AUTH_JWT_SECRET environment variable. This is required for AUTH_USE_AUTH:
AUTH_JWT_SECRET=<generated_secret>
With the database set up and environment variables configured, run the migrations
to create the necessary tables:
uv run alembic upgrade head
This will create all tables for Honcho including workspaces, peers, sessions,
messages, and the queue system.
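To confirm the migration succeeded, check the current revision:

uv run alembic current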
With everything set up, you can now launch a local instance of Honcho. In addition to the database, two
components need to be running:
Start the API server:
uv run fastapi dev src/main.py
This is a development server that will reload whenever code is changed.
Start a background worker (deriver):
In a separate terminal, run:
uv run python -m src.deriver
The deriver generates representations, summaries, and peer cards, and manages dreaming tasks. You can run multiple derivers to improve throughput.
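Deriver workers consume from the queue created by the migrations above, so adding capacity is just launching the same module again:

# run one of these in each additional terminal
uv run python -m src.deriver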
Honcho uses pre-commit hooks to ensure code quality and consistency across the project. These hooks automatically run checks on your code before each commit, including linting, formatting, type checking, and security scans.
To set up pre-commit hooks in your development environment:
uv add --dev pre-commit
uv run pre-commit install \
--hook-type pre-commit \
--hook-type commit-msg \
--hook-type pre-push
This will install hooks for pre-commit, commit-msg, and pre-push stages.
The pre-commit configuration includes linting, formatting, type checking, and security checks.
You can run the hooks manually on all files without making a commit:
uv run pre-commit run --all-files
Or run specific hooks:
uv run pre-commit run ruff --all-files
uv run pre-commit run basedpyright --all-files
As mentioned earlier, a docker-compose template is included for running Honcho.
As an alternative to running Honcho directly, you can run it with the compose
template.
The docker-compose template is set to use an environment file called .env.
Copy the .env.template, fill it in with the appropriate values, and update any
other environment variables you want before launching the service:
cd honcho
cp .env.template .env
# update the file with your API keys and other desired environment variables
cp docker-compose.yml.example docker-compose.yml
docker compose up
The API can also be deployed on Fly.io. Follow the Fly.io
Docs to set up your environment and the
flyctl CLI.
A sample fly.toml is included for convenience.
Note: The fly.toml does not include launching a Postgres database. This must
be configured separately.
Once flyctl is set up use the following commands to launch the application:
cd honcho
flyctl launch --no-deploy # Follow the prompts and edit as you see fit
cat .env | flyctl secrets import # Load in your secrets
flyctl deploy # Deploy with appropriate environment variables
Honcho uses a flexible configuration system that supports both TOML files and environment variables. Configuration values are loaded in the following priority order (highest to lowest):
1. Environment variables
2. .env file (for local development)
3. config.toml file

Copy the example configuration file to get started:
cp config.toml.example config.toml
Then modify the values as needed. The TOML file is organized into sections:
- [app] - Application-level settings (log level, session limits, embedding settings, namespace)
- [db] - Database connection and pool settings
- [auth] - Authentication configuration
- [cache] - Redis cache configuration
- [llm] - LLM provider API keys and general settings
- [deriver] - Background worker settings and representation configuration
- [peer_card] - Peer card generation settings
- [dialectic] - Dialectic API configuration with per-level reasoning settings
- [summary] - Session summarization settings
- [dream] - Dream processing configuration (including specialist models and surprisal settings)
- [webhook] - Webhook configuration
- [metrics] - Prometheus pull-based metrics
- [telemetry] - CloudEvents telemetry for analytics
- [vector_store] - Vector store configuration (pgvector, turbopuffer, or lancedb)
- [sentry] - Error tracking and monitoring settings

All configuration values can be overridden using environment variables. The environment variable names follow this pattern:
- {SECTION}_{KEY} for top-level section settings
- __ inside {KEY} for nested settings
- {KEY} for app-level settings

Examples:
- DB_CONNECTION_URI - Database connection string
- AUTH_JWT_SECRET - JWT secret key
- DERIVER_MODEL_CONFIG__TRANSPORT - Transport for the background deriver
- SUMMARY_MODEL_CONFIG__MODEL - Summary model override
- DIALECTIC_LEVELS__low__MODEL_CONFIG__MODEL - Model for the low reasoning level
- LOG_LEVEL - Application log level
- METRICS_ENABLED - Enable Prometheus metrics
- TELEMETRY_ENABLED - Enable CloudEvents telemetry

When a configuration value is set in multiple places, Honcho uses the priority order above: environment variables first, then the .env file, then config.toml.
This allows you to:
- keep config.toml for base configuration
- use .env files for local development without modifying config.toml

If you have this in config.toml:
[db]
CONNECTION_URI = "postgresql+psycopg://localhost/honcho_dev"
POOL_SIZE = 10
You can override just the connection URI in production:
export DB_CONNECTION_URI="postgresql+psycopg://prod-server/honcho_prod"
The application will use the production connection URI while keeping the pool size from config.toml.
The functionality of Honcho can be split into two different services: Storage
and Insights.
Honcho uses an entity-centric model where both users and agents are represented as "peers". This unified approach enables complex multi-participant interactions and lets the same primitives model humans and agents alike.
Honcho contains several different primitives used for storing application and
peer data. This data is used for managing conversations, modeling peer
identity, building RAG applications, and more.
The philosophy behind Honcho is to provide a platform that is peer-centric and
easily scalable from a single user to a million.
Below is a mapping of the different primitives and their relationships.
Workspaces
├── Peers ←──────────────────┐
│ ├── Sessions │
│ └── Collections │
│ └── Documents │
│ │
│ │
└── Sessions ←───────────────┤ (many-to-many)
├── Peers ───────────────┘
└── Messages (session-level)
Relationship details: Peers and Sessions have a many-to-many relationship, as diagrammed above; Collections and their Documents belong to Peers, while Messages are scoped to Sessions.
Users familiar with APIs such as the OpenAI Assistants API will be familiar with
much of the mapping here.
This is the top level construct of Honcho. Developers can register different
Workspaces for different assistants, agents, AI enabled features, etc. It is a way to
isolate data between use cases and provide multi-tenant capabilities.
Within a Workspace everything revolves around a Peer. The Peer object
represents any participant in the system - whether human users or AI agents.
This unified model enables complex multi-participant interactions.
The Session object represents a set of interactions between Peers within a
Workspace. Other applications may refer to this as a thread or conversation.
Sessions can involve multiple peers with configurable observation settings.
The Message represents an atomic data unit that can exist at two levels: within a Session as part of a conversation, or tied directly to a Peer for standalone data ingestion.
All messages are labeled by their source peer and can be processed
asynchronously to update their representations. This flexible design allows for
both conversational interactions and broader data ingestion for personality
modeling.
At a high level, a Collection is a named group of Documents. Developers
who have built RAG applications will recognize the pattern. Collections
store vector-embedded data that developers and agents can retrieve against using
functions like cosine similarity.
Collections are also used internally by Honcho while creating representations of peers.
As stated before, a Document is vector-embedded data stored in a Collection.
The reasoning functionality of Honcho is built on top of the Storage service. As
Messages and Sessions are created for Peers, Honcho will asynchronously
reason about peer psychology to derive facts about them and store them
in reserved Collections.
A high level summary of the pipeline is as follows:
- representation: To update representations of Peers
- summary: To create summaries of Sessions

Honcho exposes several different ways to retrieve data from the system to best
serve the needs of any given application.
In long-running conversations with an LLM, the context window can fill up
quickly. To address this, Honcho provides a context
endpoint that returns a combination of messages, conclusions, and summaries from a
session, up to a provided token limit.
Use this to keep sessions going indefinitely. If you'd like to see this in action, try out Honcho Chat.
There are several search endpoints that let developers query messages at the
Workspace, Session, or Peer level using a hybrid search strategy.
Requests can include advanced filters to further refine
the results.
The flagship interface for using these insights is through
the Chat Endpoint.
This is a regular API endpoint (/peers/{peer_id}/chat) that takes natural language requests to get data
about the Peer. This robust design lets us use this single endpoint for all
cases where extra personalization or information about the Peer is necessary.
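In practice this can be as simple as consulting the endpoint right before your agent acts. A minimal sketch using the SDK objects from the quickstart above (the question wording is illustrative):

# Ask the Dialectic endpoint for guidance grounded in what Honcho knows about alice
insight = alice.chat("How should a tutor explain new concepts to this user?")

# Fold the answer into your agent's next prompt however fits your application
print(insight)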
A developer's application can treat Honcho as an oracle to the Peer and
consult it when necessary. Examples of how to leverage the Dialectic
API include asking natural-language questions about a Peer and adapting an
agent's responses to a Peer's behavior.

For low-latency use cases,
Honcho provides access to a representation endpoint that
returns a static document with insights about a Peer in the context of a
particular session.
Use this to quickly add context to a prompt without having to wait for an LLM
response.
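A minimal sketch of that pattern, reusing the session and peers from the quickstart above (the prompt wording is illustrative, and the representation is treated as plain text):

# Fetch the session-scoped representation: a static document, no LLM round-trip
alice_representation = session.representation(alice)

# Inject it into a system prompt before calling your model
system_prompt = (
    "You are a helpful tutor. Here is what you know about this student:\n"
    f"{alice_representation}"
)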
We welcome contributions to Honcho! Please read our Contributing Guide for details on our development process, coding conventions, and how to submit pull requests.
Honcho is licensed under the AGPL-3.0 License. See the License file to learn more.