📄 Building Production-Ready AI Agents with Scalable Long-Term Memory →

New Memory Algorithm (April 2026)

Benchmark	Old	New	Tokens	Latency p50
LoCoMo	71.4	91.6	7.0K	0.88s
LongMemEval	67.8	93.4	6.8K	1.09s
BEAM (1M)	—	64.1	6.7K	1.00s
BEAM (10M)	—	48.6	6.9K	1.05s

All benchmarks run on the same production-representative model stack. Single-pass retrieval (one call, no agentic loops).

What changed:

Single-pass ADD-only extraction -- one LLM call, no UPDATE/DELETE. Memories accumulate; nothing is overwritten.
Agent-generated facts are first-class -- when an agent confirms an action, that information is now stored with equal weight.
Entity linking -- entities are extracted, embedded, and linked across memories for retrieval boosting.
Multi-signal retrieval -- semantic, BM25 keyword, and entity matching scored in parallel and fused.

See the migration guide for upgrade instructions. The evaluation framework is open-sourced so anyone can reproduce the numbers.

Research Highlights

91.6 on LoCoMo -- +20 points over the previous algorithm
93.4 on LongMemEval -- +26 points, with +53.6 on assistant memory recall
64.1 on BEAM (1M) -- production-scale memory evaluation at 1M tokens
Read the full paper

Introduction

Mem0 ("mem-zero") enhances AI assistants and agents with an intelligent memory layer, enabling personalized AI interactions. It remembers user preferences, adapts to individual needs, and continuously learns over time—ideal for customer support chatbots, AI assistants, and autonomous systems.

Key Features & Use Cases

Core Capabilities:

Multi-Level Memory: Seamlessly retains User, Session, and Agent state with adaptive personalization
Developer-Friendly: Intuitive API, cross-platform SDKs, and a fully managed service option

Applications:

AI Assistants: Consistent, context-rich conversations
Customer Support: Recall past tickets and user history for tailored help
Healthcare: Track patient preferences and history for personalized care
Productivity & Gaming: Adaptive workflows and environments based on user behavior

🚀 Quickstart Guide

Choose between our hosted platform or self-hosted package:

Hosted Platform

Get up and running in minutes with automatic updates, analytics, and enterprise security.

Sign up on Mem0 Platform
Embed the memory layer via SDK or API keys

Self-Hosted (Open Source)

Install the sdk via pip:

pip install mem0ai

For enhanced hybrid search with BM25 keyword matching and entity extraction, install with NLP support:

pip install mem0ai[nlp]
python -m spacy download en_core_web_sm

Install sdk via npm:

npm install mem0ai

CLI

Manage memories from your terminal:

npm install -g @mem0/cli   # or: pip install mem0-cli

mem0 init
mem0 add "Prefers dark mode and vim keybindings" --user-id alice
mem0 search "What does Alice prefer?" --user-id alice

See the CLI documentation for the full command reference.

Basic Usage

Mem0 requires an LLM to function, with gpt-5-mini from OpenAI as the default. However, it supports a variety of LLMs; for details, refer to our Supported LLMs documentation.

Mem0 uses text-embedding-3-small from OpenAI as the default embedding model. For best results with hybrid search (semantic + keyword + entity boosting), we recommend using at least Qwen 600M or a comparable embedding model. See Supported Embeddings for configuration details.

First step is to instantiate the memory:

from openai import OpenAI
from mem0 import Memory

openai_client = OpenAI()
memory = Memory()

def chat_with_memories(message: str, user_id: str = "default_user") -> str:
    # Retrieve relevant memories
    relevant_memories = memory.search(query=message, filters={"user_id": user_id}, top_k=3)
    memories_str = "\n".join(f"- {entry['memory']}" for entry in relevant_memories["results"])

    # Generate Assistant response
    system_prompt = f"You are a helpful AI. Answer the question based on query and memories.\nUser Memories:\n{memories_str}"
    messages = [{"role": "system", "content": system_prompt}, {"role": "user", "content": message}]
    response = openai_client.chat.completions.create(model="gpt-5-mini", messages=messages)
    assistant_response = response.choices[0].message.content

    # Create new memories from the conversation
    messages.append({"role": "assistant", "content": assistant_response})
    memory.add(messages, user_id=user_id)

    return assistant_response

def main():
    print("Chat with AI (type 'exit' to quit)")
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() == 'exit':
            print("Goodbye!")
            break
        print(f"AI: {chat_with_memories(user_input)}")

if __name__ == "__main__":
    main()

For detailed integration steps, see the Quickstart and API Reference.

🔗 Integrations & Demos

ChatGPT with Memory: Personalized chat powered by Mem0 (Live Demo)
Browser Extension: Store memories across ChatGPT, Perplexity, and Claude (Chrome Extension)
Langgraph Support: Build a customer bot with Langgraph + Mem0 (Guide)
CrewAI Integration: Tailor CrewAI outputs with Mem0 (Example)

📚 Documentation & Support

Full docs: https://docs.mem0.ai
Community: Discord · X (formerly Twitter)
Contact: [email protected]

Citation

We now have a paper you can cite:

@article{mem0,
  title={Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory},
  author={Chhikara, Prateek and Khant, Dev and Aryan, Saket and Singh, Taranjeet and Yadav, Deshraj},
  journal={arXiv preprint arXiv:2504.19413},
  year={2025}
}

⚖️ License

Apache 2.0 — see the LICENSE file for details.

mem0