
Long-term memory for Open WebUI

Open WebUI is the self-hosted UI for local LLMs (Ollama, vLLM, llama.cpp, whatever you're running). It speaks Streamable HTTP MCP or OpenAPI, not SSE. The Open WebUI team's own `mcpo` proxy bridges Engram's SSE endpoint to OpenAPI so it can be added through the External Tools panel.


Install

Three steps: sign up for an Engram API key, add your own LLM-provider key (BYOK) on /models, then drop the snippet below into Open WebUI.

Three steps to memory in your agent

Sign up. Free, no card. You'll land on a Getting Started page that walks the next two steps.
Add your LLM key. Engram is BYOK. Paste an OpenAI / Anthropic / Groq / Together / Fireworks key and we'll route every extraction and query call through your provider. You pay your provider directly. We never see your inference.
Paste the snippet below into your agent and restart it. Use `Authorization: Bearer <api-key>` with the API key from your portal.

engram-open-webui: via the mcpo SSE→OpenAPI bridge

Open WebUI's External Tools panel speaks Streamable HTTP MCP or OpenAPI; Engram is SSE-only. The official answer is mcpo, a tiny proxy that exposes any MCP server as an OpenAPI HTTP server. Source: github.com/lumetra-io/engram-open-webui.

Install mcpo and drop in a config that points it at Engram:

Terminal

pip install mcpo

~/.config/mcpo/engram.json

{
  "mcpServers": {
    "engram": {
      "type": "sse",
      "url": "https://mcp.lumetra.io/mcp/sse",
      "headers": {
        "Authorization": "Bearer <api-key>"
      }
    }
  }
}
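If you would rather not hand-edit the key into the file, a small script can generate the same config. This is a minimal sketch, not part of Engram or mcpo: the function names are hypothetical, and only the JSON structure and the SSE URL come from the config above.

```python
import json
from pathlib import Path

# SSE endpoint copied from the engram.json example above.
ENGRAM_SSE_URL = "https://mcp.lumetra.io/mcp/sse"


def build_mcpo_config(api_key: str) -> dict:
    """Return the same structure as engram.json with the key filled in.

    Hypothetical helper for this sketch; not an Engram or mcpo API.
    """
    if not api_key:
        raise ValueError("api_key must not be empty")
    return {
        "mcpServers": {
            "engram": {
                "type": "sse",
                "url": ENGRAM_SSE_URL,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


def write_mcpo_config(api_key: str, path: Path) -> Path:
    """Write the config to `path` and restrict it to the current user."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(build_mcpo_config(api_key), indent=2) + "\n")
    path.chmod(0o600)  # the file contains a secret
    return path


if __name__ == "__main__":
    # Prints the config with the placeholder left in, without touching disk.
    print(json.dumps(build_mcpo_config("<api-key>"), indent=2))
```

To write the real file, call `write_mcpo_config(your_key, Path.home() / ".config" / "mcpo" / "engram.json")` with the key from your portal. Reading the key from an environment variable rather than a literal keeps it out of shell history and version control.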

Run the bridge on a local port:

Terminal

mcpo --port 8001 --config ~/.config/mcpo/engram.json

In Open WebUI: Admin Settings → External Tools → + Add Server, pick OpenAPI, set URL to http://localhost:8001/engram. The six Engram tools show up under External Tools and can be enabled per model.
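Before adding the server in Open WebUI, it can help to confirm the bridge is actually serving a schema. The sketch below assumes the OpenAPI document is available at `<bridge-url>/openapi.json`; that path is an assumption of this example (verify it against your mcpo version), and both function names are hypothetical.

```python
import json
import urllib.request

# Bridge URL from the step above.
BRIDGE_URL = "http://localhost:8001/engram"


def fetch_openapi_schema(base_url: str = BRIDGE_URL, timeout: float = 5.0) -> dict:
    """Download the OpenAPI document served under the bridge route.

    Assumes the schema is exposed at <base_url>/openapi.json.
    """
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=timeout) as resp:
        return json.load(resp)


def list_operations(schema: dict) -> list[str]:
    """Return 'METHOD /path' strings for every operation in the schema."""
    http_methods = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}
    operations = []
    for path, item in sorted(schema.get("paths", {}).items()):
        for method in sorted(item):
            # Path items can also hold non-operation keys such as "parameters".
            if method in http_methods:
                operations.append(f"{method.upper()} {path}")
    return operations
```

With the bridge running, `print(list_operations(fetch_openapi_schema()))` should list the operations the bridge exposes for Engram. A connection error means the bridge is not reachable on that port, and an HTTP 401 or 403 usually points at the `Authorization` header in the config.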

What you can do once memory's wired in

Give your local Llama / Qwen / Mistral instance the same memory layer your cloud agents use
Recall conversations from earlier in the day when GPU memory pressure forced you to swap models
Pull up notes from past Open WebUI sessions across browser restarts
Run a fully local stack (Ollama + Open WebUI + Engram) with no third-party model calls in the loop

FAQ

Why does Open WebUI need a bridge?

Its External Tools panel speaks Streamable HTTP MCP or OpenAPI, and Engram's hosted MCP is SSE-only today. `mcpo` is Open WebUI's official answer: it exposes any MCP server as an OpenAPI HTTP server on a local port.

Where does the bridge run?

Wherever Open WebUI can reach it on localhost. Most users `pip install mcpo` and run it as a systemd service or background process pointing at Engram's SSE URL. The bridge must stay running for the tools to work.

OpenAPI tab or MCP tab when adding to Open WebUI?

Pick **OpenAPI**, URL `http://localhost:8001/engram`. `mcpo` is a one-way MCP→OpenAPI bridge: it does *not* expose a downstream Streamable HTTP MCP endpoint, so the MCP (Streamable HTTP) tab is the wrong choice for this setup.


Ship durable memory in Open WebUI today

Free tier: 10K memories and 50K retrievals per month. No credit card. Same Engram backend powers all 41 integrations, so memories you write from one client are immediately queryable from the rest.
