Integration
Long-term memory for Open WebUI
Open WebUI is the self-hosted UI for local LLMs (Ollama, vLLM, llama.cpp, whatever you're running). It speaks Streamable HTTP MCP or OpenAPI, not SSE. The Open WebUI team's own `mcpo` proxy bridges Engram's SSE endpoint to OpenAPI, so Engram can be added through the External Tools panel.
Install
Three steps: sign up for an Engram API key, add your own LLM-provider key (BYOK) on /models, then drop the snippet below into Open WebUI.
Three steps to memory in your agent
- Sign up. Free, no card. You'll land on a Getting Started page that walks the next two steps.
- Add your LLM key. Engram is BYOK. Paste an OpenAI / Anthropic / Groq / Together / Fireworks key and we'll route every extraction and query call through your provider. You pay your provider directly. We never see your inference.
- Paste the snippet below into your agent and restart it. Use `Authorization: Bearer <api-key>` with the API key from your portal.
engram-open-webui: via the mcpo SSE→OpenAPI bridge
Open WebUI's External Tools panel speaks Streamable HTTP MCP or OpenAPI; Engram is SSE-only. The official answer is mcpo, a tiny proxy that exposes any MCP server as an OpenAPI HTTP server. Source: github.com/lumetra-io/engram-open-webui.
- Install `mcpo` and drop in a config that points it at Engram, saved as `~/.config/mcpo/engram.json`:

      pip install mcpo

      {
        "mcpServers": {
          "engram": {
            "type": "sse",
            "url": "https://mcp.lumetra.io/mcp/sse",
            "headers": {
              "Authorization": "Bearer <api-key>"
            }
          }
        }
      }

- Run the bridge on a local port:

      mcpo --port 8001 --config ~/.config/mcpo/engram.json

- In Open WebUI: Admin Settings → External Tools → + Add Server, pick OpenAPI, set URL to `http://localhost:8001/engram`. The six Engram tools show up under External Tools and can be enabled per model.

What you can do once memory's wired in
- Give your local Llama / Qwen / Mistral instance the same memory layer your cloud agents use
- Recall conversations from earlier in the day when GPU memory pressure forced you to swap models
- Pull up notes from past Open WebUI sessions across browser restarts
- Keep chat inference on a local stack (Ollama + Open WebUI) while Engram handles memory through your own provider key
FAQ
Why does Open WebUI need a bridge?
Its External Tools panel speaks Streamable HTTP MCP or OpenAPI, and Engram's hosted MCP is SSE-only today. `mcpo` is Open WebUI's official answer: it exposes any MCP server as an OpenAPI HTTP server on a local port.
Where does the bridge run?
Wherever Open WebUI can reach it on localhost. Most users `pip install mcpo` and run it as a systemd service or background process pointing at Engram's SSE URL. The bridge must stay running for the tools to work.
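One way to run the bridge as a background process is a small launcher script. The sketch below is illustrative only and not part of `mcpo` or Engram: it assumes the API key lives in an environment variable named `ENGRAM_API_KEY` (a hypothetical name), writes the same config used in the install steps to `~/.config/mcpo/engram.json`, and then starts `mcpo` with the same flags.

```python
import json
import os
import subprocess
from pathlib import Path

ENGRAM_SSE_URL = "https://mcp.lumetra.io/mcp/sse"  # from the install steps
CONFIG_PATH = Path.home() / ".config" / "mcpo" / "engram.json"


def build_config(api_key: str) -> dict:
    """Return the mcpo config that points at Engram's SSE endpoint."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "mcpServers": {
            "engram": {
                "type": "sse",
                "url": ENGRAM_SSE_URL,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


def write_config(api_key: str, path: Path = CONFIG_PATH) -> Path:
    """Write the config as JSON and restrict permissions, since it holds a secret."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(build_config(api_key), indent=2))
    path.chmod(0o600)
    return path


if __name__ == "__main__":
    # ENGRAM_API_KEY is a hypothetical variable name chosen for this sketch.
    key = os.environ.get("ENGRAM_API_KEY")
    if key:
        config = write_config(key)
        # Same command as the install steps; blocks until mcpo exits.
        subprocess.run(["mcpo", "--port", "8001", "--config", str(config)], check=True)
    else:
        print("Set ENGRAM_API_KEY before starting the bridge.")
```

Because the script blocks while `mcpo` runs, it can be pointed at from a systemd unit or started in the background by whatever process manager you already use.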
OpenAPI tab or MCP tab when adding to Open WebUI?
Pick **OpenAPI**, URL `http://localhost:8001/engram`. `mcpo` is a one-way MCP→OpenAPI bridge: it does *not* expose a downstream Streamable HTTP MCP endpoint, so the MCP (Streamable HTTP) tab is the wrong choice for this setup.
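Before adding the server in Open WebUI, it can help to confirm that the bridge is actually serving an OpenAPI document. The following is a rough sketch under one assumption: that the bridge publishes its schema at `<base URL>/openapi.json`. Check your `mcpo` version if the request returns 404. It prints one line per exposed operation, which should correspond to the Engram tools.

```python
import json
from urllib.request import urlopen

BRIDGE_URL = "http://localhost:8001/engram"  # URL used in the Open WebUI step


def list_operations(spec: dict) -> list[str]:
    """Return 'METHOD /path' strings for every operation in an OpenAPI document."""
    http_methods = {"get", "post", "put", "patch", "delete"}
    ops = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method in sorted(item):
            if method in http_methods:  # skip non-operation keys like "parameters"
                ops.append(f"{method.upper()} {path}")
    return ops


def fetch_spec(base_url: str = BRIDGE_URL, timeout: float = 5.0) -> dict:
    # Assumption: the schema is served at <base>/openapi.json.
    with urlopen(f"{base_url.rstrip('/')}/openapi.json", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        for op in list_operations(fetch_spec()):
            print(op)
    except OSError as exc:  # urllib's URLError is a subclass of OSError
        print(f"Bridge not reachable at {BRIDGE_URL}: {exc}")
```

If the script reports that the bridge is not reachable, Open WebUI will not be able to reach it either, so check that the `mcpo` process is still running on port 8001.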
Ship durable memory in Open WebUI today
Free tier: 10K memories and 50K retrievals per month. No credit card. Same Engram backend powers all 41 integrations, so memories you write from one client are immediately queryable from the rest.