Integration
Long-term memory for LiveKit Agents
LiveKit Agents are stateless by design. Every voice call starts blank, the caller re-explains themselves, and the agent loses continuity. `livekit-plugins-engram` adds a one-line memory backend so the agent remembers what the caller told you last Tuesday, recalls it mid-sentence, and explains *why* it surfaced any given fact.
Install
Three steps: sign up for an Engram API key, paste a BYOK LLM-provider key on /models, then drop the snippet below into LiveKit Agents.
Three steps to memory in your agent
- Sign up. Free, no card. You'll land on a Getting Started page that walks the next two steps.
- Add your LLM key. Engram is BYOK. Paste an OpenAI / Anthropic / Groq / Together / Fireworks key and we'll route every extraction and query call through your provider. You pay your provider directly. We never see your inference.
- Paste the snippet below into your agent and restart it. Use
Authorization: Bearer <api-key>with the API key from your portal.
livekit-plugins-engram: voice + realtime AI
One-line memory backend for LiveKit Agent. Use from lifecycle hooks (record what the user just said, recall relevant context before the LLM responds) or hand the model function tools. Source: github.com/lumetra-io/engram-livekit.
- Install:
- Export your API key:
- Pattern 1: record + recall in
on_user_turn_completed:
pip install livekit-plugins-engramexport ENGRAM_API_KEY="<api-key>"from livekit.agents import Agent
from livekit.plugins.engram import Engram
memory = Engram(bucket="caller-default")
class Receptionist(Agent):
async def on_user_turn_completed(self, chat_ctx, new_message):
text = new_message.text_content
if not text: return
await memory.astore_memory(text)
recall = await memory.aquery_memory(text)
if recall.get("answer"):
chat_ctx.add_message(role="system", content=f"Relevant memory: {recall['answer']}")What you can do once memory's wired in
- Voice receptionist that remembers caller preferences across separate calls
- Realtime customer-support agent with memory of prior tickets and decisions
- Voice assistant for a single user across sessions: a personal assistant that remembers you
- Multi-tenant voice deployment where each customer's bucket is isolated by `bucket=f'caller-{caller_id}'`
FAQ
Pattern 1 (lifecycle hooks) or Pattern 2 (model tools), which should I use?
Pattern 1 is automatic: every user turn gets recorded and recalled without the model having to call a tool. Pattern 2 gives the model control over store and recall. Combine them if you want both a passive transcript and active curation.
Does this work with the realtime API models (GPT-4o realtime, Gemini live)?
Yes, with a caveat: per LiveKit's docs, to use `on_user_turn_completed` with a realtime model you must configure turn detection to run **in your agent** instead of inside the realtime model. The plugin's lifecycle hooks attach to `Agent` either way, so just keep that turn-detection config in mind when setting up a realtime pipeline.
What about privacy when voice transcripts go into memory?
Treat the bucket like any voice-recording store. Engram's data isn't used to train models, but you should align retention with your voice-recording policy.
Related integrations
Ship durable memory in LiveKit Agents today
Free tier: 10K memories and 50K retrievals per month. No credit card. Same Engram backend powers all 41 integrations, so memories you write from one client are immediately queryable from the rest.