Integration

Long-term memory for Pipecat

Pipecat is the open-source framework for voice and realtime AI pipelines (STT → LLM → TTS with frame-level orchestration). By default, every call starts with no memory of earlier ones. `EngramMemoryProcessor` is a `FrameProcessor` that sits between the user input and the LLM context aggregator: it stores every finalized user turn, and it injects relevant prior memories as a silent `skip_tts=True` `TextFrame` right before each LLM call.

Install

Three steps: sign up for an Engram API key, paste a BYOK LLM-provider key on /models, then drop the snippet below into Pipecat.

Three steps to memory in your agent

1. Sign up. Free, no card. You'll land on a Getting Started page that walks you through the next two steps.
2. Add your LLM key. Engram is BYOK. Paste an OpenAI / Anthropic / Groq / Together / Fireworks key and we'll route every extraction and query call through your provider. You pay your provider directly. We never see your inference.
3. Paste the snippet below into your agent and restart it. Use `Authorization: Bearer <api-key>` with the API key from your portal.

pipecat-engram: voice + realtime memory FrameProcessor

`EngramMemoryProcessor` sits between the user input and the LLM context aggregator in your Pipecat pipeline. It stores every finalized user turn and injects recalled memory as a silent `skip_tts=True` `TextFrame` right before each LLM call. Source: github.com/lumetra-io/engram-pipecat.

Install (PyPI release pending; install from source for now):

Terminal

pip install git+https://github.com/lumetra-io/engram-pipecat

Export your API key:

Terminal

export ENGRAM_API_KEY="<api-key>"
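
A missing key is easier to diagnose at startup than in the middle of a call. The sketch below checks the environment variable and builds the `Authorization: Bearer <api-key>` header described in the steps above. The helper name `engram_auth_header` is hypothetical and not part of `pipecat_engram`; how the processor itself reads the key is not documented here, so treat this purely as a pre-flight check.

```python
import os


def engram_auth_header(env=os.environ):
    """Build the Authorization header from ENGRAM_API_KEY, or fail loudly.

    Hypothetical helper for illustration; not part of pipecat_engram.
    """
    api_key = env.get("ENGRAM_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "ENGRAM_API_KEY is not set; export it before starting the agent"
        )
    return {"Authorization": f"Bearer {api_key}"}
```

Calling `engram_auth_header()` once before building the pipeline surfaces a missing or empty key immediately.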

Slot EngramMemoryProcessor into your pipeline (between user input and the LLM context aggregator):

Python

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.pipeline.runner import PipelineRunner
from pipecat_engram import EngramMemoryProcessor

memory = EngramMemoryProcessor(
    bucket="my_agent",
    recall_prefix="Relevant memory: ",
)

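# transport, stt, llm, tts, and context_aggregator are the objects you
# already create for your Pipecat bot; they are not defined in this snippet.
pipeline = Pipeline([
    transport.input(),               # mic / WebRTC / etc.
    stt,                             # e.g. DeepgramSTTService
    memory,                          # <-- EngramMemoryProcessor
    context_aggregator.user(),
    llm,                             # e.g. OpenAILLMService
    tts,
    transport.output(),
    context_aggregator.assistant(),
])

# `await` only works inside a coroutine, so run this from your bot's
# async entry point.
await PipelineRunner().run(PipelineTask(pipeline))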

What you can do once memory's wired in

Voice receptionist that remembers caller preferences across separate calls
Realtime customer-support agent with memory of prior conversations and decisions
Mix with Pipecat's existing transports (WebRTC, Daily, LiveKit); memory is transport-agnostic
Per-caller buckets via `EngramMemoryProcessor(bucket=f'caller-{caller_id}')`
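
For per-caller buckets, the same caller should always map to the same bucket name, even when the caller ID arrives in different formats. A minimal sketch, assuming the caller ID is a phone number or similar string: `bucket_for_caller` is a hypothetical helper, and the character restrictions Engram places on bucket names (if any) are not stated here, so the letters-and-digits normalization is an assumption.

```python
import re


def bucket_for_caller(caller_id: str) -> str:
    """Map a caller identifier (e.g. a phone number) to a stable bucket name.

    Hypothetical helper; the normalization rule is an assumption.
    """
    # Keep only letters and digits so "+1 (555) 010-0199" and "15550100199"
    # resolve to the same bucket.
    normalized = re.sub(r"[^0-9A-Za-z]", "", caller_id).lower()
    if not normalized:
        raise ValueError("caller_id must contain at least one letter or digit")
    return f"caller-{normalized}"
```

The result can then be passed straight through, as in `EngramMemoryProcessor(bucket=bucket_for_caller(caller_id))`.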

FAQ

Where does the processor sit in the pipeline?

Between the user input (STT or text) and `context_aggregator.user()`. That way the LLM context aggregator receives the recall `TextFrame` just before each user turn, while the recall text itself is never spoken by TTS (because `skip_tts=True`).

Does it work in text-only pipelines?

Yes. `EngramMemoryProcessor` also handles plain `TextFrame`s flowing downstream from the user side, so STT and TTS are optional.

What does `recall_prefix` do?

It's the literal prefix prepended to the recalled memory string before injection (default `'Relevant memory: '`). Customize it to match how your prompt expects context to be framed.
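
To make the effect of `recall_prefix` concrete, here is a rough sketch of how a prefix might be combined with recalled memories before injection. It is an illustration only, not the library's implementation: `format_recall` is a hypothetical function, and the `"; "` separator for multiple memories is an assumption.

```python
def format_recall(memories, recall_prefix="Relevant memory: "):
    """Join recalled memories into one string, prefixed with recall_prefix.

    Hypothetical illustration; the separator and the empty-result
    behaviour are assumptions, not documented library behaviour.
    """
    cleaned = [m.strip() for m in memories if m and m.strip()]
    if not cleaned:
        return None  # nothing recalled, so nothing to inject
    return recall_prefix + "; ".join(cleaned)
```

With a custom prefix such as `"Known about this caller: "`, the text the LLM sees changes accordingly, which is why the prefix should match the framing your system prompt expects.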

Ship durable memory in Pipecat today

Free tier: 10K memories and 50K retrievals per month. No credit card. Same Engram backend powers all 41 integrations, so memories you write from one client are immediately queryable from the rest.
