Integration
Long-term memory for Agency Swarm
VRSEN's agency-swarm runs multi-agent agencies with explicit communication flows. Two memory gaps show up in practice: thread state evaporates on process restart, and agents can't share curated facts. The Engram recipe covers both. `make_engram_callbacks()` handles passive thread persistence, and `StoreMemoryTool` / `QueryMemoryTool` give you active in-thread memory.
Install
Three steps: sign up for an Engram API key, paste a BYOK LLM-provider key on /models, then drop the snippet below into Agency Swarm.
Three steps to memory in your agent
- Sign up. Free, no card. You'll land on a Getting Started page that walks the next two steps.
- Add your LLM key. Engram is BYOK. Paste an OpenAI / Anthropic / Groq / Together / Fireworks key and we'll route every extraction and query call through your provider. You pay your provider directly. We never see your inference.
- Paste the snippet below into your agent and restart it. Use
Authorization: Bearer <api-key>with the API key from your portal.
engram-agency-swarm: thread persistence + in-thread tools
Two patterns. Pattern 1: passive thread persistence viasave_threads_callback / load_threads_callback; every turn round-trips through Engram. Pattern 2: StoreMemoryTool / QueryMemoryTool the agents call mid-run. Use both. Source: github.com/lumetra-io/engram-agency-swarm.
- Clone and install (or vendor
engram_agency_swarm.py): - Wire both patterns into your
Agency:
git clone https://github.com/lumetra-io/engram-agency-swarm
cd engram-agency-swarm
pip install -e .
export ENGRAM_API_KEY="<api-key>"
export ENGRAM_BUCKET="my-agency"from agency_swarm import Agency, Agent
from engram_agency_swarm import (
make_engram_callbacks,
StoreMemoryTool,
QueryMemoryTool,
)
save_cb, load_cb = make_engram_callbacks(bucket="my-agency")
ceo = Agent(name="CEO", instructions="...", tools=[StoreMemoryTool, QueryMemoryTool])
worker = Agent(name="Worker", instructions="...", tools=[StoreMemoryTool, QueryMemoryTool])
agency = Agency(
ceo,
communication_flows=[(ceo, worker)],
save_threads_callback=save_cb,
load_threads_callback=load_cb,
)What you can do once memory's wired in
- Pattern 1: pass the callbacks to `Agency(...)` and the entire flat-list thread state round-trips through Engram, so process restarts don't lose work.
- Pattern 2: give every agent the two `BaseTool` subclasses so they can curate shared facts mid-run.
- Combine both: the callbacks capture the full transcript, the tools capture extracted, agent-curated knowledge. One bucket, two layers.
- Per-agency buckets for multi-tenant agency deployments
FAQ
Do I need both patterns?
In production, we recommend yes. Callbacks give you passive durability (no agent has to remember to persist), and tools give you active curation (the agent decides what's worth keeping). They share one bucket.
Where does the bucket come from?
`make_engram_callbacks(bucket='my-agency')` for callbacks, and the `ENGRAM_BUCKET` env var for the tools (since `BaseTool` instances are class-level and don't accept a bucket arg directly).
Why `BaseTool` and not `@function_tool`?
Since agency-swarm's v1 migration onto the OpenAI Agents SDK, `@function_tool` is the recommended path and `BaseTool` is the legacy/compat surface. We ship `BaseTool` subclasses because they fit cleanly with the callback-driven thread persistence. If you'd rather use `@function_tool`-decorated callables, the same Engram client can sit underneath with a few lines of glue.
Is this PyPI-installable?
Not yet. Clone the repo and `pip install -e .`, or vendor `engram_agency_swarm.py`. Recipe-style integration for now, with PyPI release coming.
Related integrations
Ship durable memory in Agency Swarm today
Free tier: 10K memories and 50K retrievals per month. No credit card. Same Engram backend powers all 41 integrations, so memories you write from one client are immediately queryable from the rest.