Glossary

AI agent memory, defined.

Plain-English definitions of the terms that come up most often across the Engram docs, blog, and pricing. Each entry is Engram-aware where that matters, vendor-neutral where it doesn't.

BM25

A classical ranking function for exact-token lexical retrieval.

BM25 ("Best Match 25") scores how well a document matches a query based on term frequency, document length, and inverse document frequency. It is the lexical-retrieval workhorse — fast, deterministic, and unbeatable at exact recall (emails, IDs, file paths, error strings, model names). Engram runs BM25 alongside vector search and a knowledge graph, fuses the scores with reciprocal rank fusion, and reranks with a cross-encoder before returning results.
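The scoring described above can be sketched in a few lines of plain Python. This is the classic Okapi BM25 formula (term frequency, inverse document frequency, and length normalisation via `k1` and `b`), not Engram's internal implementation:

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one tokenised document against a query with Okapi BM25."""
    avgdl = sum(len(d) for d in corpus) / len(corpus)  # average doc length
    n_docs = len(corpus)
    score = 0.0
    for term in query_terms:
        tf = doc_terms.count(term)
        if tf == 0:
            continue  # term absent: contributes nothing
        df = sum(1 for d in corpus if term in d)  # document frequency
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
        # Saturating term-frequency component, normalised by doc length.
        norm = tf + k1 * (1 - b + b * len(doc_terms) / avgdl)
        score += idf * (tf * (k1 + 1)) / norm
    return score
```

A document that never mentions a query term scores exactly zero, which is why BM25 is so reliable for exact-token recall of IDs and error strings.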

Bucket

A namespace inside a tenant for grouping related memories.

A bucket is the user-facing scope for memories inside Engram. A single tenant can have many buckets — one per project, one per agent, one per end-user, whatever shape your product needs. Retrieval is bucket-scoped by default, and buckets are never a pricing dimension, so you can make as many as you need. Buckets compose with optional agent_id and run_id filters for finer isolation inside multi-agent tenants.
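A bucket-scoped query with the optional narrowing filters might be assembled like this. The field names here are illustrative, not the documented Engram request schema:

```python
def build_query(bucket, text, agent_id=None, run_id=None):
    """Build a hypothetical bucket-scoped query payload."""
    payload = {"bucket": bucket, "query": text}
    if agent_id:
        payload["agent_id"] = agent_id  # narrow to one agent inside the bucket
    if run_id:
        payload["run_id"] = run_id      # narrow further to a single run
    return payload

# Retrieval stays inside one bucket; filters only tighten the scope.
query = build_query("project-alpha", "what did we decide about the deadline?",
                    agent_id="planner")
```

Omitting `agent_id` and `run_id` searches the whole bucket, which is the default.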

Explainability

Returning a trace of why each memory was recalled, not just the memory.

An explainable memory system returns, on every recall, the supporting evidence: which engines fired, what they matched on, the graph facts that contributed, the per-memory scores, and the canonical profile the composer saw. When an agent answers wrong, the trace tells you whether retrieval missed the right memory or the composer misread what it was given. "cosine: 0.87" is a debug log entry, not an explanation; "BM25 matched on 'invoice'; graph linked it to the customer entity" is.
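A recall trace of this shape can be rendered into the kind of human-readable explanation quoted above. The trace structure here is an assumption for illustration, not Engram's actual response schema:

```python
def explain(trace):
    """Turn a per-memory recall trace into a one-line explanation."""
    parts = []
    for engine in trace["engines"]:
        parts.append(f"{engine['name']} matched on {engine['matched']!r}")
    for fact in trace.get("graph_facts", []):
        parts.append(f"graph linked {fact}")
    return "; ".join(parts)

trace = {
    "engines": [{"name": "BM25", "matched": "invoice"}],
    "graph_facts": ["memory -> customer entity"],
}
```

`explain(trace)` yields a sentence a developer can act on, rather than a bare similarity score.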

Knowledge graph

Entities and the labelled relationships between them.

A knowledge graph stores facts as subject–predicate–object triples ("Alice owns the payments migration", "Sam pairs with Priya") and lets retrieval traverse those relationships at query time. It is what makes relational questions ("who is the owner of X?", "what depends on Y?") tractable — vector search alone collapses relations into similarity. Engram extracts triples at write time and uses them as the third retrieval engine alongside BM25 and vectors.
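A minimal triple store makes the relational-query point concrete. This toy version answers "what does Alice own?" by direct lookup rather than similarity; it is a sketch, not Engram's graph engine:

```python
from collections import defaultdict

class TripleStore:
    """Facts as (subject, predicate, object) triples, indexed by subject."""

    def __init__(self):
        self.by_subject = defaultdict(list)

    def add(self, subject, predicate, obj):
        self.by_subject[subject].append((predicate, obj))

    def query(self, subject, predicate):
        """Answer a relational question like 'what does <subject> own?'."""
        return [o for p, o in self.by_subject[subject] if p == predicate]

kg = TripleStore()
kg.add("Alice", "owns", "payments migration")
kg.add("Sam", "pairs_with", "Priya")
```

Because the relation is stored explicitly, the answer is exact; a vector search over the same facts would only rank sentences by how similar they sound to the question.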

MCP (Model Context Protocol)

An open standard for connecting AI applications to external systems.

MCP is the open protocol introduced by Anthropic in late 2024 and adopted across Claude Code, Cursor, Windsurf, ChatGPT Connectors, OpenCode, OpenClaw, and many other AI clients. It standardises how a host application discovers and calls remote tools, resources, and prompts. Engram exposes its memory tools (store_memory, query_memory, list_buckets, etc.) over MCP at https://mcp.lumetra.io, so any MCP-compatible client can use it with a config-file install. See the MCP specification for the full protocol.
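The config-file install mentioned above typically means adding an entry like the following to the client's MCP configuration. The exact top-level key and file location vary by client (check your client's documentation); this shape is illustrative:

```json
{
  "mcpServers": {
    "engram": {
      "url": "https://mcp.lumetra.io"
    }
  }
}
```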

Memory (in Engram)

A persisted, retrievable record about a user, conversation, or project.

In Engram, a "memory" is a stored row representing a fact, decision, preference, or outcome the agent should remember across sessions. Each memory carries its raw content, an embedding for vector recall, normalised text for BM25 recall, optional extracted graph triples, and scoping metadata (tenant, bucket, optional agent/run). Memories persist until explicitly deleted; recency-aware ranking demotes older facts without removing them.

Profile (canonical user profile)

A structured summary of a user, regenerated after meaningful updates.

The canonical profile is a structured JSON summary of who a user is and what is known about them, generated by a single LLM call over the conversation history and cached on the bucket. The composer reads it on every query so the agent has a coherent picture of the user without re-summarising 500 memories per turn. Regeneration is triggered by the first query against a fresh bucket and by explicit POST /v1/buckets/{id}/profile/regenerate calls.

Rerank (cross-encoder rerank)

A second-pass scoring step that reorders candidate memories by relevance.

After BM25, vectors, and the knowledge graph each return their top-k candidates, a cross-encoder model scores every candidate against the query directly — not by similarity in an embedding space, but by reading the query and the candidate together and predicting relevance. Reranking is the single highest-leverage retrieval step for accuracy and is what closes the gap between "the right memory was somewhere in the top 50" and "the right memory is at position 1 or 2".
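The step that produces the candidate list the reranker reads is the score fusion mentioned under BM25: reciprocal rank fusion. It merges the ranked lists from the three engines using only each candidate's rank, and fits in a few lines (this is the standard RRF formula, not Engram's internal code):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion over several ranked lists of candidate ids.

    Each candidate scores sum(1 / (k + rank)) across every list it
    appears in, so items ranked well by multiple engines rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF uses ranks rather than raw scores, it sidesteps the problem that BM25 scores, cosine similarities, and graph-match counts live on incomparable scales; the cross-encoder then rescores the fused list.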

Retrieval (one retrieval)

A single end-to-end query against the memory layer.

In Engram's pricing and metering, "one retrieval" is one call to the memory layer — typically an MCP query_memory tool call or a POST /v1/query HTTP request — regardless of how many internal engines run. Retrievals are the dominant cost driver of an agent-memory deployment and are metered separately from stored memories.

Tenant

The top-level isolation boundary for a customer.

A tenant is the highest-level scoping axis in Engram — typically one tenant per customer account. Cross-tenant isolation is enforced as the leading filter on every retrieval query, backed by foreign-key cascades on tenant deletion. Inside a tenant, buckets carve out per-project or per-user namespaces.
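The two enforcement mechanisms named above (a leading tenant filter on every query, and cascade deletion) can be sketched with SQLite. The schema is illustrative, not Engram's actual tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs per-connection
conn.execute("CREATE TABLE tenants (id TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE memories (
        id INTEGER PRIMARY KEY,
        tenant_id TEXT NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
        content TEXT NOT NULL
    )""")
conn.execute("INSERT INTO tenants VALUES ('acme'), ('globex')")
conn.execute("INSERT INTO memories (tenant_id, content) "
             "VALUES ('acme', 'prefers dark mode')")
conn.execute("INSERT INTO memories (tenant_id, content) "
             "VALUES ('globex', 'ships on Fridays')")

# The tenant id leads every query, so other tenants' rows never surface.
rows = conn.execute(
    "SELECT content FROM memories WHERE tenant_id = ? AND content LIKE ?",
    ("acme", "%mode%")).fetchall()

# Deleting a tenant cascades to all of its memories.
conn.execute("DELETE FROM tenants WHERE id = 'globex'")
remaining = conn.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
```

After the delete, only the surviving tenant's rows remain, which is the behaviour the foreign-key cascades guarantee.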

Vector search

Retrieval by similarity in a learned embedding space.

Vector search encodes content into high-dimensional vectors (via a model like a sentence-transformer) and returns the nearest neighbours of a query vector in that space. It catches paraphrases that BM25 misses ("how do I cancel my plan?" vs "what's the unsubscribe flow?") but stumbles on exact-token recall and relational queries. Engram uses pgvector for its vector engine.
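The nearest-neighbour idea can be shown with a toy brute-force search over hand-made vectors. Real deployments use an approximate index (pgvector in Engram's case) and model-produced embeddings; this sketch only illustrates the ranking principle:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec, corpus):
    """Return corpus ids ordered by similarity to the query vector."""
    return sorted(corpus, key=lambda k: cosine(query_vec, corpus[k]),
                  reverse=True)

# Toy 2-d "embeddings"; a real model produces hundreds of dimensions.
corpus = {
    "how do I cancel my plan?": [1.0, 0.1],
    "invoice #4821 is overdue":  [0.0, 1.0],
}
```

A query vector near `[1.0, 0.1]` ranks the cancellation memory first even with zero token overlap, which is exactly the paraphrase recall that BM25 cannot provide.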