Integration

Long-term memory for smolagents

smolagents is HuggingFace's lightweight agent framework: `CodeAgent`, `ToolCallingAgent`, fast iteration. Memory between `agent.run()` calls is the obvious gap. `engram_tools(bucket=...)` returns two `Tool` instances that pass straight into any smolagents agent.

Install

Three steps: sign up for an Engram API key, paste a BYOK LLM-provider key on /models, then drop the snippet below into smolagents.

Three steps to memory in your agent

  1. Sign up. Free, no card. You'll land on a Getting Started page that walks the next two steps.
  2. Add your LLM key. Engram is BYOK. Paste an OpenAI / Anthropic / Groq / Together / Fireworks key and we'll route every extraction and query call through your provider. You pay your provider directly. We never see your inference.
  3. Paste the snippet below into your agent and restart it. Use Authorization: Bearer <api-key>with the API key from your portal.

engram-smolagents: durable memory for HuggingFace's lightweight agent framework

Two smolagents Tool instances bound to one Engram bucket. Plug into any CodeAgent or ToolCallingAgent and the agent gains persistent cross-session memory. Source: github.com/lumetra-io/engram-smolagents.

  1. Install:
  2. Terminal
    pip install lumetra-engram smolagents
  3. Vendor engram_smolagents.py (~50 LOC) and export your API key:
  4. Terminal
    curl -fsSL https://raw.githubusercontent.com/lumetra-io/engram-smolagents/main/engram_smolagents.py -o engram_smolagents.py
    export ENGRAM_API_KEY="<api-key>"
  5. Pass the tools to a CodeAgent:
  6. Python
    from smolagents import CodeAgent, LiteLLMModel
    from engram_smolagents import engram_tools
    
    agent = CodeAgent(
        tools=engram_tools(bucket="my-agent"),
        model=LiteLLMModel(model_id="deepseek/deepseek-chat"),
    )
    
    agent.run("Remember that I prefer dark mode and use vim for everything.")

What you can do once memory's wired in

  • Add memory to a `CodeAgent` for repeated code-writing tasks where context accumulates over time
  • Pin a per-user bucket so the agent's preferences stay consistent across sessions
  • Pair with smolagents' `LiteLLMModel` to run fully local: Engram for memory, local model for inference, all your data on-prem
  • Use as a research-agent backbone, one bucket per investigation, and have the agent recall past findings before continuing

FAQ

Does this work with both `CodeAgent` and `ToolCallingAgent`?

Yes. Both accept `tools=[...]`, and the two Engram tools register the same way regardless of agent type.

Why `forward` instead of `_run`?

smolagents' tool convention is `forward(...)` on `Tool` subclasses. The Engram tools follow that convention, so you don't need to do anything special.

Is it PyPI-installable?

Vendor `engram_smolagents.py` from the repo (~50 LOC). PyPI coming.

Ship durable memory in smolagents today

Free tier: 10K memories and 50K retrievals per month. No credit card. Same Engram backend powers all 41 integrations, so memories you write from one client are immediately queryable from the rest.