Every LLM call starts blind. SteelSpine's memory proxy injects persistent entity context into every prompt. Works with Ollama, LM Studio, llama.cpp, vLLM, and any OpenAI-compatible endpoint. No SDK. No framework changes. Memory lives on your machine.
CA$29.99/mo after trial · No vendor lock-in · Memory stays on your machine
LLMs are stateless by default. Every conversation, every agent run, every NPC interaction starts from zero context. You either stuff growing context windows (slow + expensive) or you build a memory layer yourself (months of engineering, lots of edge cases).
The third option is a transparent proxy that handles memory automatically. Mem0 and Letta
proved this category exists. SteelSpine's memory-agent implements the same pattern
with two structural advantages: local-first storage (memory lives in
~/.prime/entities/ on your machine, not a vendor's cloud) and
cryptographic audit chain (every memory mutation is signed and verifiable).
Three minutes to set up. One URL change. No SDK install.
steelspine start
# Memory proxy now listening on http://localhost:11435
# Auto-detects local LLM (Ollama, LM Studio, llama.cpp, vLLM)
# Instead of:
# OLLAMA_HOST=http://localhost:11434
# Use:
export OLLAMA_HOST=http://localhost:11435
# Or for any OpenAI-compatible client:
export OPENAI_BASE_URL=http://localhost:11435/v1
That's it. The proxy auto-extracts entities from each conversation turn, persists them to
~/.prime/entities/<name>.json, and injects relevant context into the next
prompt automatically. Your agent code does not change. Your framework does not change.
Same LLM. Same prompt. The proxy injects entity context automatically. Same one URL change, different conversational quality.
No SDK install. No framework changes. The proxy speaks OpenAI's chat completions API, so any client (LangChain, LlamaIndex, custom, terminal) that points at it gets memory automatically. Same with LM Studio, llama.cpp, vLLM, and any OpenAI-compatible endpoint.
All entity memory is stored at ~/.prime/entities/ on your filesystem. No cloud
sync. No vendor servers. No data residency complications. Air-gapped deployments are
first-class. Memory survives vendor outages, account terminations, and ToS changes.
Memory mutations append to the same HMAC-SHA256 + Ed25519 hash chain that powers SteelSpine's compliance audit. Tampering with stored memory breaks the chain. Useful for regulated environments where AI-modified data has audit-trail requirements.
The proxy speaks OpenAI's chat completions API. Any client (LangChain, LlamaIndex, custom, terminal) that points at it gets memory automatically. Works with Ollama, LM Studio, llama.cpp, vLLM, OpenAI itself, Anthropic via gateway, anything that speaks the same wire protocol.
Memory does not just persist; it has a replay surface. Reconstruct what an entity knew at
any point in time with state-at <event_id>. Branch alternative memory
states for what-if exploration. Debug memory bugs deterministically.
The memory category exists. SteelSpine is the local-first + audit-grade entry. Honest comparison:
| Property | Mem0 | Letta | Zep | SteelSpine Memory |
|---|---|---|---|---|
| Pricing entry | $19/mo retail, $249/mo Pro | Open source (self-host) | Free tier + credits | CA$29.99/mo + free trial |
| Local-first storage | Cloud-first | Self-hosted | Cloud-first | Local-first default |
| Cryptographic audit chain | No | No | No | HMAC + Ed25519 signed |
| SDK required | SDK | SDK / Python framework | SDK | No SDK — transparent proxy |
| Setup steps | npm install + API key | Self-host stack | API key + integration | One env var |
| Works with any OpenAI-compatible LLM | SDK-specific | Framework-specific | SDK-specific | Any OpenAI-compatible endpoint |
| Replay / branch memory state | No | No | No | replay-branch, state-at |
| Bundled with audit + replay infrastructure | Memory only | Memory only | Memory only | Full SteelSpine product |
Mem0 has $24M Series A and 48k GitHub stars. Strong product. SteelSpine differentiates on local-first + cryptographic audit + drop-in transparent proxy + bundled with full audit/replay stack. Same category, structurally different positioning.
One tier. Memory plus the full SteelSpine stack (debug, replay, verify, branch, OTEL, MCP) included.
Multi-seat team or enterprise deployment? See DevOps tier or compliance tier.