# Memory System
How Baker Street's vector memory and observational memory give the agent long-term recall.
Baker Street's memory system gives the agent persistent, semantic recall across every conversation. Unlike stateless chat interfaces that forget everything between sessions, Baker Street remembers who you are, what you work on, and what you prefer.
## Two-Layer Architecture
Memory operates in two layers that work together.
### Vector Memory (Qdrant + Voyage AI)
The primary memory layer stores factual knowledge as vector embeddings in Qdrant. When you send a message, the Brain:
1. Embeds the message using the Voyage AI embedding model
2. Searches Qdrant for memories with high cosine similarity
3. Injects relevant memories into the system prompt before calling Claude
This means the agent always has context about you. If you mentioned your Kubernetes cluster runs k3s two weeks ago, the agent will recall that when you ask about cluster operations today.
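The retrieve-and-inject flow above can be sketched as follows. This is an illustrative outline, not Baker Street's actual code: `embed_message` and `search_qdrant` are hypothetical stand-ins for the Voyage AI and Qdrant client calls, and the prompt format is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    category: str
    content: str
    score: float  # cosine similarity to the query

def embed_message(text: str) -> list[float]:
    """Stub standing in for a Voyage AI embedding call."""
    return [float(len(text) % 7), 1.0, 0.5]

def search_qdrant(vector: list[float], limit: int = 10) -> list[Memory]:
    """Stub standing in for a Qdrant similarity search."""
    return [Memory("fact", "User's cluster runs k3s on 3 Raspberry Pis", 0.87)]

def build_system_prompt(base: str, user_message: str) -> str:
    # Embed the incoming message, search for similar memories,
    # and prepend the hits to the system prompt before calling Claude.
    vector = embed_message(user_message)
    memories = search_qdrant(vector)
    recalled = "\n".join(f"- [{m.category}] {m.content}" for m in memories)
    return f"{base}\n\nRelevant memories:\n{recalled}"
```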
Automatic deduplication prevents memory bloat. When a new memory is stored, Baker Street checks for existing memories above 92% cosine similarity. If a near-duplicate exists, the new memory merges with the existing one rather than creating a duplicate entry.
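The deduplication check can be sketched like this. The 0.92 threshold comes from the configuration below; the merge policy shown (replacing the old text) is an assumption, since the document does not specify how merging works.

```python
import math

DEDUP_THRESHOLD = 0.92  # matches MEMORY_SIMILARITY_THRESHOLD

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def store_or_merge(new_vec, new_text, existing):
    """existing: list of (vector, text) pairs. If a stored memory is
    at least 92% similar, merge into it instead of adding a duplicate.
    The merge strategy here (overwrite the text) is illustrative."""
    for i, (vec, _text) in enumerate(existing):
        if cosine_similarity(new_vec, vec) >= DEDUP_THRESHOLD:
            existing[i] = (vec, new_text)  # merge, don't duplicate
            return existing
    existing.append((new_vec, new_text))
    return existing
```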
### Observational Memory (SQLite)
The second layer captures higher-level patterns. After each conversation, the Brain's observer process (running Haiku for efficiency) extracts structured observations:
- Decisions the user made
- Preferences they expressed
- Facts about their environment
- Issues they are tracking
- Procedures they described
A reflector periodically compresses these observations into abstract knowledge. This gives the agent a layered understanding: raw memories at the bottom, synthesized knowledge at the top.
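One plausible shape for these observations is sketched below. The field names and the toy reflector are assumptions; in Baker Street the reflector runs a model over the observations rather than simple string grouping.

```python
from dataclasses import dataclass

# The five observation kinds listed above.
OBSERVATION_KINDS = {"decision", "preference", "fact", "issue", "procedure"}

@dataclass
class Observation:
    kind: str
    summary: str
    conversation_id: str

    def __post_init__(self):
        if self.kind not in OBSERVATION_KINDS:
            raise ValueError(f"unknown observation kind: {self.kind}")

def reflect(observations: list[Observation]) -> list[str]:
    """Toy reflector: collapse observations of the same kind into one
    synthesized line each. The real reflector is LLM-driven."""
    by_kind: dict[str, list[str]] = {}
    for o in observations:
        by_kind.setdefault(o.kind, []).append(o.summary)
    return [f"{kind}: {', '.join(items)}" for kind, items in sorted(by_kind.items())]
```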
## Six Memory Categories
Memories are organized into six categories to keep retrieval focused:
| Category | Description | Example |
|---|---|---|
| `conversation` | Context from past interactions | "User asked about Helm chart upgrades" |
| `fact` | Factual knowledge about the user/world | "User's cluster runs k3s on 3 Raspberry Pis" |
| `preference` | User preferences and opinions | "User prefers YAML over JSON for configs" |
| `procedure` | How-to knowledge and workflows | "Deploy with: kubectl apply -k overlays/prod" |
| `reference` | Reference material and documentation | "NATS JetStream consumer config format" |
| `reflection` | Synthesized insights from the reflector | "User is building a homelab AI platform" |
When the Brain searches for relevant context, it can weight categories differently based on the conversation topic.
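Category weighting might look like the sketch below: similarity scores are multiplied by a per-category boost before ranking. The weight values and function names are illustrative assumptions, not Baker Street's API.

```python
# Neutral weights for all six categories; a topic-aware caller
# overrides the ones it cares about.
DEFAULT_WEIGHTS = {
    "conversation": 1.0, "fact": 1.0, "preference": 1.0,
    "procedure": 1.0, "reference": 1.0, "reflection": 1.0,
}

def rerank(results, weights=None):
    """results: list of (category, content, score) tuples.
    Returns results sorted by weighted score, highest first."""
    w = {**DEFAULT_WEIGHTS, **(weights or {})}
    return sorted(results, key=lambda r: r[2] * w.get(r[0], 1.0), reverse=True)
```

For a how-to question, for example, the Brain could pass `{"procedure": 1.5}` so procedural memories outrank slightly more similar conversational ones.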
## Memory Tools
Claude has direct access to memory through built-in tools:
- `memory_store` -- Save a new memory with category and content
- `memory_search` -- Search for memories by semantic similarity
- `memory_list` -- List recent memories, optionally filtered by category
The agent decides autonomously when to store and retrieve memories based on conversation context. You can also explicitly ask: "Remember that my production cluster is on GKE" or "What do you remember about my setup?"
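For illustration, `memory_store` could be declared to Claude using the Anthropic tool-use format roughly as follows. The exact schema Baker Street registers is an assumption; only the tool name and the six categories come from this document.

```python
# Hypothetical tool declaration in Anthropic's tool-use format:
# a name, a description, and a JSON Schema for the input.
MEMORY_STORE_TOOL = {
    "name": "memory_store",
    "description": "Save a new memory with a category and content.",
    "input_schema": {
        "type": "object",
        "properties": {
            "category": {
                "type": "string",
                "enum": ["conversation", "fact", "preference",
                         "procedure", "reference", "reflection"],
            },
            "content": {"type": "string"},
        },
        "required": ["category", "content"],
    },
}
```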
## Configuration
Key memory settings are configured via environment variables on the Brain:
```yaml
QDRANT_URL: "http://qdrant:6333"
QDRANT_COLLECTION: "baker-memories"
VOYAGEAI_API_KEY: "pa-..."
MEMORY_SIMILARITY_THRESHOLD: "0.92"  # Deduplication threshold
MEMORY_SEARCH_LIMIT: "10"            # Max results per search
EMBEDDING_MODEL: "voyage-2"          # Voyage AI model
```
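A minimal sketch of how the Brain might read these settings, assuming the defaults shown match the values above (the function name and defaulting behavior are illustrative):

```python
import os

def load_memory_config(env=None):
    """Read memory settings from environment variables, falling back
    to the documented defaults. Numeric values are parsed from strings."""
    if env is None:
        env = os.environ
    return {
        "qdrant_url": env.get("QDRANT_URL", "http://qdrant:6333"),
        "collection": env.get("QDRANT_COLLECTION", "baker-memories"),
        "similarity_threshold": float(env.get("MEMORY_SIMILARITY_THRESHOLD", "0.92")),
        "search_limit": int(env.get("MEMORY_SEARCH_LIMIT", "10")),
        "embedding_model": env.get("EMBEDDING_MODEL", "voyage-2"),
    }
```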
## Memory in the Web UI
The Web UI includes a memory browser where you can view, search, and manage stored memories. You can see what the agent remembers, delete incorrect entries, and browse by category.