Memory System

How Baker Street's vector memory and observational memory give the agent long-term recall.

Baker Street's memory system gives the agent persistent, semantic recall across every conversation. Unlike stateless chat interfaces that forget everything between sessions, Baker Street remembers who you are, what you work on, and what you prefer.

Two-Layer Architecture

Memory operates in two layers that work together.

Vector Memory (Qdrant + Voyage AI)

The primary memory layer stores factual knowledge as vector embeddings in Qdrant. When you send a message, the Brain:

  1. Embeds the message using the Voyage AI embedding model
  2. Searches Qdrant for memories with high cosine similarity
  3. Injects relevant memories into the system prompt before calling Claude

This means the agent always has context about you. If you mentioned your Kubernetes cluster runs k3s two weeks ago, the agent will recall that when you ask about cluster operations today.
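The three steps above can be sketched in pure Python. Toy three-dimensional embeddings and an in-memory list stand in for Voyage AI and Qdrant, and `retrieve_memories` is an illustrative name, not Baker Street's actual API:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_memories(query_vector, store, limit=10):
    """Rank stored memories by cosine similarity and return the top matches."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in store]
    scored.sort(reverse=True)
    return [text for _, text in scored[:limit]]

# Toy store of (memory text, embedding) pairs.
store = [
    ("User's cluster runs k3s", [0.9, 0.1, 0.0]),
    ("User prefers YAML configs", [0.1, 0.9, 0.0]),
]

# Step 1 would embed the user's message; here the vector is given directly.
context = retrieve_memories([0.8, 0.2, 0.1], store, limit=1)

# Step 3: inject the recalled memories into the system prompt.
system_prompt = "Relevant memories:\n" + "\n".join(context)
```

The real pipeline replaces the toy embeddings with Voyage AI vectors and the list scan with a Qdrant similarity search, but the ranking logic is the same idea.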

Automatic deduplication prevents memory bloat. When a new memory is stored, Baker Street checks for existing memories above 92% cosine similarity. If a near-duplicate exists, the new memory merges with the existing one rather than creating a duplicate entry.
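The deduplication check can be sketched the same way. The merge strategy here (keep the longer text) is a placeholder; the real merge behavior is internal to Baker Street:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def store_memory(new_text, new_vec, store, threshold=0.92):
    """Merge into an existing near-duplicate, or append as a new entry."""
    for i, (text, vec) in enumerate(store):
        if cosine_similarity(new_vec, vec) >= threshold:
            # Placeholder merge: keep the more detailed (longer) text.
            store[i] = (max(text, new_text, key=len), vec)
            return "merged"
    store.append((new_text, new_vec))
    return "stored"
```

A vector within the 0.92 threshold of an existing memory updates that entry in place; anything below it becomes a new entry.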

Observational Memory (SQLite)

The second layer captures higher-level patterns. After each conversation, the Brain's observer process (which runs the smaller Claude Haiku model for cost efficiency) extracts structured observations:

  • Decisions the user made
  • Preferences they expressed
  • Facts about their environment
  • Issues they are tracking
  • Procedures they described

A reflector periodically compresses these observations into abstract knowledge. This gives the agent a layered understanding: raw memories at the bottom, synthesized knowledge at the top.
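One way to picture the observer's output and the reflector's compression step, as a toy sketch (the field names and the join-based "synthesis" are illustrative, not Baker Street's actual schema):

```python
from dataclasses import dataclass

@dataclass
class Observation:
    kind: str      # e.g. "decision", "preference", "fact", "issue", "procedure"
    content: str

def reflect(observations):
    """Toy stand-in for the reflector: collapse raw observations
    into one synthesized line per kind."""
    by_kind = {}
    for obs in observations:
        by_kind.setdefault(obs.kind, []).append(obs.content)
    return {kind: "; ".join(items) for kind, items in by_kind.items()}
```

The real reflector produces abstract knowledge rather than concatenated strings, but the shape is the same: many raw observations in, fewer synthesized entries out.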

Six Memory Categories

Memories are organized into six categories to keep retrieval focused:

Category       Description                               Example
conversation   Context from past interactions            "User asked about Helm chart upgrades"
fact           Factual knowledge about the user/world    "User's cluster runs k3s on 3 Raspberry Pis"
preference     User preferences and opinions             "User prefers YAML over JSON for configs"
procedure      How-to knowledge and workflows            "Deploy with: kubectl apply -k overlays/prod"
reference      Reference material and documentation      "NATS JetStream consumer config format"
reflection     Synthesized insights from the reflector   "User is building a homelab AI platform"

When the Brain searches for relevant context, it can weight categories differently based on the conversation topic.
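Category weighting can be pictured as scaling each result's raw similarity score before ranking. This is a hypothetical scheme to show the idea, not Baker Street's actual weights:

```python
def weighted_score(similarity, category, weights):
    """Scale a raw cosine similarity by a per-category weight.
    Categories absent from the weight map keep their raw score."""
    return similarity * weights.get(category, 1.0)

# Hypothetical: boost procedures during a deployment conversation,
# de-emphasize preferences.
deploy_weights = {"procedure": 1.5, "preference": 0.8}
```

A `procedure` memory at similarity 0.8 would then outrank a `preference` memory at 0.9, because 0.8 × 1.5 > 0.9 × 0.8.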

Memory Tools

Claude has direct access to memory through built-in tools:

  • memory_store -- Save a new memory with category and content
  • memory_search -- Search for memories by semantic similarity
  • memory_list -- List recent memories, optionally filtered by category

The agent decides autonomously when to store and retrieve memories based on conversation context. You can also explicitly ask: "Remember that my production cluster is on GKE" or "What do you remember about my setup?"
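A tool like memory_store would be declared to Claude with a JSON schema along these lines, following the Anthropic API's tool-definition format. The exact schema is Baker Street's own, so treat this as an illustrative sketch:

```python
# Hypothetical declaration of the memory_store tool in the
# Anthropic tool-use format (name/description/input_schema).
memory_store_tool = {
    "name": "memory_store",
    "description": "Save a new memory with a category and content.",
    "input_schema": {
        "type": "object",
        "properties": {
            "category": {
                "type": "string",
                "enum": ["conversation", "fact", "preference",
                         "procedure", "reference", "reflection"],
            },
            "content": {"type": "string"},
        },
        "required": ["category", "content"],
    },
}
```

Constraining `category` to the six known values lets the model's tool calls be validated before anything is written to Qdrant.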

Configuration

Key memory settings are configured via environment variables on the Brain:

QDRANT_URL: "http://qdrant:6333"
QDRANT_COLLECTION: "baker-memories"
VOYAGEAI_API_KEY: "pa-..."
MEMORY_SIMILARITY_THRESHOLD: "0.92"  # Deduplication threshold
MEMORY_SEARCH_LIMIT: "10"           # Max results per search
EMBEDDING_MODEL: "voyage-2"         # Voyage AI model
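Inside the Brain, these settings would be read from the environment with sensible defaults. `load_memory_config` is a hypothetical helper showing one way to do that in Python:

```python
import os

def load_memory_config():
    """Read the Brain's memory settings from environment variables,
    falling back to the documented defaults. Illustrative helper only."""
    return {
        "qdrant_url": os.environ.get("QDRANT_URL", "http://qdrant:6333"),
        "collection": os.environ.get("QDRANT_COLLECTION", "baker-memories"),
        "similarity_threshold": float(
            os.environ.get("MEMORY_SIMILARITY_THRESHOLD", "0.92")
        ),
        "search_limit": int(os.environ.get("MEMORY_SEARCH_LIMIT", "10")),
        "embedding_model": os.environ.get("EMBEDDING_MODEL", "voyage-2"),
    }
```

Note that the numeric settings arrive as strings and must be cast before use; the API key is read separately by the Voyage AI client and is deliberately omitted here.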

Memory in the Web UI

The Web UI includes a memory browser where you can view, search, and manage stored memories. You can see what the agent remembers, delete incorrect entries, and browse by category.