Memory System

How Baker Street's vector memory and observational memory give the agent long-term recall.

Baker Street's memory system gives the agent persistent, semantic recall across every conversation. Unlike stateless chat interfaces that forget everything between sessions, Baker Street remembers who you are, what you work on, and what you prefer.

Two-Layer Architecture

Memory operates in two layers that work together.

Vector Memory (Qdrant + Voyage AI)

The primary memory layer stores factual knowledge as vector embeddings in Qdrant. When you send a message, the Brain:

  1. Embeds the message using the Voyage AI embedding model
  2. Searches Qdrant for memories with high cosine similarity
  3. Injects relevant memories into the system prompt before calling Claude

This means the agent always has context about you. If you mentioned your Kubernetes cluster runs k3s two weeks ago, the agent will recall that when you ask about cluster operations today.
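The three steps above can be sketched in pure Python. Toy three-dimensional embeddings and an in-memory list stand in for Voyage AI and Qdrant, and `retrieve_memories` is an illustrative name, not Baker Street's actual API:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_memories(query_vector, store, limit=10):
    """Rank stored memories by cosine similarity and return the top matches."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in store]
    scored.sort(reverse=True)
    return [text for _, text in scored[:limit]]

# Toy store of (memory text, embedding) pairs.
store = [
    ("User's cluster runs k3s", [0.9, 0.1, 0.0]),
    ("User prefers YAML configs", [0.1, 0.9, 0.0]),
]

# Step 1 would embed the user's message; here the vector is given directly.
context = retrieve_memories([0.8, 0.2, 0.1], store, limit=1)

# Step 3: inject the recalled memories into the system prompt.
system_prompt = "Relevant memories:\n" + "\n".join(context)
```

The real pipeline replaces the toy embeddings with Voyage AI vectors and the list scan with a Qdrant similarity search, but the ranking logic is the same idea.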

Automatic deduplication prevents memory bloat. When a new memory is stored, Baker Street checks for existing memories above 92% cosine similarity. If a near-duplicate exists, the new memory merges with the existing one rather than creating a duplicate entry.
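The deduplication check can be sketched the same way. The merge strategy here (keep the longer text) is a placeholder; the real merge behavior is internal to Baker Street:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def store_memory(new_text, new_vec, store, threshold=0.92):
    """Merge into an existing near-duplicate, or append as a new entry."""
    for i, (text, vec) in enumerate(store):
        if cosine_similarity(new_vec, vec) >= threshold:
            # Placeholder merge: keep the more detailed (longer) text.
            store[i] = (max(text, new_text, key=len), vec)
            return "merged"
    store.append((new_text, new_vec))
    return "stored"
```

A vector within the 0.92 threshold of an existing memory updates that entry in place; anything below it becomes a new entry.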

Observational Memory (SQLite)

The second layer captures higher-level patterns. After each conversation, the Brain's observer process (which runs the smaller Claude Haiku model for cost efficiency) extracts structured observations:

  • Decisions the user made
  • Preferences they expressed
  • Facts about their environment
  • Issues they are tracking
  • Procedures they described

A reflector periodically compresses these observations into abstract knowledge. This gives the agent a layered understanding: raw memories at the bottom, synthesized knowledge at the top.
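One way to picture the observer's output and the reflector's compression step, as a toy sketch (the field names and the join-based "synthesis" are illustrative, not Baker Street's actual schema):

```python
from dataclasses import dataclass

@dataclass
class Observation:
    kind: str      # e.g. "decision", "preference", "fact", "issue", "procedure"
    content: str

def reflect(observations):
    """Toy stand-in for the reflector: collapse raw observations
    into one synthesized line per kind."""
    by_kind = {}
    for obs in observations:
        by_kind.setdefault(obs.kind, []).append(obs.content)
    return {kind: "; ".join(items) for kind, items in by_kind.items()}
```

The real reflector produces abstract knowledge rather than concatenated strings, but the shape is the same: many raw observations in, fewer synthesized entries out.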

Six Memory Categories

Memories are organized into six categories to keep retrieval focused:

Category       Description                               Example
conversation   Context from past interactions            "User asked about Helm chart upgrades"
fact           Factual knowledge about the user/world    "User's cluster runs k3s on 3 Raspberry Pis"
preference     User preferences and opinions             "User prefers YAML over JSON for configs"
procedure      How-to knowledge and workflows            "Deploy with: kubectl apply -k overlays/prod"
reference      Reference material and documentation      "NATS JetStream consumer config format"
reflection     Synthesized insights from the reflector   "User is building a homelab AI platform"

When the Brain searches for relevant context, it can weight categories differently based on the conversation topic.
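Category weighting can be pictured as scaling each result's raw similarity score before ranking. This is a hypothetical scheme to show the idea, not Baker Street's actual weights:

```python
def weighted_score(similarity, category, weights):
    """Scale a raw cosine similarity by a per-category weight.
    Categories absent from the weight map keep their raw score."""
    return similarity * weights.get(category, 1.0)

# Hypothetical: boost procedures during a deployment conversation,
# de-emphasize preferences.
deploy_weights = {"procedure": 1.5, "preference": 0.8}
```

A `procedure` memory at similarity 0.8 would then outrank a `preference` memory at 0.9, because 0.8 × 1.5 > 0.9 × 0.8.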

Memory Tools

Claude has direct access to memory through built-in tools:

  • memory_store -- Save a new memory with category and content
  • memory_search -- Search for memories by semantic similarity
  • memory_list -- List recent memories, optionally filtered by category

The agent decides autonomously when to store and retrieve memories based on conversation context. You can also explicitly ask: "Remember that my production cluster is on GKE" or "What do you remember about my setup?"
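A tool like memory_store would be declared to Claude with a JSON schema along these lines, following the Anthropic API's tool-definition format. The exact schema is Baker Street's own, so treat this as an illustrative sketch:

```python
# Hypothetical declaration of the memory_store tool in the
# Anthropic tool-use format (name/description/input_schema).
memory_store_tool = {
    "name": "memory_store",
    "description": "Save a new memory with a category and content.",
    "input_schema": {
        "type": "object",
        "properties": {
            "category": {
                "type": "string",
                "enum": ["conversation", "fact", "preference",
                         "procedure", "reference", "reflection"],
            },
            "content": {"type": "string"},
        },
        "required": ["category", "content"],
    },
}
```

Constraining `category` to the six known values lets the model's tool calls be validated before anything is written to Qdrant.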

Configuration

Key memory settings are configured via environment variables on the Brain:

QDRANT_URL: "http://qdrant:6333"
QDRANT_COLLECTION: "baker-memories"
VOYAGEAI_API_KEY: "pa-..."
MEMORY_SIMILARITY_THRESHOLD: "0.92"  # Deduplication threshold
MEMORY_SEARCH_LIMIT: "10"           # Max results per search
EMBEDDING_MODEL: "voyage-2"         # Voyage AI model
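Inside the Brain, these settings would be read from the environment with sensible defaults. `load_memory_config` is a hypothetical helper showing one way to do that in Python:

```python
import os

def load_memory_config():
    """Read the Brain's memory settings from environment variables,
    falling back to the documented defaults. Illustrative helper only."""
    return {
        "qdrant_url": os.environ.get("QDRANT_URL", "http://qdrant:6333"),
        "collection": os.environ.get("QDRANT_COLLECTION", "baker-memories"),
        "similarity_threshold": float(
            os.environ.get("MEMORY_SIMILARITY_THRESHOLD", "0.92")
        ),
        "search_limit": int(os.environ.get("MEMORY_SEARCH_LIMIT", "10")),
        "embedding_model": os.environ.get("EMBEDDING_MODEL", "voyage-2"),
    }
```

Note that the numeric settings arrive as strings and must be cast before use; the API key is read separately by the Voyage AI client and is deliberately omitted here.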

Memory in the Web UI

The Web UI includes a memory browser where you can view, search, and manage stored memories. You can see what the agent remembers, delete incorrect entries, and browse by category.