Observability

OpenTelemetry traces, Prometheus metrics, and structured logging for Baker Street.

Baker Street ships with an optional observability stack that deploys to a separate namespace. It shows how the agent reasons through each request, where time is spent, and what each service is doing.

Stack Components

The observability stack includes:

Component	Role
OpenTelemetry Collector	Receives OTLP from all services
Tempo	Distributed trace storage
Loki	Log aggregation
Prometheus	Metrics collection
Grafana	Dashboards and visualization

Deploy the stack with:

kubectl apply -k k8s/overlays/observability

Distributed Tracing

Every API response includes an X-Trace-Id header. Trace context propagates through NATS messages, so a single user request can be traced across:

User Request --> Brain --> NATS --> Worker --> Tool Execution
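
The propagation step can be illustrated with a minimal sketch. It assumes the trace context travels in message headers using the W3C Trace Context `traceparent` format, which is what OpenTelemetry uses by default; the source does not say which format Baker Street uses. The helper names (`inject`, `extract`) and the Brain/Worker roles in the comments are hypothetical, not Baker Street's actual API.

```python
import re
import secrets

# W3C Trace Context header: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_trace_id() -> str:
    """Return a random 16-byte trace ID as 32 hex characters."""
    return secrets.token_hex(16)


def new_span_id() -> str:
    """Return a random 8-byte span ID as 16 hex characters."""
    return secrets.token_hex(8)


def inject(headers: dict, trace_id: str, span_id: str, sampled: bool = True) -> dict:
    """Return a copy of the headers with a traceparent entry added."""
    out = dict(headers)
    out["traceparent"] = f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"
    return out


def extract(headers: dict):
    """Return (trace_id, parent_span_id), or None if the header is absent or malformed."""
    match = _TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    return match.group(1), match.group(2)


# Publisher side (hypothetical Brain): attach the context before publishing.
trace_id, brain_span_id = new_trace_id(), new_span_id()
message_headers = inject({}, trace_id, brain_span_id)

# Consumer side (hypothetical Worker): continue the same trace under a new span.
worker_trace_id, parent_span_id = extract(message_headers)
worker_span_id = new_span_id()
```

Because the Worker reuses the trace ID and records the Brain's span ID as its parent, both services' spans end up in the same trace.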

LLM calls are instrumented as spans with metadata including:

Model name and version
Token usage (input/output)
Tool call names and iteration count
Response latency

Tool executions appear as child spans. You can see exactly how the agent reasoned through a request: which memories it retrieved, which tools it called, how many iterations it needed, and where time was spent.
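
To make the span hierarchy concrete, here is a small sketch of how a trace could be modeled and queried for "where time was spent". The span names, attribute keys, timings, and the choice of a request span as the common parent are illustrative assumptions; the actual names Baker Street emits may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    start_ms: int
    end_ms: int
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_time_ms(span: Span) -> int:
    """Time spent in the span itself, excluding direct children.

    Assumes the children do not overlap in time.
    """
    return span.duration_ms - sum(child.duration_ms for child in span.children)


def walk(span: Span, depth: int = 0):
    """Yield (depth, span) pairs in depth-first order."""
    yield depth, span
    for child in span.children:
        yield from walk(child, depth + 1)


# Hypothetical trace: one request with two LLM calls and one tool execution.
root = Span("request", 0, 1000, children=[
    Span("llm.call", 0, 400, {"model": "example-model", "tokens.input": 900, "tokens.output": 60}),
    Span("tool.memory_search", 400, 442),
    Span("llm.call", 442, 900, {"model": "example-model", "tokens.input": 1100, "tokens.output": 200}),
])

slowest = max(root.children, key=lambda s: s.duration_ms)
outline = [f"{'  ' * depth}{s.name} ({s.duration_ms} ms)" for depth, s in walk(root)]
```

Here `slowest` is the second LLM call, and `self_time_ms(root)` is the time not accounted for by any child span.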

Prometheus Metrics

Both the Brain and Worker services expose Prometheus metrics on a /metrics endpoint:

baker_requests_total -- total API requests by endpoint and status
baker_llm_calls_total -- LLM API calls by model and outcome
baker_llm_tokens_total -- token usage by model and direction (input/output)
baker_jobs_total -- jobs dispatched by type and status
baker_memory_operations_total -- memory store/search/delete operations
baker_extension_tools_active -- currently registered extension tools
baker_task_pods_active -- currently running task pods
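
As a rough illustration of what a scrape of /metrics might contain, the sketch below parses a sample in the Prometheus text exposition format and totals token usage by direction. The label names (`model`, `direction`), the HELP text, and all sample values are assumptions made for the example; only the metric names come from the list above. The parser handles only simple lines and does not unescape label values.

```python
import re
from collections import defaultdict

# One sample line: metric name, optional {labels}, then the value.
_SAMPLE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)")
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> list:
    """Parse exposition text into (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical scrape output; label names and values are illustrative only.
SCRAPE = """\
# HELP baker_llm_tokens_total Token usage by model and direction.
# TYPE baker_llm_tokens_total counter
baker_llm_tokens_total{model="model-a",direction="input"} 1200
baker_llm_tokens_total{model="model-a",direction="output"} 300
baker_llm_tokens_total{model="model-b",direction="input"} 500
baker_task_pods_active 2
"""

tokens_by_direction = defaultdict(float)
for name, labels, value in parse_metrics(SCRAPE):
    if name == "baker_llm_tokens_total":
        tokens_by_direction[labels["direction"]] += value
```

In practice Prometheus scrapes and aggregates these series itself; parsing by hand like this is mainly useful for quick checks against a single pod.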

Structured Logging

All services emit structured JSON logs with consistent fields:

{
  "level": "info",
  "service": "brain",
  "traceId": "abc123...",
  "spanId": "def456...",
  "msg": "Tool call completed",
  "tool": "memory_search",
  "duration_ms": 42
}

Loki collects the logs, and the shared traceId field correlates them with traces: click a trace in Grafana to see its logs inline.
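
The following sketch shows one way a service could emit log lines in the shape shown above and how lines can then be matched to a trace by traceId. It uses Python's standard `logging` module purely for illustration; the source does not say which logging library Baker Street uses, and `JsonFormatter`, `logs_for_trace`, and the ID values are hypothetical placeholders.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object with consistent fields."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname.lower(),
            "service": self.service,
            "traceId": getattr(record, "traceId", None),
            "spanId": getattr(record, "spanId", None),
            "msg": record.getMessage(),
        }
        # Optional fields are supplied by the caller through `extra=`.
        for key in ("tool", "duration_ms"):
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)


def logs_for_trace(lines, trace_id: str) -> list:
    """Return the parsed log entries whose traceId matches."""
    return [e for e in map(json.loads, lines) if e.get("traceId") == trace_id]


# Write to an in-memory stream so the example is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter("brain"))
logger = logging.getLogger("baker.example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# Placeholder IDs; real values would come from the active span.
logger.info("Tool call completed", extra={
    "traceId": "trace-a", "spanId": "span-a",
    "tool": "memory_search", "duration_ms": 42,
})
logger.info("Unrelated event", extra={"traceId": "trace-b", "spanId": "span-b"})

lines = stream.getvalue().splitlines()
matching = logs_for_trace(lines, "trace-a")
```

Filtering on the shared traceId field is the same join that Grafana performs when it shows a trace's logs inline.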

Grafana Dashboards

Baker Street ships with pre-built Grafana dashboards:

Agent Overview -- request rate, response latency, error rate, active conversations
LLM Usage -- model distribution, token consumption, cost estimation, latency percentiles
Jobs and Workers -- job throughput, queue depth, worker utilization, failure rates
Memory -- store/search rates, deduplication hits, Qdrant collection stats
Extensions -- active extensions, tool call frequency, extension latency

Access Grafana by port-forwarding:

kubectl port-forward -n observability svc/grafana 3000:3000