Observability
OpenTelemetry traces, Prometheus metrics, and structured logging for Baker Street.
Baker Street ships with an optional observability stack that deploys to a separate namespace. Traces show how the agent reasons and where time is spent, metrics track request, LLM, and job volume, and structured logs record what each service is doing.
Stack Components
The observability stack includes:
| Component | Role |
|---|---|
| OpenTelemetry Collector | Receives OTLP from all services |
| Tempo | Distributed trace storage |
| Loki | Log aggregation |
| Prometheus | Metrics collection |
| Grafana | Dashboards and visualization |
Deploy the stack with:
kubectl apply -k k8s/overlays/observability
Distributed Tracing
Every API response includes an X-Trace-Id header. Trace context propagates through NATS messages, so a single user request can be traced across:
User Request --> Brain --> NATS --> Worker --> Tool Execution
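To make the propagation step concrete, here is a minimal sketch of carrying trace context in message headers. It assumes the W3C Trace Context `traceparent` format, which is what OpenTelemetry propagators use by default; whether Baker Street uses exactly this header on its NATS messages is an assumption, and the helper names `inject` and `extract` are hypothetical.

```python
import re
import secrets

# W3C Trace Context header layout: version-traceid-parentid-flags (lowercase hex).
# ASSUMPTION: Baker Street's actual header name and format may differ.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_trace_id() -> str:
    return secrets.token_hex(16)  # 32 hex characters


def new_span_id() -> str:
    return secrets.token_hex(8)  # 16 hex characters


def inject(headers: dict, trace_id: str, span_id: str, sampled: bool = True) -> dict:
    """Return a copy of the message headers carrying the current trace context."""
    flags = "01" if sampled else "00"
    return {**headers, "traceparent": f"00-{trace_id}-{span_id}-{flags}"}


def extract(headers: dict):
    """Return (trace_id, parent_span_id), or None if the header is absent or malformed."""
    match = TRACEPARENT_RE.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, parent_id, _flags = match.groups()
    # All-zero IDs are invalid under the W3C specification.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return trace_id, parent_id


# Publisher side (for example, the Brain): attach context before publishing.
trace_id, brain_span_id = new_trace_id(), new_span_id()
message_headers = inject({}, trace_id, brain_span_id)

# Subscriber side (for example, a Worker): continue the same trace.
context = extract(message_headers)
```

The subscriber starts its own span with the extracted trace ID and uses the extracted span ID as the parent, which is what links both services into a single trace.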
LLM calls are instrumented as spans with metadata including:
- Model name and version
- Token usage (input/output)
- Tool call names and iteration count
- Response latency
Tool executions appear as child spans. You can see exactly how the agent reasoned through a request: which memories it retrieved, which tools it called, how many iterations it needed, and where time was spent.
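As a rough illustration of reading "where time was spent" from a trace, the sketch below computes each span's self time: its duration minus the time covered by its direct children. The `Span` class, the span names, and the timings are hypothetical, and the calculation assumes child spans do not overlap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_times(spans):
    """Map span name -> time spent in the span itself, excluding direct children.

    Assumes span names are unique and sibling spans do not overlap in time.
    """
    child_total = {}
    for span in spans:
        if span.parent_id is not None:
            child_total[span.parent_id] = child_total.get(span.parent_id, 0) + span.duration_ms
    return {s.name: s.duration_ms - child_total.get(s.span_id, 0) for s in spans}


# Hypothetical trace: one request with an LLM call and a tool execution as children.
spans = [
    Span("1", None, "request", 0, 1000),
    Span("2", "1", "llm.call", 50, 650),
    Span("3", "1", "tool.memory_search", 700, 742),
]
breakdown = self_times(spans)
# breakdown == {"request": 358, "llm.call": 600, "tool.memory_search": 42}
```

In this made-up trace the LLM call accounts for most of the request's latency, which is the kind of conclusion the span view in Grafana lets you draw visually.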
Prometheus Metrics
Baker Street exports metrics on a /metrics endpoint from both Brain and Worker services:
- baker_requests_total -- total API requests by endpoint and status
- baker_llm_calls_total -- LLM API calls by model and outcome
- baker_llm_tokens_total -- token usage by model and direction (input/output)
- baker_jobs_total -- jobs dispatched by type and status
- baker_memory_operations_total -- memory store/search/delete operations
- baker_extension_tools_active -- currently registered extension tools
- baker_task_pods_active -- currently running task pods
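For a quick look at these metrics without Prometheus, a scraped /metrics payload can be summarized directly. The sketch below parses the Prometheus text exposition format and sums one metric by one label. The label names (`model`, `direction`), the model names, and the sample values are assumptions made for illustration; the parser is deliberately simplified and ignores optional timestamps and escaped quotes in label values.

```python
import re
from collections import defaultdict

# Matches `name{labels} value` or `name value` (no timestamp support).
SAMPLE_RE = re.compile(r"^(\w+)(?:\{(.*)\})?\s+(\S+)$")
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def sum_by_label(text: str, metric: str, label: str) -> dict:
    """Sum the samples of `metric` grouped by the value of `label`."""
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None or match.group(1) != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group(2) or ""))
        totals[labels.get(label, "")] += float(match.group(3))
    return dict(totals)


# Hypothetical scrape output; label names and values are illustrative only.
SAMPLE = """\
# TYPE baker_llm_tokens_total counter
baker_llm_tokens_total{model="model-a",direction="input"} 1200
baker_llm_tokens_total{model="model-a",direction="output"} 300
baker_llm_tokens_total{model="model-b",direction="input"} 500
baker_task_pods_active 2
"""

tokens_by_direction = sum_by_label(SAMPLE, "baker_llm_tokens_total", "direction")
# tokens_by_direction == {"input": 1700.0, "output": 300.0}
```

In a running stack the same aggregation is normally expressed as a PromQL query in Grafana; the script is only useful for ad hoc checks against a single service.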
Structured Logging
All services emit structured JSON logs with consistent fields:
{
  "level": "info",
  "service": "brain",
  "traceId": "abc123...",
  "spanId": "def456...",
  "msg": "Tool call completed",
  "tool": "memory_search",
  "duration_ms": 42
}
Logs are collected by Loki and correlated with traces using the shared traceId field. Click a trace in Grafana and see the corresponding logs inline.
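The log shape and the traceId correlation can be sketched in a few lines. This example uses Python's standard `logging` module purely for illustration; it is not Baker Street's logger, and the `JsonFormatter` class, the `fields` attribute, and the `logs_for_trace` helper are hypothetical names. The filter at the end mimics, in a very reduced form, what Grafana does when it looks up Loki entries for a trace.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render log records as one JSON object per line, using the fields shown above."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname.lower(),
            "service": self.service,
            "traceId": getattr(record, "traceId", None),
            "spanId": getattr(record, "spanId", None),
            "msg": record.getMessage(),
        }
        # Extra per-event fields such as "tool" or "duration_ms".
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def logs_for_trace(lines, trace_id: str):
    """Return the parsed log entries whose traceId matches, skipping non-JSON lines."""
    matches = []
    for raw in lines:
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if isinstance(entry, dict) and entry.get("traceId") == trace_id:
            matches.append(entry)
    return matches


# Build one record by hand so the example runs without configuring handlers.
record = logging.makeLogRecord({
    "levelname": "INFO",
    "msg": "Tool call completed",
    "traceId": "abc123",
    "spanId": "def456",
    "fields": {"tool": "memory_search", "duration_ms": 42},
})
line = JsonFormatter("brain").format(record)
```

With a configured logger, the same fields would be passed through `extra=`, for example `logger.info("Tool call completed", extra={"traceId": ..., "spanId": ..., "fields": {...}})`.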
Grafana Dashboards
Baker Street ships with pre-built Grafana dashboards:
- Agent Overview -- request rate, response latency, error rate, active conversations
- LLM Usage -- model distribution, token consumption, cost estimation, latency percentiles
- Jobs and Workers -- job throughput, queue depth, worker utilization, failure rates
- Memory -- store/search rates, deduplication hits, Qdrant collection stats
- Extensions -- active extensions, tool call frequency, extension latency
To access Grafana, port-forward its service and open http://localhost:3000 in a browser:
kubectl port-forward -n observability svc/grafana 3000:3000