Runtime Observability
alquimia-runtime emits OpenTelemetry traces and logs. Both are disabled by default and activated by setting environment variables before startup.
For the metrics signal (emitted by alquimia-core), see Core SDK Observability.
For the full picture of how all three signals relate, see Observability.
setup_telemetry()
    def setup_telemetry() -> None

Initialises the OTEL TracerProvider and LoggerProvider. It is called automatically in the FastAPI lifespan startup, so you do not call it manually when running alquimia-runtime.
- Idempotent: subsequent calls are no-ops if telemetry is already initialised.
- Deferred: called in the lifespan, not at import time. This prevents OTEL from making network connections during test collection (QUAL-004).
- Conditional: if OTEL_COLLECTOR_ENDPOINT_TRACES is not set, the TracerProvider is still created (so spans are valid objects) but no BatchSpanProcessor is attached, so spans are discarded. If OTEL_COLLECTOR_ENDPOINT_LOGS is also not set, no LoggerProvider is created and the loguru→OTEL bridge is not installed.
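The three properties above can be illustrated with a small stand-alone sketch. It is not the runtime's implementation: `setup_telemetry_sketch`, `_state`, and the returned dictionary are hypothetical, and the real provider and exporter objects are replaced by booleans so the example runs without OpenTelemetry installed.

```python
from typing import Mapping, Optional

# Hypothetical module-level state; the real runtime keeps its own.
_state: Optional[dict] = None


def setup_telemetry_sketch(env: Mapping[str, str]) -> dict:
    """Describe what would be initialised for the given environment."""
    global _state
    if _state is not None:
        # Idempotent: a later call leaves the existing state untouched.
        return _state

    traces = env.get("OTEL_COLLECTOR_ENDPOINT_TRACES") or None
    # The logs endpoint falls back to the traces endpoint when unset.
    logs = env.get("OTEL_COLLECTOR_ENDPOINT_LOGS") or traces

    _state = {
        "tracer_provider": True,               # always created, so spans are valid objects
        "span_processor": traces is not None,  # without it, spans are discarded
        "logger_provider": logs is not None,   # also gates the loguru bridge
    }
    return _state


first = setup_telemetry_sketch({})
second = setup_telemetry_sketch(
    {"OTEL_COLLECTOR_ENDPOINT_TRACES": "http://otel-collector:4318/v1/traces"}
)
print(second is first)  # True: the second call was a no-op
```

Because the function only runs when called, importing the module has no side effects, which is the point of deferring the real call to the lifespan.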
Environment variables
| Variable | Default | Description |
|---|---|---|
| OTEL_COLLECTOR_ENDPOINT_TRACES | (none) | OTLP HTTP endpoint for traces. Traces are created but discarded if unset. |
| OTEL_COLLECTOR_ENDPOINT_LOGS | (same as traces) | OTLP HTTP endpoint for logs. Defaults to OTEL_COLLECTOR_ENDPOINT_TRACES. |
| OTEL_ALQUIMIA_SERVICE_NAME | alquimia | service.name resource attribute on all spans and log records |
The two-endpoint split
Traces and logs are sent to separate OTEL endpoints. This is intentional: it lets you route them to different collector pipelines.

    OTEL_COLLECTOR_ENDPOINT_TRACES → Jaeger / Tempo (trace backend)
    OTEL_COLLECTOR_ENDPOINT_LOGS   → Loki / OpenSearch (log backend)

If you use a single OpenTelemetry Collector that handles both, point both variables at that collector, each with its own signal path:

    OTEL_COLLECTOR_ENDPOINT_TRACES=http://otel-collector:4318/v1/traces
    OTEL_COLLECTOR_ENDPOINT_LOGS=http://otel-collector:4318/v1/logs

Traces
FastAPIInstrumentor
FastAPIInstrumentor.instrument_app(app) is called once after the FastAPI app is created. It automatically creates an OTEL span for every HTTP request:
- Span name: {HTTP_METHOD} {route_template} (e.g., POST /event/infer/{assistant_id})
- Span attributes: HTTP method, URL, status code, route template
- Span lifecycle: starts when the request enters the ASGI stack, ends when the response is sent
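As a minimal illustration of the naming rule, the helper below builds a span name from a method and a route template. `span_name` is a hypothetical function written for this page; the instrumentor derives the name internally. Using the template rather than the concrete URL means requests for different assistants share one span name.

```python
def span_name(method: str, route_template: str) -> str:
    """Build a span name from the HTTP method and the route template."""
    return f"{method.upper()} {route_template}"


name = span_name("post", "/event/infer/{assistant_id}")
print(name)  # POST /event/infer/{assistant_id}
```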
No configuration is required. All routers — public and internal — are instrumented automatically.
CloudEvent spans (finalize_span middleware)
Internal CloudEvent requests (to /controller/process, /models/response, /tools/process, etc.) get an additional span that covers the full handler execution, not just the HTTP response time.
The finalize_span middleware is registered as the outermost middleware in the stack (Starlette reverses registration order, so the last-registered middleware runs first). This is a correctness constraint — do not reorder the middleware without understanding the span lifecycle.
    # Execution order (outermost → innermost):
    # finalize_span → request_id_middleware → telemetry_metadata_middleware → CORSMiddleware → handler

The middleware calls cloudevent_interceptor(request) to start a span for CloudEvent requests, then calls span.end() after the full response is returned. For non-CloudEvent requests, it is a pass-through.
Trace context propagation
Trace context is propagated via the W3C traceparent header using TraceContextTextMapPropagator. When the master publishes a CloudEvent to Kafka, it injects the current trace context into the CloudEvent headers. The worker propagates it through the dispatch chain.
This means a single inference run — which involves multiple CloudEvent hops — produces a connected trace tree in your backend, with parent-child relationships between spans.
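To make the header concrete, here is a hand-rolled sketch of the version-00 traceparent format (`00-<32 hex trace ID>-<16 hex span ID>-<2 hex flags>`). The runtime relies on TraceContextTextMapPropagator rather than code like this; `build_traceparent`, `parse_traceparent`, and `child_header` are hypothetical helpers, and the sketch skips parts of the W3C rules such as rejecting all-zero IDs.

```python
import re
import secrets
from typing import Optional, Tuple

_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def build_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Serialise a trace ID and span ID into a version-00 header value."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(value: str) -> Optional[Tuple[str, str, str]]:
    """Return (trace_id, span_id, flags), or None if the value is malformed."""
    match = _TRACEPARENT.match(value)
    return match.groups() if match else None


def child_header(parent_header: str) -> str:
    """Header for the next hop: same trace ID, freshly generated span ID."""
    parsed = parse_traceparent(parent_header)
    if parsed is None:
        raise ValueError("not a version-00 traceparent header")
    trace_id, _parent_span_id, flags = parsed
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


root = build_traceparent("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
hop = child_header(root)
print(root)  # 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
```

Every hop keeps the trace ID and replaces the span ID, which is what lets a backend assemble the hops into one tree.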
Loguru → OTEL bridge
When OTEL_COLLECTOR_ENDPOINT_LOGS is set, setup_loguru_otel_sink() installs a custom loguru sink that forwards every log record to the OTEL LoggerProvider.
Each OTEL log record carries:
Source location:
| Attribute | Value |
|---|---|
| log.source.name | Logger name (Python module) |
| code.filepath | Absolute path to the source file |
| code.lineno | Line number |
| code.function | Function name |
Span context (when a span is active):
| Attribute | Value |
|---|---|
| trace_id | Active span’s trace ID |
| span_id | Active span’s span ID |
| trace_flags | Active span’s trace flags |
This enables log-to-trace correlation in backends like Grafana — click a log line to jump to the trace, or click a span to see its logs.
CommonAttributes (when a request is in flight):
All five CommonAttributes fields (assistant_id, agentspace_id, user_id, session_id, channel_id) are attached to every log record emitted during a request. This enables filtering logs by assistant_id or session_id without any manual enrichment.
Exception info (when the log record includes an exception):
| Attribute | Value |
|---|---|
| exception.type | Exception class name |
| exception.message | Exception message string |
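A minimal sketch of how the two exception attributes could be derived from a Python exception. `exception_attributes` is a hypothetical helper, and using the unqualified class name is an assumption; the bridge may format the type differently.

```python
def exception_attributes(exc: BaseException) -> dict:
    """Map an exception to the two attributes listed in the table above."""
    return {
        "exception.type": type(exc).__name__,  # assumption: unqualified class name
        "exception.message": str(exc),
    }


try:
    int("not-a-number")
except ValueError as err:
    attrs = exception_attributes(err)

print(attrs["exception.type"])  # ValueError
```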
Log level mapping
| Loguru level | OTEL severity number | OTEL severity text |
|---|---|---|
| TRACE | 1 | TRACE |
| DEBUG | 5 | DEBUG |
| INFO | 9 | INFO |
| SUCCESS | 9 | INFO |
| WARNING | 13 | WARN |
| ERROR | 17 | ERROR |
| CRITICAL | 21 | FATAL |
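The table translates directly into a lookup. The sketch below copies its values; `to_otel_severity` is a hypothetical name, and falling back to INFO for unknown level names is an assumption of this example rather than documented behaviour.

```python
from typing import Tuple

# Loguru level name -> (OTEL severity number, OTEL severity text), copied from the table.
SEVERITY_BY_LEVEL = {
    "TRACE": (1, "TRACE"),
    "DEBUG": (5, "DEBUG"),
    "INFO": (9, "INFO"),
    "SUCCESS": (9, "INFO"),  # OTEL has no SUCCESS severity, so it shares INFO
    "WARNING": (13, "WARN"),
    "ERROR": (17, "ERROR"),
    "CRITICAL": (21, "FATAL"),
}


def to_otel_severity(level_name: str) -> Tuple[int, str]:
    """Look up the OTEL severity for a loguru level name."""
    # Assumption: unknown (custom) levels are treated as INFO.
    return SEVERITY_BY_LEVEL.get(level_name.upper(), (9, "INFO"))


print(to_otel_severity("SUCCESS"))   # (9, 'INFO')
print(to_otel_severity("critical"))  # (21, 'FATAL')
```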
Console output
Regardless of whether OTEL logs are enabled, the runtime always writes structured logs to stdout:

    2025-07-15 12:00:00 | INFO | routers.event:infer:42 - Inference request received

Format: {time} | {level} | {name}:{function}:{line} - {message}
Log level is controlled by LOGGING_LEVEL (default 20 = INFO). Set DEBUG=true to switch to DEBUG level.
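One way the two settings could combine is sketched below. The precedence (DEBUG=true wins over LOGGING_LEVEL) and the accepted spelling of "true" are assumptions, and `effective_log_level` is a hypothetical helper, not a runtime function. The numeric values follow the usual scale where 10 is DEBUG and 20 is INFO.

```python
from typing import Mapping


def effective_log_level(env: Mapping[str, str]) -> int:
    """Resolve the numeric log level from environment-style settings."""
    # Assumption: DEBUG=true takes precedence over LOGGING_LEVEL.
    if env.get("DEBUG", "").strip().lower() == "true":
        return 10  # DEBUG
    return int(env.get("LOGGING_LEVEL", "20"))  # default 20 = INFO


print(effective_log_level({}))                       # 20
print(effective_log_level({"DEBUG": "true"}))        # 10
print(effective_log_level({"LOGGING_LEVEL": "30"}))  # 30
```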
telemetry_metadata_middleware
This middleware extracts CommonAttributes from incoming request headers and stores them in a ContextVar for the duration of the request:

    async def telemetry_metadata_middleware(request: Request, call_next):
        clear_runtime_telemetry_metadata()
        context_metadata = CommonAttributes.model_validate(request.headers)
        if context_metadata:
            set_runtime_telemetry_metadata(context_metadata)
        try:
            return await call_next(request)
        finally:
            clear_runtime_telemetry_metadata()

The stored CommonAttributes are used by:
- The loguru→OTEL bridge: attached to every log record
- alquimia-core’s observe(): used as metric dimensions
The ContextVar is cleared after every request, so there is no cross-request leakage.
Reading CommonAttributes in handlers
    from telemetry import get_runtime_telemetry_metadata

    context_metadata = get_runtime_telemetry_metadata()
    if context_metadata:
        print(context_metadata.assistant_id)

X-Request-ID middleware
Every request gets an X-Request-ID header (OBS-001):

    @app.middleware("http")
    async def request_id_middleware(request: Request, call_next):
        request_id = request.headers.get("X-Request-ID") or str(uuid.uuid4())
        request.state.request_id = request_id
        response = await call_next(request)
        response.headers["X-Request-ID"] = request_id
        return response

- If the client provides X-Request-ID, it is propagated unchanged.
- If not, a UUID v4 is generated.
- The ID is always echoed back in the response header.
- It is attached to request.state.request_id for use in handlers.
X-Request-ID is per-HTTP-request. Use it to correlate logs for a single HTTP call. Use task_id to correlate everything across an entire inference run (multiple HTTP calls + CloudEvent hops).
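The difference between the two identifiers shows up when grouping log records. The records and field names below (`request_id`, `task_id`, `message`) are invented for illustration and do not reflect the runtime's actual log schema.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical log records: one inference run (t-9) spans three HTTP calls.
records = [
    {"request_id": "r-1", "task_id": "t-9", "message": "Inference request received"},
    {"request_id": "r-2", "task_id": "t-9", "message": "Controller processed event"},
    {"request_id": "r-3", "task_id": "t-9", "message": "Model response handled"},
    {"request_id": "r-4", "task_id": "t-10", "message": "Inference request received"},
]


def group_by(key: str, rows: List[dict]) -> Dict[str, List[str]]:
    """Collect log messages under the value of the given correlation key."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row["message"])
    return dict(grouped)


by_request = group_by("request_id", records)
by_task = group_by("task_id", records)
print(len(by_request["r-1"]))  # 1: a single HTTP call
print(len(by_task["t-9"]))     # 3: the whole inference run
```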
Startup warnings
The runtime emits non-fatal warnings at startup for common misconfigurations (OBS-002, OBS-003). These appear in logs as [STARTUP CONFIG] entries and do not prevent the service from starting.
| Code | Condition | Log message |
|---|---|---|
| OBS-002 | VAULT_TOKEN is empty | OBS-002: VAULT_TOKEN is not set. Registry secret resolution via Vault will fail. |
| OBS-002 | API_TOKEN is a default/test value | OBS-002: API_TOKEN appears to be a default/test value. |
| OBS-003 | Any S3 blob setting is empty | OBS-003: Blob S3 storage is not fully configured. Missing/empty: [fields]. |
These are intentionally non-fatal. The goal is to surface misconfiguration early in logs rather than at first request time.
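A check of this kind could look roughly like the sketch below. The S3 setting names in `S3_FIELDS`, the placeholder tokens in `DEFAULT_TOKENS`, and the exact rendering of the missing-field list are all assumptions made for the example; only the warning texts come from the table. The function returns the warnings instead of raising, which mirrors the non-fatal intent.

```python
from typing import List, Mapping

# Hypothetical setting names and placeholder values; the runtime's may differ.
S3_FIELDS = ("BLOB_S3_ENDPOINT", "BLOB_S3_BUCKET", "BLOB_S3_ACCESS_KEY", "BLOB_S3_SECRET_KEY")
DEFAULT_TOKENS = {"changeme", "test", "default"}


def startup_warnings(settings: Mapping[str, str]) -> List[str]:
    """Collect non-fatal configuration warnings; never raises."""
    warnings = []
    if not settings.get("VAULT_TOKEN"):
        warnings.append(
            "OBS-002: VAULT_TOKEN is not set. Registry secret resolution via Vault will fail."
        )
    if settings.get("API_TOKEN", "") in DEFAULT_TOKENS:
        warnings.append("OBS-002: API_TOKEN appears to be a default/test value.")
    missing = [name for name in S3_FIELDS if not settings.get(name)]
    if missing:
        warnings.append(
            f"OBS-003: Blob S3 storage is not fully configured. Missing/empty: {missing}."
        )
    return warnings


for line in startup_warnings({"API_TOKEN": "changeme"}):
    print(f"[STARTUP CONFIG] {line}")
```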
Middleware execution order
Starlette reverses middleware registration order. The actual execution order (outermost → innermost) is:
1. finalize_span: starts/ends CloudEvent spans
2. request_id_middleware: generates/propagates X-Request-ID
3. telemetry_metadata_middleware: extracts CommonAttributes into ContextVar
4. CORSMiddleware: handles CORS preflight
5. Route handler: the actual endpoint logic

This order is a correctness constraint. finalize_span must be outermost so it covers the full request lifecycle including all other middleware. telemetry_metadata_middleware must run before the handler so CommonAttributes are available when the handler calls alquimia-core.
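The reversal can be reproduced without Starlette. In this sketch each registration wraps everything registered before it, so the last one added ends up outermost. `build_stack` and `route_handler` are hypothetical, and the registration order in `registered` is inferred from the execution order listed above, not read from main.py.

```python
from typing import Callable, List

Handler = Callable[[List[str]], None]


def build_stack(registered: List[str], handler: Handler) -> Handler:
    """Wrap `handler` so the last-registered middleware ends up outermost."""
    app = handler
    for name in registered:  # each new registration wraps everything before it
        def layer(trace: List[str], _name: str = name, _inner: Handler = app) -> None:
            trace.append(f"enter {_name}")
            _inner(trace)
            trace.append(f"exit {_name}")
        app = layer
    return app


def route_handler(trace: List[str]) -> None:
    trace.append("handler")


# Assumed registration order: innermost first, finalize_span last.
registered = [
    "CORSMiddleware",
    "telemetry_metadata_middleware",
    "request_id_middleware",
    "finalize_span",
]
trace: List[str] = []
build_stack(registered, route_handler)(trace)
print(trace[0])   # enter finalize_span
print(trace[-1])  # exit finalize_span
```

Because finalize_span both enters first and exits last, a span it opens brackets every other middleware as well as the handler.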
Related pages
- Observability: the full observability model
- Core SDK Observability: metrics reference
- Configuration Reference: all OTEL environment variables
Source
Alquimia-ai/alquimia-runtime: runtime/src/telemetry.py, runtime/src/main.py