
Core SDK Observability

alquimia-core emits OpenTelemetry metrics for every significant event in the agent execution loop. Metrics are disabled by default — call setup_observability() once at application startup to enable them.

from alquimia.core.observability import setup_observability

setup_observability(
    endpoint: str | None = None,
    interval_millis: int | None = None,
    meter_name: str | None = None,
    service_name: str | None = None,
) -> None

Initialises the global OTEL MeterProvider. Call this once at application startup, not at import time.

  • Idempotent — subsequent calls are no-ops if the meter is already configured.
  • No-op if disabled — if endpoint is None and OTEL_COLLECTOR_ENDPOINT is not set, the function returns immediately without configuring anything.
  • No import-time side effects — importing alquimia.core.observability does not connect to any collector or mutate global state.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| endpoint | str \| None | OTEL_COLLECTOR_ENDPOINT env var | OTLP HTTP endpoint for metrics (e.g., http://otel-collector:4318/v1/metrics) |
| interval_millis | int \| None | OTEL_EXPORTER_INTERVAL_MILLIS env var (default 5000) | Export interval in milliseconds |
| meter_name | str \| None | OTEL_ALQUIMIA_METER_NAME env var (default alquimia-metrics) | Meter name in the collector |
| service_name | str \| None | OTEL_ALQUIMIA_SERVICE_NAME env var (default alquimia) | service.name resource attribute |

All parameters default to environment variables read at module import time:

| Variable | Default | Description |
| --- | --- | --- |
| OTEL_COLLECTOR_ENDPOINT | | Metrics endpoint. Metrics are disabled if this is not set. |
| OTEL_EXPORTER_INTERVAL_MILLIS | 5000 | Export interval in milliseconds |
| OTEL_ALQUIMIA_METER_NAME | alquimia-metrics | Meter name |
| OTEL_ALQUIMIA_SERVICE_NAME | alquimia | service.name resource attribute |
| OTEL_EXCLUDED_ATTRIBUTES | "" | Comma-separated dimension keys to strip before export |

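
The precedence implied by the two tables above (explicit argument, then environment variable, then built-in default) can be sketched as follows. `resolve_setting` is a hypothetical helper, not part of alquimia-core, and it reads the environment at call time for simplicity, whereas the library is documented as reading it at module import time.

```python
import os


def resolve_setting(explicit, env_name, default=None):
    """Explicit argument wins, then the environment variable, then the default."""
    if explicit is not None:
        return explicit
    return os.environ.get(env_name, default)


def resolve_interval_millis(explicit=None):
    # Environment values are strings, so the interval is converted to int.
    return int(resolve_setting(explicit, "OTEL_EXPORTER_INTERVAL_MILLIS", "5000"))
```

For example, `resolve_interval_millis(10_000)` ignores the environment entirely, while `resolve_interval_millis()` falls back to the variable and then to 5000.
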
# main.py or app startup
from alquimia.core.observability import setup_observability

# Option 1: minimal. Reads all config from environment variables.
setup_observability()

# Option 2: explicit. Overrides specific parameters.
# Use one option or the other: setup_observability() is idempotent,
# so a second call after a successful setup is a no-op.
setup_observability(
    endpoint="http://otel-collector:4318/v1/metrics",
    interval_millis=10_000,
    service_name="my-alquimia-app",
)

When running via alquimia-runtime, setup_telemetry() is called automatically in the FastAPI lifespan. You still need to call setup_observability() separately to enable metrics — the runtime’s setup_telemetry() only configures traces and logs.


AlquimiaObserver is the class that holds all 18 metric instruments and dispatches observations based on event type. It is instantiated lazily on the first call to observe() after setup_observability() has been called.

You do not need to interact with AlquimiaObserver directly. The module-level observe() function handles dispatch:

from alquimia.core.observability import observe
from alquimia.core.base import CommonAttributes
# Called internally by alquimia-core — you do not call this manually
observe(event, context_metadata)

observe() is called automatically by the evaluate() loop for every command and response event. No additional wiring is required.


All metrics are exported under the alquimia-metrics meter (configurable via OTEL_ALQUIMIA_METER_NAME).

Every metric carries a subset of the following dimensions, derived from CommonAttributes:

| Dimension key | Source | Description |
| --- | --- | --- |
| assistant_id | CommonAttributes.assistant_id | The agent being invoked |
| agentspace_id | CommonAttributes.agentspace_id | The agentspace |
| user_id | CommonAttributes.user_id | The end user |
| session_id | CommonAttributes.session_id | The conversation session |
| channel_id | CommonAttributes.channel_id | The channel (optional) |

Token and latency metrics additionally carry:

| Dimension key | Source | Description |
| --- | --- | --- |
| model_name | ResponseMetadata.model_name | The LLM model that produced the response |

Tool and shield metrics additionally carry:

| Dimension key | Source | Description |
| --- | --- | --- |
| event_type | type(event).__name__ | The command class name (e.g., ServerToolExecution) |
| tool_name | event.name or event.tool_name | The tool or shield name |

Empathy metrics additionally carry:

| Dimension key | Source | Description |
| --- | --- | --- |
| rule_id | event.control_id | The empathy rule that matched |
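
To make the dimension tables above concrete, here is a minimal sketch of how a per-metric attribute set could be assembled from the common dimensions plus metric-specific extras. `SketchAttributes` is a hypothetical stand-in for CommonAttributes, `build_dimensions` is an invented helper, and dropping unset optional values such as channel_id is an assumption rather than documented behaviour.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass


@dataclass
class SketchAttributes:
    """Hypothetical stand-in for CommonAttributes."""

    assistant_id: str
    agentspace_id: str
    user_id: str
    session_id: str
    channel_id: str | None = None  # optional dimension


def build_dimensions(common: SketchAttributes, **extra) -> dict:
    """Merge the common dimensions with extras such as model_name or rule_id."""
    # Assumption: dimensions whose value is None are left out entirely.
    dims = {k: v for k, v in asdict(common).items() if v is not None}
    dims.update({k: v for k, v in extra.items() if v is not None})
    return dims
```

A token metric would then be recorded with something like `build_dimensions(common, model_name=...)`, and an empathy metric with `build_dimensions(common, rule_id=...)`.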

These counters increment on every successful ResponseInferenceResponse or ShieldInferenceResponse that carries token usage metadata.

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| alquimia.llm.completion_tokens | Counter | 1 | Tokens in the LLM’s response |
| alquimia.llm.prompt_tokens | Counter | 1 | Tokens in the prompt sent to the LLM |
| alquimia.llm.total_tokens | Counter | 1 | Total tokens (prompt + completion) |

Dimensions: standard + model_name

Note: If the LLM response does not include token usage metadata (some providers omit it), these counters are not incremented for that call. A DEBUG log is emitted: "Usage metadata not found. Skipping token consumption metrics."
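
The skip-on-missing-usage rule in the note above might look roughly like this. The sketch uses a plain `collections.Counter` in place of real OTEL instruments, and the shape of the `usage` dictionary (keys `prompt_tokens`, `completion_tokens`, `total_tokens`) is an assumption; only the metric names and the log message come from this page.

```python
import logging
from collections import Counter

logger = logging.getLogger("observability-sketch")
token_counters = Counter()  # stand-in for the three OTEL counters


def record_token_usage(usage) -> bool:
    """Increment the token counters; return False when nothing was recorded."""
    if usage is None:
        logger.debug("Usage metadata not found. Skipping token consumption metrics.")
        return False
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    total = usage.get("total_tokens", prompt + completion)
    token_counters["alquimia.llm.prompt_tokens"] += prompt
    token_counters["alquimia.llm.completion_tokens"] += completion
    token_counters["alquimia.llm.total_tokens"] += total
    return True
```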


These histograms record timing data from ResponseMetadata.token_usage on every successful LLM response.

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| alquimia.llm.completion_time | Histogram | s | Time to generate the completion |
| alquimia.llm.prompt_time | Histogram | s | Time to process the prompt |
| alquimia.llm.queue_time | Histogram | s | Time the request spent waiting in the LLM provider’s queue |
| alquimia.llm.total_time | Histogram | s | End-to-end request time (queue + prompt + completion) |

Dimensions: standard + model_name

Note: These are populated from provider-supplied metadata. Not all LLM providers return all timing fields. Fields that are None are skipped.
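
A minimal sketch of the "skip fields that are None" rule, assuming the timing values arrive in a dictionary keyed by field name. The lists stand in for OTEL histograms, and `record_timings` and `TIMING_FIELDS` are hypothetical names.

```python
from collections import defaultdict

TIMING_FIELDS = ("queue_time", "prompt_time", "completion_time", "total_time")
histograms = defaultdict(list)  # metric name -> recorded values, in seconds


def record_timings(usage: dict) -> list:
    """Record every timing field that has a value; return the fields recorded."""
    recorded = []
    for field in TIMING_FIELDS:
        value = usage.get(field)
        if value is None:
            continue  # the provider did not report this field
        histograms[f"alquimia.llm.{field}"].append(value)
        recorded.append(field)
    return recorded
```

One consequence of this behaviour is that the four histograms can have different sample counts for the same model, which is worth keeping in mind when comparing them on a dashboard.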


| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| alquimia.tool_invocations_total | Counter | 1 | Incremented when a tool execution command is emitted |
| alquimia.tool_errors_total | Counter | 1 | Incremented when a ToolExecutionResponse has status="error" |

Dimensions: standard + event_type + tool_name

event_type values for tool invocations:

| Value | Meaning |
| --- | --- |
| ServerToolExecution | MCP or Llama Stack tool call |
| BuiltinToolExecution | Built-in tool (plan_mode, etc.) |
| ClientToolExecution | Client-side tool call |
| A2AInference | Agent-to-agent delegation (counted as a tool invocation) |

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| alquimia.shield_invocations_total | Counter | 1 | Incremented when a ShieldInference command is emitted |
| alquimia.shield_errors_total | Counter | 1 | Incremented when a ShieldInferenceResponse has status="error" |

Dimensions: standard


| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| alquimia.response_invocations_total | Counter | 1 | Incremented when a ResponseInference command is emitted (i.e., each LLM call) |
| alquimia.response_errors_total | Counter | 1 | Incremented when a ResponseInferenceResponse has status="error" |

Dimensions: standard


| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| alquimia.empathy_rules_matched_by_rule_total | Counter | 1 | Incremented when an EmpathyRuleMatchedResponse is produced |

Dimensions: standard + rule_id

Use this metric to track which empathy rules fire most frequently. Group by rule_id and assistant_id to understand per-agent empathy behaviour.


| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| alquimia.assistant_executions_started_total | Counter | 1 | Incremented when an AssistantInference command is received |
| alquimia.assistant_executions_ended_total | Counter | 1 | Incremented when an AssistantInferenceResponse is produced |

Dimensions: standard

The difference started - ended gives the number of in-flight inference runs at any point in time. A persistent gap indicates stuck or timed-out executions.
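
The started-minus-ended reasoning above can be expressed as a short calculation over counter samples. This is a sketch: it assumes both counters are read as cumulative totals at the same instants, and `in_flight`, `has_persistent_gap`, and the zero clamp are illustrative choices, not library functions.

```python
def in_flight(started_total: int, ended_total: int) -> int:
    """Executions started but not yet ended, clamped at zero."""
    return max(started_total - ended_total, 0)


def has_persistent_gap(samples, threshold: int = 1) -> bool:
    """True when every (started, ended) sample shows at least `threshold` in flight.

    `samples` is a sequence of (started_total, ended_total) pairs over time.
    """
    return bool(samples) and all(in_flight(s, e) >= threshold for s, e in samples)
```

A gap that appears in one sample and closes in the next is ordinary concurrency; only a gap that never closes across the observation window suggests stuck or timed-out executions.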


Use the OTEL_EXCLUDED_ATTRIBUTES variable to strip specific dimension keys from all metrics before export. This is useful for privacy or compliance requirements where certain identifiers must not leave the application boundary.

# Strip user_id and session_id from all metric dimensions
OTEL_EXCLUDED_ATTRIBUTES=user_id,session_id

The filter is applied at the FilteredCounter and FilteredHistogram wrapper level — the stripped keys never reach the OTEL exporter. The filter applies to all metrics uniformly; per-metric filtering is not supported.

Note: OTEL_EXCLUDED_ATTRIBUTES applies only to metrics. It does not affect traces or logs. Use OTEL collector-level processors to filter those signals.
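
The wrapper-level filtering described above can be sketched as follows. `SketchFilteredCounter` illustrates the idea only and is not the library's FilteredCounter; `RecordingCounter` is a stand-in for an OTEL counter so the example runs without a collector, and `parse_excluded` is a hypothetical helper for the comma-separated format.

```python
def parse_excluded(raw: str) -> frozenset:
    """Split a comma-separated value into a set of keys, ignoring blanks."""
    return frozenset(key.strip() for key in raw.split(",") if key.strip())


class RecordingCounter:
    """Stand-in for an OTEL counter; remembers every add() call."""

    def __init__(self):
        self.calls = []

    def add(self, amount, attributes=None):
        self.calls.append((amount, dict(attributes or {})))


class SketchFilteredCounter:
    """Drops excluded keys before delegating to the wrapped counter."""

    def __init__(self, inner, excluded):
        self._inner = inner
        self._excluded = excluded

    def add(self, amount, attributes=None):
        kept = {k: v for k, v in (attributes or {}).items() if k not in self._excluded}
        self._inner.add(amount, kept)
```

Because the keys are removed before the wrapped instrument sees them, the same excluded set applies to every metric that goes through the wrapper, which matches the "no per-metric filtering" limitation noted above.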


To estimate per-agent LLM cost, group alquimia.llm.total_tokens by assistant_id and model_name. This gives per-agent token consumption, which maps directly to LLM API cost.
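
If the data is processed outside a dashboard, the grouping above amounts to a sum per (assistant_id, model_name) pair. The sample format below (a dictionary with `assistant_id`, `model_name`, and `value` keys) is an assumption about how exported data points might be represented, and `tokens_by_agent_and_model` is a hypothetical helper.

```python
from collections import defaultdict


def tokens_by_agent_and_model(samples) -> dict:
    """Sum alquimia.llm.total_tokens increments per (assistant_id, model_name)."""
    totals = defaultdict(int)
    for sample in samples:
        key = (sample["assistant_id"], sample["model_name"])
        totals[key] += sample["value"]
    return dict(totals)
```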

To compute the tool error rate, divide the error counter by the invocation counter:

alquimia.tool_errors_total / alquimia.tool_invocations_total

Group by tool_name to identify unreliable tools.
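
As a worked version of the ratio above, the sketch below computes an error rate per tool from two dictionaries of counter totals keyed by tool_name. The input shape and the `tool_error_rates` name are assumptions; tools with no invocations are left out to avoid dividing by zero.

```python
def tool_error_rates(invocations: dict, errors: dict) -> dict:
    """Return errors / invocations per tool, skipping tools never invoked."""
    rates = {}
    for tool, count in invocations.items():
        if count > 0:
            rates[tool] = errors.get(tool, 0) / count
    return rates
```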

To measure execution throughput, take the rate of alquimia.assistant_executions_ended_total grouped by assistant_id.

To compare LLM provider latency, use the alquimia.llm.total_time histogram with P50/P95/P99 percentiles grouped by model_name.
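
For intuition about the percentile comparison, the sketch below computes P50/P95/P99 from raw latency samples per model using the standard library. This is a simplification: a metrics backend normally estimates percentiles from histogram buckets rather than raw values, and `latency_percentiles` and the input dictionary are hypothetical.

```python
from statistics import quantiles


def latency_percentiles(samples_by_model: dict) -> dict:
    """Map each model_name to its P50/P95/P99 latency in seconds."""
    result = {}
    for model_name, samples in samples_by_model.items():
        if len(samples) < 2:
            continue  # too few samples to interpolate percentiles
        cuts = quantiles(samples, n=100)  # 99 cut points; cuts[i] is P(i + 1)
        result[model_name] = {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
    return result
```

Comparing models on P95 or P99 rather than the mean makes slow outliers visible, which is usually what matters for user-facing latency.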