Event Model

Alquimia’s execution engine is event-driven. Every action the agent takes — calling an LLM, executing a tool, flushing memory — is represented as a typed Pydantic model that is emitted, dispatched, and consumed by the controller.

Events are split into two categories:

Commands — intent to do something. Emitted by the controller stages.

| Command | Trigger |
| --- | --- |
| AssistantInference | Start a new inference run |
| ResponseInference | Call the LLM with the current conversation |
| ShieldInference | Run a guard/classifier model |
| ServerToolExecution | Execute a tool on an MCP/Llama Stack server |
| ClientToolExecution | Request the client to execute a tool |
| A2AInference | Delegate to another agent |
| ToolSchema | Discover tool schemas from a tool source |
| AgentDiscovery | Discover available agents from the registry |
| HumanApprovalRequired | Request human approval before tool execution |
| ContextFlush | Trigger long-term memory summarization |
| ContextPersistence | Persist the current conversation state |

Responses — something that happened. Consumed by the controller stages.

| Response | Produced by |
| --- | --- |
| AssistantInferenceResponse | Final answer; terminates the loop |
| ResponseInferenceResponse | LLM call result |
| ShieldInferenceResponse | Guard model result |
| ToolExecutionResponse | Tool execution result |
| ToolSchemaResponse | Tool schema discovery result |
| AgentDiscoveryResponse | Agent discovery result |
| HumanApprovalRequiredResponse | Human approval decision |
| ContextFlushResponse | Memory summarization result |
| EmpathyRuleMatchedResponse | An empathy rule matched and short-circuited the normal response path |

Every event type is registered with a CloudEvent type string via the @register_event decorator:

@register_event("com.alquimia.response.inference.v1")
class ResponseInference(BaseCommand):
    query: str
    conversation: Conversation
    parameters: ResponseProfile

This enables serialization to/from the CloudEvents spec for distributed deployments. The runtime uses CloudEvent headers to route events between services.
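As a rough illustration of how such a registry could work, the sketch below maps CloudEvent type strings to classes and round-trips an event through a simplified CloudEvents-style JSON envelope. It is not the actual Alquimia implementation: `EVENT_REGISTRY`, `to_cloudevent`, `from_cloudevent`, and the `source` value are hypothetical names, the event carries only two fields, and a plain dataclass stands in for the Pydantic model so the example runs on the standard library alone.

```python
import json
import uuid
from dataclasses import asdict, dataclass

# Hypothetical registry: CloudEvent type string -> event class.
EVENT_REGISTRY: dict[str, type] = {}


def register_event(event_type: str):
    """Class decorator that records the class under its CloudEvent type string."""

    def decorator(cls):
        if event_type in EVENT_REGISTRY:
            raise ValueError(f"duplicate event type: {event_type}")
        cls.event_type = event_type  # class attribute, not a dataclass field
        EVENT_REGISTRY[event_type] = cls
        return cls

    return decorator


@register_event("com.alquimia.response.inference.v1")
@dataclass
class ResponseInference:
    # Simplified stand-in: the real command has more fields.
    control_id: str
    query: str


def to_cloudevent(event) -> str:
    """Wrap an event in a minimal CloudEvents-style JSON envelope."""
    envelope = {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": "alquimia/sketch",  # assumed value, for illustration only
        "type": event.event_type,
        "data": asdict(event),
    }
    return json.dumps(envelope)


def from_cloudevent(raw: str):
    """Look up the class by the envelope's type string and rebuild the event."""
    envelope = json.loads(raw)
    cls = EVENT_REGISTRY[envelope["type"]]
    return cls(**envelope["data"])
```

The receiving side needs only the `type` attribute to choose the class, which is what makes routing on CloudEvent headers possible without inspecting the payload.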

Every command carries a control_id (UUID). The corresponding response carries the same control_id. The worklog uses (event_class, control_id) as a lookup key, enabling O(1) response matching.

# Command emitted by the stage
command = ResponseInference(
    control_id="abc-123",
    query="What is the capital of France?",
    ...
)
# Response consumed by the stage
response = ResponseInferenceResponse(
    control_id="abc-123",  # same ID
    result=AIMessage(content="Paris"),
)
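The lookup described above can be pictured as a dictionary keyed by the `(event_class, control_id)` tuple. The sketch below is a minimal illustration under that assumption; the `Worklog` class and its `record` and `lookup` methods are hypothetical, and the response class is a simplified stand-in for the real model.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ResponseInferenceResponse:
    # Simplified stand-in for the real response model.
    control_id: str
    result: Any


class Worklog:
    """Hypothetical worklog storing responses under (event_class, control_id)."""

    def __init__(self) -> None:
        self._entries: dict[tuple[type, str], Any] = {}

    def record(self, response: Any) -> None:
        # The key combines the response class with the correlation ID.
        self._entries[(type(response), response.control_id)] = response

    def lookup(self, event_class: type, control_id: str) -> Optional[Any]:
        # A dict lookup is constant time on average.
        return self._entries.get((event_class, control_id))


worklog = Worklog()
worklog.record(ResponseInferenceResponse(control_id="abc-123", result="Paris"))
match = worklog.lookup(ResponseInferenceResponse, "abc-123")
```

Including the class in the key means two different response types that share a `control_id` cannot overwrite each other.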

control_id is also the correlation key for exceptions — all ControllerException subclasses carry the control_id of the event that triggered the failure. See Exceptions Reference.
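A minimal sketch of how an exception might carry the correlation key, assuming a base class that accepts `control_id` in its constructor. The `ControllerException` defined here is a simplified stand-in rather than the library's class, and `ToolTimeout`, `run_tool`, and `failed_control_id` are hypothetical names used only for illustration.

```python
from typing import Callable, Optional


class ControllerException(Exception):
    """Stand-in base class: stores the control_id of the failing event."""

    def __init__(self, message: str, control_id: str) -> None:
        super().__init__(message)
        self.control_id = control_id


class ToolTimeout(ControllerException):
    """Hypothetical subclass for a tool call that did not return in time."""


def run_tool(control_id: str) -> None:
    # Simulate a failure tied to the command that requested the tool call.
    raise ToolTimeout("tool did not respond", control_id=control_id)


def failed_control_id(fn: Callable[[str], None], control_id: str) -> Optional[str]:
    """Run fn and return the control_id attached to any controller failure."""
    try:
        fn(control_id)
    except ControllerException as exc:
        return exc.control_id
    return None
```

Because the handler catches the base class, any subclass can be traced back to the originating command without the handler knowing the specific failure type.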

A single inference with one tool call:

AssistantInference
▼ (Preprocess stage)
ShieldInference ──────────────────► ShieldInferenceResponse
▼ (Process stage)
ToolSchema ───────────────────────► ToolSchemaResponse
ResponseInference ────────────────► ResponseInferenceResponse (tool_calls=[search])
ServerToolExecution ──────────────► ToolExecutionResponse (result="Paris")
ResponseInference ────────────────► ResponseInferenceResponse (content="Paris is...")
▼ (Answer stage)
ContextFlush ─────────────────────► ContextFlushResponse
ContextPersistence ───────────────► (no response, fire-and-forget)
AssistantInferenceResponse
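The same flow can be written down as data, which makes a couple of its properties easy to check. The pairs below are transcribed from the diagram above; the `TRACE` name and the two helper functions are hypothetical and exist only for this illustration.

```python
from typing import Optional

# (command, expected response) pairs in emission order, taken from the diagram.
# None marks a fire-and-forget command that produces no response.
TRACE: list[tuple[str, Optional[str]]] = [
    ("ShieldInference", "ShieldInferenceResponse"),
    ("ToolSchema", "ToolSchemaResponse"),
    ("ResponseInference", "ResponseInferenceResponse"),
    ("ServerToolExecution", "ToolExecutionResponse"),
    ("ResponseInference", "ResponseInferenceResponse"),
    ("ContextFlush", "ContextFlushResponse"),
    ("ContextPersistence", None),
]


def count_commands(trace: list[tuple[str, Optional[str]]], name: str) -> int:
    """Count how many times a given command appears in the trace."""
    return sum(1 for command, _ in trace if command == name)


def fire_and_forget(trace: list[tuple[str, Optional[str]]]) -> list[str]:
    """List the commands that do not wait for a response."""
    return [command for command, response in trace if response is None]


llm_calls = count_commands(TRACE, "ResponseInference")
unawaited = fire_and_forget(TRACE)
```

In this trace a single tool call costs two LLM calls: one that requests the tool and one that turns the tool result into the final text.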

Every event that passes through evaluate() is observed by AlquimiaObserver. The observer increments the appropriate metric counter or histogram based on the event type. All metrics carry the CommonAttributes dimensions (assistant_id, session_id, etc.) from the current inference context.
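To make the observer pattern concrete, the sketch below counts events by type and tags each count with the common attributes. It is an assumption-laden illustration, not the real `AlquimiaObserver`: `CountingObserver`, the two-field `CommonAttributes`, the placeholder event classes, and this simplified `evaluate` function are all hypothetical, and a `collections.Counter` stands in for a real metrics backend.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class CommonAttributes:
    # Only the two dimensions named in the text; the real set is larger.
    assistant_id: str
    session_id: str


class ResponseInference:
    """Placeholder event class."""


class ToolExecutionResponse:
    """Placeholder event class."""


class CountingObserver:
    """Hypothetical observer: one counter per (event type, attributes) pair."""

    def __init__(self, attributes: CommonAttributes) -> None:
        self.attributes = attributes
        self.counts: Counter = Counter()

    def observe(self, event: Any) -> None:
        key = (
            type(event).__name__,
            self.attributes.assistant_id,
            self.attributes.session_id,
        )
        self.counts[key] += 1


def evaluate(event: Any, observer: CountingObserver) -> Any:
    """Simplified stand-in: every event is observed before being handled."""
    observer.observe(event)
    return event


observer = CountingObserver(CommonAttributes(assistant_id="a1", session_id="s1"))
for event in (ResponseInference(), ToolExecutionResponse(), ResponseInference()):
    evaluate(event, observer)
```

Keying the counter on both the event type and the attributes mirrors how labeled metrics are usually aggregated per dimension.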

See Core SDK Observability for the full metrics reference.