
Kubernetes & OpenShift

alquimia-runtime ships with production-ready Kubernetes manifests in k8s/ structured as Kustomize overlays. This guide covers deploying to vanilla Kubernetes and OpenShift / ROSA.

For local development with Docker Compose, see Docker Compose.


| Tool | Version |
|---|---|
| kubectl / oc | ≥ 1.27 / ≥ 4.13 |
| kustomize | ≥ 5.0 (bundled in kubectl apply -k) |

The manifests target a namespace called alquimia-runtime. Create it first:

kubectl create namespace alquimia-runtime

The runtime does not provision its own backing services. These must already exist and be reachable from the cluster before applying the manifests:

| Service | Used by | Notes |
|---|---|---|
| Apache Kafka 3.x | Event bus | Master publishes, workers consume |
| PostgreSQL 16 | Knowledge base, worklog, webhooks | |
| Redis 7 | Conversation context, distributed locks | |
| Qdrant | Vector store (master only) | |
| HashiCorp Vault | Agent secret resolution | |
| S3-compatible store | Blob storage (MinIO, AWS S3, Ceph) | |
| OCI registry | Agent package publish/pull | |


k8s/
├── base/
│   ├── kustomization.yaml          # base resource list
│   ├── master-deployment.yaml      # master Deployment (API + s3-sync sidecar)
│   ├── worker-deployment.yaml      # worker Deployment (Kafka consumer + s3-sync sidecar)
│   └── service.yaml                # ClusterIP + headless Services
└── overlays/
    └── dev/
        ├── kustomization.yaml      # image tag, resource patches, generators
        ├── .secrets/               # gitignored plaintext files (never commit these)
        │   ├── config              # app ConfigMap (env vars)
        │   ├── postgres            # POSTGRES_* credentials
        │   ├── redis               # REDIS_URL
        │   ├── s3                  # BLOB_S3_* credentials
        │   ├── vault               # VAULT_TOKEN
        │   ├── auth                # API_TOKEN
        │   ├── kafka-signing       # KAFKA_SIGNING_KEY
        │   ├── qdrant              # QDRANT_URL, QDRANT_API_KEY
        │   └── .registryconfigjson # OCI pull credentials
        └── patches/
            ├── master-resource-limits.yaml
            └── worker-resource-limits.yaml

The overlay reads plaintext key=value files from .secrets/. These files are gitignored and must be created manually.
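Because these files are created by hand, it can help to check the directory before building the overlay. The following is a minimal sketch, not part of the repository: the file names come from the layout above, while the function name `missing_secret_files` and the rule that an empty file counts as missing are assumptions made for illustration.

```python
from pathlib import Path

# File names taken from the k8s/overlays/dev/.secrets/ layout shown above.
EXPECTED_FILES = (
    "config", "postgres", "redis", "s3", "vault",
    "auth", "kafka-signing", "qdrant", ".registryconfigjson",
)


def missing_secret_files(secrets_dir: str) -> list[str]:
    """Return the expected file names that are absent or empty (hypothetical helper)."""
    root = Path(secrets_dir)
    missing = []
    for name in EXPECTED_FILES:
        path = root / name
        # Assumption: an empty file is almost certainly a mistake, so report it too.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_secret_files("k8s/overlays/dev/.secrets"):
        print(f"missing or empty: {name}")
```

Run it from the repository root; no output means every expected file exists and is non-empty.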

.secrets/config holds the non-secret runtime environment variables:

ALQUIMIA_REGISTRY_SECRET_RESOLVER=vault
ALQUIMIA_OCI_REGISTRY_DEFAULT=<registry-host>:<port>
ORAS_INSECURE=true
ORAS_PLAIN_HTTP=true
VAULT_ADDR=http://<vault-host>:<port>
VAULT_MOUNT_POINT=secret
ALQUIMIA_REGISTRY_DIR=/var/lib/alquimia/registry
KAFKA_BOOTSTRAP_SERVERS=<broker-host>:<port>
DEBUG=false
IS_ALLOWED_CREDENTIALS=true
OTEL_ALQUIMIA_SERVICE_NAME=alquimia-runtime

.secrets/redis:

REDIS_URL=redis://<host>:<port>/0

.secrets/postgres:

POSTGRES_HOST=<host>
POSTGRES_PORT=5432
POSTGRES_DB=alquimia
POSTGRES_USERNAME=<username>
POSTGRES_PASSWORD=<password>
POSTGRES_SCHEMA=postgresql

.secrets/s3:

BLOB_S3_ENDPOINT_URL=http://<minio-or-s3-endpoint>
BLOB_S3_ACCESS_KEY=<access-key>
BLOB_S3_SECRET_KEY=<secret-key>
BLOB_S3_BUCKET_NAME=alquimia
BLOB_S3_REGION_NAME=us-east-1
BLOB_S3_SECURE=true
# See https://rclone.org/s3/ for supported providers
BLOB_S3_PROVIDER=Minio

.secrets/auth:

API_TOKEN=<strong-random-token>

Generate the token with: python -c "import secrets; print(secrets.token_urlsafe(32))"

.secrets/kafka-signing:

KAFKA_SIGNING_KEY=<64-char-hex>

This key must be identical on the master and all workers because it authenticates every Kafka event. Generate it with:

python -c "import secrets; print(secrets.token_hex(32))"

.secrets/vault:

VAULT_TOKEN=<scoped-vault-token>

Create a scoped token:

vault token create -policy=alquimia-runtime -ttl=720h -renewable=true

.secrets/qdrant:

QDRANT_URL=http://<qdrant-host>:<port>
QDRANT_API_KEY=

Leave QDRANT_API_KEY empty for unauthenticated Qdrant.

.secrets/.registryconfigjson holds a standard Docker config.json for your OCI registry:

{
  "auths": {
    "<registry-host>": {
      "username": "<username>",
      "password": "<password>",
      "auth": "<base64(username:password)>"
    }
  }
}

Generate the auth value: echo -n "username:password" | base64
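To avoid hand-assembling the JSON, the file can also be rendered programmatically. This is a minimal sketch assuming a single registry entry; the function name `docker_config` and the host `registry.example.com` are placeholders, not names from the project.

```python
import base64
import json


def docker_config(registry: str, username: str, password: str) -> str:
    """Render a Docker config.json with a single registry entry (hypothetical helper)."""
    # "auth" is base64("username:password"), the same value that
    # `echo -n "username:password" | base64` produces.
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    config = {
        "auths": {
            registry: {"username": username, "password": password, "auth": auth}
        }
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Placeholder values; substitute your registry host and credentials.
    print(docker_config("registry.example.com", "username", "password"))
```

Redirect the output to k8s/overlays/dev/.secrets/.registryconfigjson once the values are real.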


  1. Verify secret files are in place

    ls -A k8s/overlays/dev/.secrets/
    # expected: auth config kafka-signing postgres qdrant redis s3 vault .registryconfigjson (order may vary)
  2. Preview the generated manifests (dry run)

    kubectl kustomize k8s/overlays/dev/
  3. Apply to the cluster

    kubectl apply -k k8s/overlays/dev/ -n alquimia-runtime
  4. Watch pods come up

    kubectl get pods -n alquimia-runtime -w

    Expected steady state — each pod shows 2/2 because both the alquimia-runtime container and the s3-sync sidecar must be ready:

    alquimia-runtime-master-<hash> 2/2 Running 0 60s
    alquimia-runtime-worker-<hash> 2/2 Running 0 60s
    alquimia-runtime-worker-<hash> 2/2 Running 0 60s
  5. Verify health

    kubectl exec -n alquimia-runtime \
      $(kubectl get pod -l role=master -n alquimia-runtime -o jsonpath='{.items[0].metadata.name}') \
      -c alquimia-runtime \
      -- curl -sf http://localhost:8080/health/readiness
    # → "OK"
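
The steady state described in step 4 (every pod at 2/2 and Running) can be checked mechanically, for example in a CI job. The sketch below parses the plain-text output of `kubectl get pods`; the function name `all_pods_ready` is hypothetical, and it assumes the default column order NAME, READY, STATUS.

```python
def all_pods_ready(kubectl_output: str) -> bool:
    """Return True if every pod row reports READY n/n and STATUS Running."""
    rows = [line.split() for line in kubectl_output.strip().splitlines()]
    # Skip blank lines and the header row that `kubectl get pods` prints.
    rows = [cols for cols in rows if cols and cols[0] != "NAME"]
    if not rows:
        return False  # no pods at all is not a healthy state
    for cols in rows:
        if len(cols) < 3:
            return False  # unexpected format; fail closed
        ready, _, total = cols[1].partition("/")
        if ready != total or cols[2] != "Running":
            return False
    return True


if __name__ == "__main__":
    sample = """\
NAME                          READY  STATUS   RESTARTS  AGE
alquimia-runtime-master-abc   2/2    Running  0         60s
alquimia-runtime-worker-def   1/2    Running  0         60s
"""
    print(all_pods_ready(sample))
```

The sample contains a worker whose sidecar is not ready yet, so the check fails until that pod reaches 2/2.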

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: alquimia-runtime
  namespace: alquimia-runtime
  annotations:
    # SSE streams need long-lived connections
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
spec:
  rules:
    - host: api.alquimia.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: alquimia-runtime
                port:
                  number: 80
  tls:
    - hosts:
        - api.alquimia.example.com
      secretName: alquimia-tls

| Resource | Value |
|---|---|
| Master replicas | 1 (registry writes are serialised via TinyDB + Redlock) |
| Worker replicas | 2 (dev); scale to match the Kafka partition count |
| Container port | 8080 |
| Readiness probe | GET /health/readiness |
| Liveness probe | GET /health/liveness |
| Startup probe | GET /health/liveness (90 retries × 10 s) |
| CPU request / limit | 500m / 1 (dev overlay) |
| Memory request / limit | 1Gi / 2Gi |
| Worker registry | emptyDir synced from S3 via s3-sync sidecar |
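
The startup probe figures translate into a startup time budget. As a rough sketch, assuming "90 retries" corresponds to the probe's failureThreshold and "10 s" to its periodSeconds, and ignoring any initial delay (the manifests themselves are not shown here):

```python
def startup_budget_seconds(failure_threshold: int, period_seconds: int) -> int:
    """Upper bound on how long a container may take to pass its startup probe."""
    return failure_threshold * period_seconds


if __name__ == "__main__":
    budget = startup_budget_seconds(90, 10)
    print(f"{budget} s = {budget // 60} min")
```

Under those assumptions a container has about 15 minutes to start answering on /health/liveness before Kubernetes restarts it.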


Workers are stateless Kafka consumers. All replicas share the same consumer group so Kafka distributes partitions across them automatically.

kubectl scale deployment alquimia-runtime-worker --replicas=4 -n alquimia-runtime

To make the change permanent, add replicas: 4 to k8s/overlays/dev/patches/worker-resource-limits.yaml. The upper bound for throughput is one replica per Kafka partition — additional replicas beyond that will be idle consumers.
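
The partition bound is simple arithmetic. The sketch below makes it explicit; the function name `consumer_split` and the partition counts in the example are hypothetical, and it assumes all workers read a single topic in one consumer group.

```python
def consumer_split(replicas: int, partitions: int) -> tuple[int, int]:
    """Return (active, idle) worker counts for one consumer group."""
    if replicas < 0 or partitions < 0:
        raise ValueError("replicas and partitions must be non-negative")
    # Within a consumer group, a partition is assigned to at most one consumer,
    # so no more than `partitions` workers can receive events.
    active = min(replicas, partitions)
    return active, replicas - active


if __name__ == "__main__":
    print(consumer_split(4, 6))  # every worker has at least one partition
    print(consumer_split(8, 6))  # two workers sit idle
```

If the second number is greater than zero, adding partitions to the topic is needed before more replicas will help.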


  1. Build and push the new image

    docker build -t alquimiaai/runtime:<new-tag> runtime/
    docker push alquimiaai/runtime:<new-tag>
  2. Update newTag in k8s/overlays/dev/kustomization.yaml and re-apply

    kubectl apply -k k8s/overlays/dev/ -n alquimia-runtime
  3. Run database migrations once the master pod is healthy

    kubectl exec -n alquimia-runtime \
    $(kubectl get pod -l role=master -n alquimia-runtime -o jsonpath='{.items[0].metadata.name}') \
    -c alquimia-runtime \
    -- uv run alembic upgrade head

kubectl delete -k k8s/overlays/dev/ -n alquimia-runtime --ignore-not-found=true
# Delete the namespace entirely (removes PVCs too)
kubectl delete namespace alquimia-runtime

Workers log missing or invalid alquimiasignature and drop all events

KAFKA_SIGNING_KEY differs between master and workers. Regenerate and redeploy:

NEW_KEY=$(python -c "import secrets; print(secrets.token_hex(32))")
# Write the new key to the overlay's secret file (key=value format)
printf 'KAFKA_SIGNING_KEY=%s\n' "$NEW_KEY" > k8s/overlays/dev/.secrets/kafka-signing
kubectl apply -k k8s/overlays/dev/ -n alquimia-runtime
kubectl rollout restart deployment/alquimia-runtime-master deployment/alquimia-runtime-worker -n alquimia-runtime
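
A malformed key (truncated, or with stray whitespace from copy and paste) can look like a mismatch. The sketch below checks only the format implied by `secrets.token_hex(32)`, that is, exactly 64 hexadecimal characters. The function name `is_valid_signing_key` is hypothetical, and accepting uppercase hex is an assumption; how the runtime itself validates the key is not documented here.

```python
import re
import secrets

# 32 random bytes rendered as hex give exactly 64 characters.
_HEX_64 = re.compile(r"[0-9a-fA-F]{64}")


def is_valid_signing_key(value: str) -> bool:
    """Return True if value is exactly 64 hex characters, with no surrounding whitespace."""
    return _HEX_64.fullmatch(value) is not None


if __name__ == "__main__":
    key = secrets.token_hex(32)
    print(is_valid_signing_key(key))          # freshly generated key
    print(is_valid_signing_key(key + "\n"))   # trailing newline is rejected
```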

GET /health/readiness returns 500

PostgreSQL or Redis is unreachable. Check that POSTGRES_HOST in the alquimia-postgres Secret and REDIS_URL in the alquimia-redis Secret point to the correct in-cluster DNS names, and that network policies allow traffic from alquimia-runtime to those services.
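
To separate a DNS or network-policy problem from an application problem, a plain TCP check is often enough. The following is a minimal sketch meant to be run from inside the cluster (for example in a debug pod); the helper names `redis_endpoint` and `can_connect` and the example host names are hypothetical.

```python
import socket
from urllib.parse import urlsplit


def redis_endpoint(redis_url: str) -> tuple[str, int]:
    """Extract (host, port) from a redis:// URL; the port defaults to 6379."""
    parts = urlsplit(redis_url)
    if not parts.hostname:
        raise ValueError(f"no host in {redis_url!r}")
    return parts.hostname, parts.port or 6379


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections, and timeouts
        return False


if __name__ == "__main__":
    # Placeholder values; use the ones from your Secrets.
    host, port = redis_endpoint("redis://redis.example.svc:6379/0")
    print("redis reachable:", can_connect(host, port))
    print("postgres reachable:", can_connect("postgres.example.svc", 5432))
```

A successful connection only shows that the port is reachable; credentials and database names still need to be verified separately.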

Vault token expired — registry secret resolution fails

vault token renew <token>
# or create a new one:
NEW_TOKEN=$(vault token create -policy=alquimia-runtime -ttl=720h -field=token)
kubectl create secret generic alquimia-vault \
  --from-literal=VAULT_TOKEN="$NEW_TOKEN" \
  -n alquimia-runtime \
  --dry-run=client -o yaml | kubectl apply -f -
kubectl rollout restart deployment/alquimia-runtime-master -n alquimia-runtime
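
Since -ttl=720h is 30 days, expiry is predictable and renewal can be scheduled ahead of time. The sketch below computes a renew-by date; the three-day safety margin and the function name `renew_by` are assumptions for illustration, not project settings.

```python
from datetime import datetime, timedelta, timezone

TOKEN_TTL = timedelta(hours=720)  # matches -ttl=720h, i.e. 30 days


def renew_by(created_at: datetime, margin: timedelta = timedelta(days=3)) -> datetime:
    """Return the latest time at which the token should be renewed."""
    # Renew some time before expiry so a missed run does not cause an outage.
    return created_at + TOKEN_TTL - margin


if __name__ == "__main__":
    created = datetime.now(timezone.utc)
    print("renew before:", renew_by(created).isoformat())
```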