Kubernetes & Helm

Production deployment on Kubernetes using Helm charts.

Charts Overview

| Chart | Description |
| --- | --- |
| syndb-hub | Hub deployment (API, UI, depends on syndb-clickhouse) |
| syndb-federation-node | Federation node (syndb-node, depends on syndb-clickhouse) |
| syndb-clickhouse | Shared ClickHouse subchart (used by both hub and node) |
| syndb-etl | ETL batch jobs (download, prepare, import, graph-precompute) |
| nautilus | Umbrella chart for the NRP Nautilus cluster deployment |

Charts are located under infrastructure/helm/.

Hub Deployment

The hub chart deploys the full SynDB stack. Key values:

syndb-clickhouse:
  clusterName: syndb-hub
  shardRegions:
    - name: dc1
      region: dc1
      replicas: 3

api:
  image:
    repository: docker.io/caniko/syndb-api
    tag: "0.10.47"
  flightPort: 50051
  resources:
    requests:
      cpu: "1"
      memory: 2Gi

ui:
  image:
    repository: docker.io/caniko/syndb-ui
    tag: "0.10.47"

The chart also creates a remote_servers.xml ConfigMap for ClickHouse cluster topology.

Meilisearch on Nautilus

The Nautilus umbrella chart now deploys Meilisearch as an internal-only production dependency for /v1/search/fulltext.

  • Deployment shape: single-replica StatefulSet
  • Service type: ClusterIP
  • Default storage: rook-ceph-block
  • Default volume: 20Gi
  • Public ingress: none
  • Shared secret: syndb-api-secrets.meilisearch_api_key

The API and the reconcile CronJob both receive:

  • MEILISEARCH_URL=http://syndb-meilisearch:7700
  • MEILISEARCH_API_KEY from syndb-api-secrets

Meilisearch itself receives the same secret as MEILI_MASTER_KEY, with MEILI_NO_ANALYTICS=true.
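
Because the Service is internal-only, a quick availability check has to run from somewhere that can reach the ClusterIP (a pod in the namespace or a port-forward). The following is a minimal sketch, not part of the chart: it assumes curl is available and relies on Meilisearch's unauthenticated GET /health endpoint, which returns a JSON body with a status field. The function name meili_is_available is hypothetical.

```shell
# Succeeds when a Meilisearch /health response body (read from stdin)
# reports the "available" status; fails otherwise.
meili_is_available() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"available"'
}

# Example use (assumes the URL below is reachable from where this runs):
#   curl -fsS "${MEILISEARCH_URL:-http://syndb-meilisearch:7700}/health" | meili_is_available
```

The check reads the response from stdin so it can be combined with whatever transport reaches the Service in a given environment.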

Reconcile job

Nautilus also deploys an hourly CronJob that runs:

syndb data search reconcile

using the lightweight oci-syndb-cli image. This is the repair mechanism for index drift and missed write-side updates.

Rollout order

For a production cutover:

  1. land the code and image changes
  2. update syndb-api-secrets so it contains meilisearch_api_key
  3. deploy the Nautilus chart
  4. wait for /health to report configured Meilisearch
  5. run one manual reconcile job
  6. verify /v1/search/fulltext through the public API
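
Step 4 in the list above is a wait, which can be scripted as a bounded retry loop. This is a sketch under assumptions: wait_until is a hypothetical helper, and the actual check passed to it depends on the shape of the /health response, which is not described here.

```shell
# Retry a command until it succeeds or the attempt budget is used up.
# usage: wait_until <max_attempts> <sleep_seconds> <command...>
wait_until() {
  wu_max=$1
  wu_sleep=$2
  shift 2
  wu_attempt=1
  while :; do
    # Run the check; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the last allowed attempt has failed.
    if [ "$wu_attempt" -ge "$wu_max" ]; then
      return 1
    fi
    wu_attempt=$((wu_attempt + 1))
    sleep "$wu_sleep"
  done
}

# Example (health_reports_meilisearch is a placeholder you would write):
#   wait_until 30 10 health_reports_meilisearch
```

Bounding the attempts keeps a broken rollout from blocking the remaining steps indefinitely.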

The manual one-shot reconcile command inside the supported devshell is:

nix develop . -c env \
  POSTGRES_HOST=<host> \
  POSTGRES_READ_HOST=<read-host> \
  POSTGRES_PORT=<port> \
  POSTGRES_USERNAME=<user> \
  POSTGRES_PASSWORD=<password> \
  POSTGRES_PATH=<database> \
  MEILISEARCH_URL=http://syndb-meilisearch:7700 \
  MEILISEARCH_API_KEY=<key> \
  cargo run -p cli --features dataset -- dataset search reconcile

Node Deployment

Deploy a federation node at your institution:

syndb-clickhouse:
  clusterName: syndb-node
  shardRegions:
    - name: dc1
      region: dc1
      replicas: 2

nodeApi:
  enabled: true
  image: syndb-api-rust:latest
  flightPort: 50052
  libp2pPort: 4001
  hubMultiaddrs: "/ip4/<hub-ip>/udp/4001/quic-v1"
  federationPassword: "<shared-secret>"
  resources:
    requests:
      cpu: 500m
      memory: 512Mi

When nodeApi.enabled=true, the chart deploys:

  • A Deployment running syndb-node with Flight (TCP) and libp2p (UDP) ports
  • A Service exposing both ports
  • Environment variables auto-populated from values (cluster name, endpoints, passwords)

In Kubernetes, mDNS is disabled, so set hubMultiaddrs for explicit hub discovery.
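
The hubMultiaddrs value in the example follows a fixed shape (IPv4 address, UDP port, QUIC). A small helper can build it and avoid typos; hub_multiaddr is a hypothetical name, and the default port of 4001 is taken from the example values rather than from any documented default.

```shell
# Build a QUIC multiaddr in the same shape as the hubMultiaddrs example.
# usage: hub_multiaddr <ipv4-address> [udp-port]
hub_multiaddr() {
  printf '/ip4/%s/udp/%s/quic-v1\n' "$1" "${2:-4001}"
}

# Example (203.0.113.10 is a documentation-range placeholder address):
#   hub_multiaddr 203.0.113.10
```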

ETL Jobs

ETL runs through the syndb-etl chart values, primarily downloadJobs, prepareJobs, seed, and graphPrecompute:

syndb-etl:
  image:
    repository: docker.io/caniko/syndb-etl
    tag: "0.10.47"
  flight:
    enabled: true
    serverUrl: "http://syndb-api-service:80"
    port: "50051"
  downloadJobs:
    - pipeline: hemibrain
      emptyDirSizeLimit: 8Gi
      downloadResources:
        requests: { cpu: "500m", memory: "512Mi" }
        limits: { cpu: "600m", memory: "614Mi" }
  prepareJobs:
    - pipeline: hemibrain
      emptyDirSizeLimit: 25Gi
  graphPrecompute:
    enabled: true

Important: a Kubernetes Job's pod template is immutable, so helm upgrade cannot patch an existing Job in place. If resource values changed, delete failed or running ETL jobs before running helm upgrade:

nix develop . -c kubectl delete job -n syndb -l app=syndb-etl --field-selector status.successful!=1
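
Before deleting, it may help to preview which Jobs the selector matches. The sketch below reuses the namespace, label, and field selector from the delete command and only swaps delete for get; the function name is hypothetical, and in this repository the call would normally run inside nix develop.

```shell
# List (without deleting) the ETL Jobs that the delete command would match.
etl_jobs_not_successful() {
  kubectl get job -n syndb -l app=syndb-etl \
    --field-selector 'status.successful!=1' -o name
}
```

An empty result means there is nothing to delete and the upgrade can proceed directly.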

Skip override semantics: when syndb ops k8s nautilus apply receives explicit syndb-etl.skipPipelines[...] flags, SynDB now unions them with both config/etl-skip.ron and the live skip set derived from current ETL Jobs. Manual skip flags are additive; they do not replace the detected live skip set.
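
The additive behavior amounts to a set union over three sources. The sketch below only illustrates that semantics with plain text lists; it is not the SynDB implementation, union_skips is a hypothetical name, and pipeline names other than hemibrain are placeholders.

```shell
# Union of three newline-separated pipeline lists: explicit flags,
# the file-based skip list, and the live skip set. Duplicates collapse.
union_skips() {
  printf '%s\n%s\n%s\n' "$1" "$2" "$3" | sed '/^$/d' | LC_ALL=C sort -u
}

# Example: a manual flag for hemibrain adds to, rather than replaces,
# what the other two sources already skip.
#   union_skips "hemibrain" "pipeline-a" "pipeline-b"
```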

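emptyDir warning: memory-backed emptyDir volumes (medium: Memory) are tmpfs and count against the pod’s memory cgroup limit. Kubernetes backs emptyDir with node storage by default, so this applies only when the volume is memory-backed; in that case, add the expected emptyDir data size to the memory limit.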
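
When an emptyDir does count against the memory limit, the sizing rule is simple addition. A minimal sketch of that arithmetic, assuming quantities use only the Mi or Gi suffix (other Kubernetes quantity formats are not handled); both function names are hypothetical.

```shell
# Convert a Kubernetes binary quantity (Mi or Gi suffix only) to MiB.
to_mib() {
  case "$1" in
    *Gi) echo $(( ${1%Gi} * 1024 )) ;;
    *Mi) echo "${1%Mi}" ;;
    *) echo "unsupported quantity: $1" >&2; return 1 ;;
  esac
}

# Memory limit that leaves room for emptyDir data of the given size.
# usage: memory_limit_with_emptydir <base-limit> <emptydir-size>
memory_limit_with_emptydir() {
  base=$(to_mib "$1") || return 1
  scratch=$(to_mib "$2") || return 1
  echo "$(( base + scratch ))Mi"
}
```

For the download job values shown earlier, memory_limit_with_emptydir 614Mi 8Gi yields 8806Mi (614 + 8 × 1024).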

Applying Changes

nix develop . -c cargo run -p cli --features dev -- ops k8s nautilus apply

Or manually:

nix develop . -c helm upgrade --install syndb-nautilus infrastructure/helm/nautilus/ \
  -n syndb --create-namespace \
  -f infrastructure/helm/nautilus/values.yaml

Pending Helm Releases

SynDB now refuses to apply when syndb-nautilus is already in one of Helm’s pending states (pending-install, pending-upgrade, pending-rollback). Without this check, Helm’s generic error:

another operation (install/upgrade/rollback) is in progress

would surface only after ETL reset work has already started.

If the pending revision is less than 10 minutes old, treat it as possibly active and inspect it first:

nix develop . -c helm status syndb-nautilus -n syndb
nix develop . -c helm history syndb-nautilus -n syndb

If the pending revision is older than 10 minutes, treat it as stale and roll back to the newest deployed revision before retrying the apply.
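
The 10-minute rule can be written down as a small decision function. This is a sketch of the rule as described here, not the check SynDB itself runs: pending_action is a hypothetical name, the caller is assumed to compute the revision's age in seconds from the timestamp Helm reports, and an age of exactly 10 minutes is treated as stale.

```shell
# Decide what to do for a Helm release status, following the 10-minute rule.
# usage: pending_action <status> <age_seconds>
pending_action() {
  case "$1" in
    pending-install|pending-upgrade|pending-rollback)
      if [ "$2" -lt 600 ]; then
        echo inspect    # possibly still active: check status and history
      else
        echo rollback   # stale: roll back to the newest deployed revision
      fi
      ;;
    *)
      echo apply        # not pending: safe to apply
      ;;
  esac
}
```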

Current example from April 19, 2026:

  • revision 293 was stuck in pending-upgrade
  • Helm reported last_deployed = 2026-04-19T18:51:43.666197216+02:00
  • the newest deployed revision was 291

Recovery:

nix develop . -c helm rollback syndb-nautilus 291 -n syndb
nix develop . -c cargo run -p cli --features dev -- ops k8s nautilus apply

QueryFabric Rollout

The QueryFabric cutover adds two PostgreSQL metadata invariants that the API now enforces at startup:

  • every saved query must have query_text
  • every pending query job must have sql_plan

Use the SynDB devshell and either run the checks manually:

nix develop . -c syndb test queryfabric-full
nix develop . -c syndb test queryfabric-rollout

or use the convenience wrapper:

nix develop . -c syndb ops k8s nautilus deploy queryfabric

test-queryfabric-rollout checks the PostgreSQL environment described by the current POSTGRES_* / POSTGRES_READ_HOST variables and performs the same saved-query backfill step the API runs at startup. For production, point those variables at the target metadata database before running the preflight.
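
Since the preflight runs against whatever the environment points at, a guard that fails fast on unset variables can prevent checking the wrong database. A minimal sketch: missing_env is a hypothetical helper, and the variable list is taken from the manual reconcile command earlier on this page.

```shell
# Report unset or empty variables from the given list on stderr;
# return non-zero if any are missing.
missing_env() {
  me_missing=""
  for me_name in "$@"; do
    # Indirect lookup of the variable named by $me_name.
    eval "me_value=\${$me_name:-}"
    if [ -z "$me_value" ]; then
      me_missing="$me_missing $me_name"
    fi
  done
  if [ -n "$me_missing" ]; then
    echo "missing:$me_missing" >&2
    return 1
  fi
}

# Example:
#   missing_env POSTGRES_HOST POSTGRES_READ_HOST POSTGRES_PORT \
#     POSTGRES_USERNAME POSTGRES_PASSWORD POSTGRES_PATH \
#     && nix develop . -c syndb test queryfabric-rollout
```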

deploy-bump-queryfabric is a safe wrapper over deploy-bump: it runs the full local QueryFabric + SynDB validation path first, then the target-DB preflight, and only then publishes images and upgrades Helm on trunk.