Federation Architecture

Components

Hub

The hub runs the full SynDB stack and coordinates the federation:

Component                  Role
syndb-api                  HTTP API (port 8080) + Arrow Flight (port 50051)
PostgreSQL                 User accounts, dataset metadata, cluster registry, job queue, benchmarks
ClickHouse                 Local data warehouse + remote() queries to nodes
S3/MinIO                   Mesh files, job results, ETL staging
Meilisearch                Full-text search index
HubRegistryActor           libp2p actor managing cluster registration and health
FederationHealthMonitor    Periodic health checks with circuit-breaker logic

Node

Nodes are lightweight — no PostgreSQL, no S3, no Meilisearch:

Component       Role
syndb-node      Federation daemon with Arrow Flight server (port 50052)
ClickHouse      Local data warehouse (HTTP port 8124, native port 9003/9440)
ClusterActor    libp2p actor handling hub communication

Networking: libp2p

Federation uses libp2p for peer-to-peer communication:

  • Transport: QUIC with built-in TLS 1.3 (encrypted, multiplexed)
  • Discovery: mDNS for LAN (zero-config), DHT for WAN
  • NAT traversal: Relay nodes for peers behind NAT
  • Actor model: kameo actors manage the swarm event loop

DHT Registration

Services register under well-known names in the DHT:

Name                    Actor
syndb-hub               HubRegistryActor
syndb-cluster:{name}    ClusterActor

The ClusterActor on each node looks up syndb-hub in the DHT to locate the hub and register with it.
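The key scheme above can be sketched as two small helpers. The names in the strings come from the table; the functions themselves are illustrative, not the actual registry API:

```rust
// Illustrative helpers for the well-known DHT key scheme. The key
// strings match the table above; the function names are hypothetical.
fn hub_key() -> String {
    "syndb-hub".to_string()
}

fn cluster_key(name: &str) -> String {
    format!("syndb-cluster:{name}")
}

fn main() {
    // A node named "west-1" registers under its cluster key and
    // looks up the hub under the fixed hub key.
    println!("{}", cluster_key("west-1")); // syndb-cluster:west-1
    println!("{}", hub_key());             // syndb-hub
}
```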

Actor Messages

The ClusterActor handles these message types:

Message                  Direction     Purpose
HealthPing               Hub → Node    Periodic liveness check
SchemaSync               Hub → Node    Push DDL migrations
DatasetCatalogRequest    Hub → Node    Discover datasets on node
GetFlightEndpoint        Hub → Node    Resolve Flight address for data transfer
AnalyticsQuery           Hub → Node    Delegated analytics computation
OntologySync             Hub → Node    Push ontology terms

Data Plane

Two mechanisms move data between hub and nodes:

ClickHouse remote()

For SQL queries, the hub compiles a remote('node-host:port', 'syndb', 'table', 'user', 'password') call that executes directly on the node’s ClickHouse and streams results back.
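A minimal sketch of compiling such a call, following the remote() form shown above. The struct and function names are hypothetical, not the hub's actual API:

```rust
// Illustrative: build a ClickHouse remote() query targeting one node's
// table, in the form remote('host:port', 'syndb', 'table', 'user', 'password').
// NodeEndpoint and compile_remote are hypothetical names.
struct NodeEndpoint {
    host: String,
    port: u16,
    user: String,
    password: String,
}

fn compile_remote(node: &NodeEndpoint, table: &str, predicate: &str) -> String {
    format!(
        "SELECT * FROM remote('{}:{}', 'syndb', '{}', '{}', '{}') WHERE {}",
        node.host, node.port, table, node.user, node.password, predicate
    )
}

fn main() {
    let node = NodeEndpoint {
        host: "node-a.internal".into(),
        port: 9003,
        user: "fed".into(),
        password: "secret".into(),
    };
    // The generated SQL executes on the node's ClickHouse and streams back.
    println!("{}", compile_remote(&node, "events", "ts > now() - INTERVAL 1 DAY"));
}
```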

Arrow Flight (Internal)

For large result sets and non-SQL workloads (graph analysis, analytics), the hub delegates to the node’s internal Flight server (port 50052). Results stream back as Arrow IPC batches.

Schema Versioning

Each ClickHouse DDL migration has a version number. The hub tracks the current version and each node’s version:

  1. Hub receives a schema sync request (POST /v1/federation/schema/sync)
  2. Hub sends pending migrations to each active node via SchemaSync message
  3. Nodes apply migrations and report their new version
  4. Queries only route to nodes whose schema version is compatible
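The routing rule in step 4 can be sketched as a filter over node versions. This assumes a plain integer migration version and exact-match compatibility; the real policy may be looser:

```rust
// Sketch of version-compatible routing (step 4 above), assuming an
// integer migration version and exact-match compatibility.
#[derive(Debug, Clone)]
struct NodeStatus {
    name: String,
    schema_version: u32,
}

/// Nodes eligible for query routing: only those whose reported schema
/// version matches the hub's current version.
fn routable(hub_version: u32, nodes: &[NodeStatus]) -> Vec<&NodeStatus> {
    nodes.iter().filter(|n| n.schema_version == hub_version).collect()
}

fn main() {
    let nodes = vec![
        NodeStatus { name: "a".into(), schema_version: 12 },
        NodeStatus { name: "b".into(), schema_version: 11 }, // lagging: excluded
    ];
    let ok = routable(12, &nodes);
    assert_eq!(ok.len(), 1);
    assert_eq!(ok[0].name, "a");
}
```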

Health Monitoring

The FederationHealthMonitorActor runs on the hub:

State          Meaning                                    Query routing
Healthy        Responds to pings, schema compatible       Included
Degraded       Responds but slow or partially failing     Included with lower priority
Unreachable    Failed consecutive pings                   Excluded
Unknown        Newly registered, not yet checked          Excluded until first successful ping

Health transitions are logged and stored in PostgreSQL for audit.
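The transitions in the table above can be sketched as a small state machine. The failure threshold of 3 consecutive missed pings is a hypothetical value, not the monitor's configured one:

```rust
// Minimal sketch of the health-state transitions described above.
// The threshold of 3 consecutive failures is an assumed value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NodeHealth {
    Unknown,
    Healthy,
    Degraded,
    Unreachable,
}

struct HealthTracker {
    state: NodeHealth,
    consecutive_failures: u32,
}

impl HealthTracker {
    fn new() -> Self {
        // Newly registered nodes start Unknown and are excluded
        // until their first successful ping.
        Self { state: NodeHealth::Unknown, consecutive_failures: 0 }
    }

    fn record_ping(&mut self, ok: bool, slow: bool) {
        if ok {
            self.consecutive_failures = 0;
            self.state = if slow { NodeHealth::Degraded } else { NodeHealth::Healthy };
        } else {
            self.consecutive_failures += 1;
            if self.consecutive_failures >= 3 {
                self.state = NodeHealth::Unreachable; // circuit opens
            }
        }
    }

    fn routable(&self) -> bool {
        matches!(self.state, NodeHealth::Healthy | NodeHealth::Degraded)
    }
}

fn main() {
    let mut t = HealthTracker::new();
    assert!(!t.routable()); // Unknown: excluded
    t.record_ping(true, false);
    assert!(t.routable()); // Healthy: included
    for _ in 0..3 {
        t.record_ping(false, false);
    }
    assert_eq!(t.state, NodeHealth::Unreachable); // excluded again
}
```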

Concurrency Model

  • Lock-free reads: The hub’s cluster registry uses papaya concurrent hash maps — reads never block, even under high query load
  • Actor isolation: Each cluster connection is managed by its own actor, preventing one slow node from blocking others
  • Supervisor trees: Actor failures are caught and restarted by the kameo supervisor
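The actor-isolation point can be illustrated with plain std threads and channels standing in for kameo actors: each connection gets its own worker and mailbox, so dispatching to a slow node never blocks dispatch to others.

```rust
// Illustrative only: std threads + channels standing in for kameo actors.
// Each node connection has its own worker and mailbox, so a slow node's
// backlog queues in its own channel and never delays sends to other nodes.
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn spawn_node_actor(name: &'static str, delay: Duration) -> mpsc::Sender<String> {
    let (tx, rx) = mpsc::channel::<String>();
    thread::spawn(move || {
        for msg in rx {
            thread::sleep(delay); // simulate per-node processing time
            println!("{name} handled {msg}");
        }
    });
    tx
}

fn main() {
    let slow = spawn_node_actor("slow-node", Duration::from_millis(200));
    let fast = spawn_node_actor("fast-node", Duration::from_millis(1));
    // Both sends return immediately; the slow node works through its
    // own mailbox without holding up the fast node.
    slow.send("HealthPing".to_string()).unwrap();
    fast.send("HealthPing".to_string()).unwrap();
    thread::sleep(Duration::from_millis(300)); // let workers drain
}
```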