Federation Troubleshooting
Node Cannot Find Hub
Symptom: syndb federation init or syndb federation test hangs during hub discovery.
Causes and fixes:
| Cause | Fix |
|---|---|
| mDNS blocked by firewall | Open UDP port 5353 or set FEDERATION_ENABLE_MDNS=false and use explicit multiaddrs |
| Hub and node on different networks | Set FEDERATION_HUB_MULTIADDRS to the hub’s libp2p address (e.g., /ip4/hub-ip/udp/4001/quic-v1) |
| Hub not running | Verify hub process is up and listening on its libp2p port |
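When mDNS is unavailable, the explicit-address route from the table can be sketched as below. The helper name `hub_multiaddr` is hypothetical, and `203.0.113.10` is a documentation-only placeholder that stands in for the hub's real IP. Port 4001 is taken from the example address above.

```shell
# Hypothetical helper: build a QUIC multiaddr from an IPv4 address and UDP port.
# The port defaults to 4001 when omitted.
hub_multiaddr() {
  printf '/ip4/%s/udp/%s/quic-v1\n' "$1" "${2:-4001}"
}

# Disable mDNS discovery and point the node at the hub explicitly.
# 203.0.113.10 is a placeholder; substitute the hub's actual address.
export FEDERATION_ENABLE_MDNS=false
export FEDERATION_HUB_MULTIADDRS="$(hub_multiaddr 203.0.113.10 4001)"
echo "$FEDERATION_HUB_MULTIADDRS"
```

Set these in the node's environment before rerunning syndb federation init.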
Registration Rejected
Symptom: "Invalid federation password" error.
Fix: Ensure SYNDB_FEDERATION_PASSWORD matches the hub’s FEDERATION_PASSWORD exactly. Check for trailing whitespace or newlines in environment variables.
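Stray whitespace is hard to spot by eye, so a small check can help. The following is a minimal sketch; the function name `has_edge_whitespace` is hypothetical, and it only inspects the start and end of the value, not the middle.

```shell
# Succeeds (exit 0) if the argument starts or ends with whitespace,
# including spaces, tabs, carriage returns, and newlines.
has_edge_whitespace() {
  case "$1" in
    [[:space:]]*|*[[:space:]]) return 0 ;;
    *) return 1 ;;
  esac
}

# Warn if the node-side password variable has stray edge whitespace.
if has_edge_whitespace "${SYNDB_FEDERATION_PASSWORD:-}"; then
  echo "SYNDB_FEDERATION_PASSWORD has leading or trailing whitespace" >&2
fi
```

Run the same check against FEDERATION_PASSWORD on the hub, since a mismatch can originate on either side.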
Schema Version Mismatch
Symptom: Node excluded from federation queries; hub logs show schema incompatibility.
Fix:
# Check current schema
syndb federation status
# Sync to latest
syndb federation sync-schema
If sync fails, verify the node’s ClickHouse is reachable and the syndb database exists.
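One way to perform both checks is through ClickHouse's HTTP interface. This is a sketch under assumptions: `CH_URL` is a hypothetical variable pointing at the node's ClickHouse HTTP endpoint, and the default user is assumed to need no password. Add credentials if your instance requires them.

```shell
# Assumed variable: HTTP endpoint of the node's ClickHouse. Adjust host and port.
CH_URL="${CH_URL:-http://localhost:8123}"

# Succeeds if ClickHouse answers its /ping endpoint with "Ok."
clickhouse_reachable() {
  [ "$(curl --silent --max-time 5 "$CH_URL/ping")" = "Ok." ]
}

# Succeeds if a database named syndb is listed in system.databases.
syndb_database_exists() {
  count="$(curl --silent --max-time 5 "$CH_URL/" \
    --data-binary "SELECT count() FROM system.databases WHERE name = 'syndb'")"
  [ "$count" = "1" ]
}
```

For example, `clickhouse_reachable && syndb_database_exists || echo "ClickHouse check failed"` reports a problem if either check fails.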
Health States
| State | Meaning | Action |
|---|---|---|
| Healthy | All checks pass | None |
| Degraded | Responds but slow or partially failing | Check ClickHouse load, disk space, network |
| Unreachable | Consecutive health-check pings failed | Check firewall, ClickHouse process, network connectivity |
| Unknown | Newly registered | Wait for the first health check cycle or trigger a manual verify |
Trigger a manual health check from the hub, replacing {id} with the cluster ID:
curl -X POST -H "Authorization: Bearer $TOKEN" \
https://api.syndb.xyz/v1/federation/clusters/{id}/verify
Docker Compose Issues
Port Conflicts
The federation profile uses network_mode: host, so every service binds directly to ports on the host. Check that nothing else is listening on:
- Hub ClickHouse: HTTP 8123, native 9002
- Node ClickHouse: HTTP 8124, native 9003
- Federation Flight: 50052
- libp2p: UDP 4001
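The port list above can be checked before starting the stack with `ss` from iproute2. This is a sketch, not part of syndb: the function names are hypothetical, and it assumes the ClickHouse and Flight ports are TCP while libp2p uses UDP as listed.

```shell
# Reads `ss -H -l -n` output on stdin; succeeds if any local address
# (fourth column) ends in :PORT.
port_listed() {
  awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Report every federation port that already has a listener.
check_federation_ports() {
  for port in 8123 9002 8124 9003 50052; do
    if ss -H -ltn | port_listed "$port"; then
      echo "TCP port $port is already in use"
    fi
  done
  if ss -H -lun | port_listed 4001; then
    echo "UDP port 4001 is already in use"
  fi
}
```

Call `check_federation_ports` while the federation profile is stopped. Once the stack is running, the same ports show up as in use by the federation services themselves.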
Node Fails to Start
Check that the hub and node ClickHouse setup containers completed first:
docker compose --profile federation logs clickhouse-hub-fed-setup
docker compose --profile federation logs clickhouse-node-setup
These containers create the federation user on each ClickHouse instance. If either one fails, the node cannot authenticate for remote() queries.
Connectivity Test Sequence
Run targeted tests to isolate the failure:
# 1. Test ClickHouse connectivity
curl -X POST -H "Authorization: Bearer $TOKEN" \
https://api.syndb.xyz/v1/federation/clusters/{id}/test/connectivity
# 2. Test schema compatibility
curl -X POST -H "Authorization: Bearer $TOKEN" \
https://api.syndb.xyz/v1/federation/clusters/{id}/test/schema
# 3. Test cross-cluster query
curl -X POST -H "Authorization: Bearer $TOKEN" \
https://api.syndb.xyz/v1/federation/clusters/{id}/test/query
Each test returns a pass/fail result with latency and error details. Work through them in order, because later tests depend on earlier ones passing.
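The three calls can be chained so the sequence stops at the first request that fails. This is a sketch: `run_federation_tests` and `API_BASE` are hypothetical names, and because the response format is not described here, the script only detects HTTP-level errors. Read each printed response to see the reported pass/fail result.

```shell
# Assumed variable: base URL of the API, taken from the examples above.
API_BASE="${API_BASE:-https://api.syndb.xyz/v1}"

# Run the connectivity, schema, and query tests in order for one cluster.
# Usage: run_federation_tests CLUSTER_ID   (expects TOKEN in the environment)
run_federation_tests() {
  cluster_id="$1"
  for name in connectivity schema query; do
    echo "== $name =="
    # --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
    if ! curl --fail --silent --show-error -X POST \
        -H "Authorization: Bearer ${TOKEN:-}" \
        "$API_BASE/federation/clusters/$cluster_id/test/$name"; then
      echo "request for '$name' test failed; stopping" >&2
      return 1
    fi
    echo
  done
}
```

Stopping early keeps the output focused on the first failing layer, which is usually the one to fix.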