ADR-006: libp2p for Peer-to-Peer Federation

Date: 2024-03-10

Status: Accepted

Context

SynDB federation enables multiple institutions to share and query datasets without relying on a central broker or registry. Participating nodes may sit behind NATs, institutional firewalls, or cloud VPCs, so the networking layer must handle peer discovery, NAT traversal, and encrypted transport without requiring manual endpoint configuration.

A centralized hub-and-spoke model would create a single point of failure and raise data-sovereignty concerns for institutions that want to retain control over their datasets.

Decision

Use libp2p for federation networking, with the following transport, discovery, and runtime choices:

  • QUIC transport for encrypted, multiplexed connections with built-in TLS 1.3.
  • mDNS for zero-configuration local/LAN peer discovery.
  • Relay nodes for NAT traversal when direct connections are not possible.
  • The federation swarm is managed by kameo actors (see ADR-002), with the swarm event loop running in a dedicated actor.
  • The node registry uses papaya lock-free concurrent hash maps for high-throughput reads without contention.
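
The last two items describe an ownership pattern: one actor drives the swarm event loop and is the only writer to the node registry, while any number of other tasks read from it. The sketch below illustrates that pattern using only the standard library. It is not the SynDB code and it does not use the libp2p, kameo, or papaya APIs: a plain thread stands in for the kameo actor, an `RwLock<HashMap>` stands in for the papaya map, and `SwarmEvent`, `NodeRegistry`, and `spawn_swarm_loop` are hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

/// Hypothetical events handled by the swarm loop (not libp2p's own event type).
#[derive(Debug)]
enum SwarmEvent {
    PeerDiscovered { peer_id: String, addr: String },
    PeerExpired { peer_id: String },
    Shutdown,
}

/// Shared node registry mapping peer ID to address.
/// Assumption: RwLock<HashMap> is a std-only stand-in for a lock-free map.
#[derive(Clone, Default)]
struct NodeRegistry {
    inner: Arc<RwLock<HashMap<String, String>>>,
}

impl NodeRegistry {
    /// Read path, used by query handlers and other tasks.
    fn lookup(&self, peer_id: &str) -> Option<String> {
        self.inner.read().unwrap().get(peer_id).cloned()
    }

    /// Number of peers currently known.
    fn len(&self) -> usize {
        self.inner.read().unwrap().len()
    }
}

/// Spawns the dedicated event loop. The loop is the only writer to the
/// registry; everything else communicates with it by sending events.
fn spawn_swarm_loop(registry: NodeRegistry) -> (Sender<SwarmEvent>, JoinHandle<()>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        // Iteration ends when every sender is dropped or Shutdown arrives.
        for event in rx {
            match event {
                SwarmEvent::PeerDiscovered { peer_id, addr } => {
                    registry.inner.write().unwrap().insert(peer_id, addr);
                }
                SwarmEvent::PeerExpired { peer_id } => {
                    registry.inner.write().unwrap().remove(&peer_id);
                }
                SwarmEvent::Shutdown => break,
            }
        }
    });
    (tx, handle)
}
```

A caller would keep a clone of the registry for lookups and hand the original to `spawn_swarm_loop`, then feed discovery results into the returned sender. In this stand-in, readers still take a shared lock and can be briefly blocked by a write; the lock-free map named in the decision is intended to remove that contention on the read path.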

Consequences

Positive:

  • Truly decentralized federation: no central broker, no single point of failure.
  • QUIC provides encryption and multiplexing out of the box, eliminating the need for a separate TLS termination layer.
  • mDNS enables instant discovery in development and on-premises deployments without configuration.
  • Lock-free maps via papaya allow the node registry to scale to many concurrent readers without mutex contention.

Negative:

  • Distributed systems complexity: the federation must handle network partitions, partial failures, and eventually consistent peer state.
  • libp2p’s Rust implementation has a large dependency tree and can increase compile times.
  • NAT traversal via relay nodes adds latency and requires at least one publicly reachable relay to be available.
  • Debugging peer-to-peer networking issues is harder than debugging client-server HTTP calls.