Files
unibus/deploy/cluster/nodes.env
T
egutierrez 48a3d6be33 docs(0006g): cluster deploy material for magnus+homer+datardos (R3 HA)
Parameterized, NO-VPS-touched material to bring up unibus as a 3-node cluster.
The authoring agent ran none of it on a host; every remote-changing step is
marked HUMAN and deploy-cluster.sh defaults to a dry run.

deploy/cluster/:
- nodes.env — topology (cluster name, ports, per-node rows). Public IPs known
  (homer 141.94.69.66, datardos 51.91.100.142) pre-filled; magnus public IP and
  all WireGuard IPs are <PLACEHOLDER> for the human; scripts refuse to run while
  any remain.
- generate-cluster-certs.sh — mints a SEPARATE cluster route CA + a route cert per
  node (server+clientAuth, mutual routes) and a data-plane server cert per node
  signed by the reused client CA (../tls/ca.*); SAN = public + WG + hostname.
- membershipd-cluster.service — one unit, parameterized per node via
  /opt/unibus/cluster.env: enforce + per-subject ACL + TLS + --store kv,
  --cluster-pass-file (secret out of argv), Restart=always.
- deploy-cluster.sh — cross-build linux/amd64, generate each node's cluster.env
  (routes to the other two on the WG mesh, no userinfo), rsync + install (only
  with --yes); staggered start is manual.
- README.md — runbook: prerequisites, loopback bootstrap to seed the first admin
  into the KV (works around the user-CLI/KV chicken-and-egg), staggered bring-up,
  verify posture+quorum, scale R1->R3 in place, and the chaos test (left to 0003f
  on the real VPS).
- .gitignore — out/, build/, secrets/, *.key never committed.

bash -n passes on both scripts; go build/test unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 17:31:13 +02:00

45 lines
1.9 KiB
Bash

# Cluster topology for the unibus 3-node deployment (issue 0006g).
#
# This file is SOURCED by generate-cluster-certs.sh and deploy-cluster.sh.
#
# HUMAN: fill in every <PLACEHOLDER> with the real value before running the
# scripts. The public IPs known at authoring time are pre-filled; the WireGuard
# mesh IPs and magnus's public IP must be supplied. The scripts refuse to run
# while any <PLACEHOLDER> remains.
# Cluster identity (must be identical on every node).
CLUSTER_NAME="unibus"
# Route-secret username; the password is NOT here — it lives in a file (see
# CLUSTER_PASS_FILE in deploy-cluster.sh) so it never lands in argv or git.
CLUSTER_USER="unibus-cluster"
# KV/nonce replication factor. START AT 1 for the initial 1->3 rollout, then raise
# to 3 IN PLACE (see README "Scale to R3") once all three nodes have joined. Only
# set this to 3 here after the third node is up and you re-run the KV update.
KV_REPLICAS=1
# Ports (same on every node; the route port is server-to-server only).
NATS_CLIENT_PORT=4250
NATS_ROUTE_PORT=6250
HTTP_PORT=8470
# Remote install layout and SSH login user.
REMOTE_DIR="/opt/unibus"
SSH_USER="root"
# Which address family the inter-node routes use. "wg" builds --routes from the
# WireGuard mesh IPs (private server-to-server links, preferred); "public" uses
# the public IPs. The route layer is always mutual-TLS regardless.
ROUTE_NETWORK="wg"
# One row per node: NAME SSH_HOST PUBLIC_IP WG_IP
# NAME -> --server-name and the per-node cert filenames (unique).
# SSH_HOST -> the `ssh <SSH_HOST>` alias (see ~/.ssh/config).
# PUBLIC_IP -> public address; goes in the cert SANs (client-facing data plane).
# WG_IP -> WireGuard mesh address; cert SAN + route target when ROUTE_NETWORK=wg.
CLUSTER_NODES=(
"magnus magnus <MAGNUS_PUBLIC_IP> <MAGNUS_WG_IP>"
"homer homer 141.94.69.66 <HOMER_WG_IP>"
"datardos dd 51.91.100.142 <DATARDOS_WG_IP>"
)