# unibus cluster — 3-node deploy runbook (issue 0006g)

This directory holds the material to bring up unibus as a **3-node cluster**
(`magnus` + `homer` + `datardos`) for real HA: with **R3** replication the control
plane (rooms/members/keys/users on JetStream KV + the anti-replay nonce bucket)
survives the loss of any one node (quorum 2/3).

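The "quorum 2/3" figure is plain majority arithmetic. A minimal sketch of that arithmetic follows; the function names `quorum` and `tolerated_failures` are illustrative and are not part of the deploy scripts.

```bash
#!/usr/bin/env bash
# Majority quorum for a group of n replicas: floor(n/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of replicas that can be lost while a majority is still reachable.
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

# Compare the rollout setting (R1) with the HA setting (R3).
for n in 1 3; do
  echo "R$n: quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

R1 tolerates no failures, which is why the initial rollout is not HA; R3 tolerates exactly one.
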
> **The agent that authored this never touched a VPS.** Every step that changes a
> remote host is marked **HUMAN** and is executed by the operator. `deploy-cluster.sh`
> defaults to a dry run.

## Files

| File | What it is |
|---|---|
| `nodes.env` | Topology: cluster name, ports, and the per-node rows (name, ssh host, public IP, WG IP). **HUMAN fills the placeholders.** |
| `generate-cluster-certs.sh` | Mints a **separate cluster route CA** + a route cert per node, and a data-plane server cert per node signed by the **client CA** (`../tls/ca.*`). |
| `membershipd-cluster.service` | One systemd unit, parameterized per node by `/opt/unibus/cluster.env`. enforce + per-subject ACL + TLS + `--store kv`, `Restart=always`. |
| `deploy-cluster.sh` | Cross-builds the linux binary, generates each node's `cluster.env`, and (with `--yes`) rsyncs everything + installs the unit. Staggered start is manual. |

Generated keys/secrets (`out/`, `build/`, `secrets/`) are **gitignored** — they are
secret and never leave the operator's trusted machine except over the secure
rsync channel.

## Topology

| Node | SSH | Public IP | WireGuard IP | Role |
|---|---|---|---|---|
| magnus | `magnus` | `<MAGNUS_PUBLIC_IP>` | `<MAGNUS_WG_IP>` | seed (first up) |
| homer | `homer` | `141.94.69.66` | `<HOMER_WG_IP>` | replica |
| datardos | `dd` | `51.91.100.142` | `<DATARDOS_WG_IP>` (10.21.0.x) | replica |

The route layer (server-to-server) prefers the **WireGuard mesh**
(`ROUTE_NETWORK=wg`); the client data plane and the HTTP control plane are reached
over the public IPs. The route CA is **separate** from the client CA, so a client
cert is never accepted on the route port.

## Prerequisites (HUMAN, once)

1. **Fill `nodes.env`** — replace every `<PLACEHOLDER>` (magnus public IP, all WG
   IPs). The scripts refuse to run while any remain.
2. **Client CA exists** — `../tls/ca.crt` + `../tls/ca.key`. If not, run
   `../tls/generate-certs.sh` on the CA host (om) first. The cluster reuses this CA
   for the data plane so existing clients keep trusting the bus.
3. **Mint cluster TLS**:
   ```bash
   ./generate-cluster-certs.sh   # writes out/<name>/ ; --force to rotate the cluster CA
   ```
4. **Create the route secret** (kept out of argv, shared by all nodes):
   ```bash
   mkdir -p secrets && openssl rand -hex 32 > secrets/cluster.pass
   ```
5. **Confirm SSH access**: logging in to each node's SSH host as `root` must work
   (`ssh magnus true`, `ssh dd true`, ...).

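Step 1 says the scripts refuse to run while placeholders remain. The sketch below shows one way such a guard could work, so an operator can run the same check by hand before staging. It assumes placeholders are upper-case tokens in angle brackets, as in the topology table; `has_placeholders` is a hypothetical helper, not the function the real scripts use, and the demo file only imitates `nodes.env`.

```bash
#!/usr/bin/env bash
# Succeeds (exit 0) and prints the offending lines if FILE still contains
# an unfilled <UPPER_CASE> placeholder; fails (exit 1) if none remain.
has_placeholders() {
  grep -nE '<[A-Z][A-Z0-9_]*>' "$1"
}

# Demo against a throwaway file (its key names are invented for illustration).
demo=$(mktemp)
printf 'MAGNUS_PUBLIC_IP=<MAGNUS_PUBLIC_IP>\nHOMER_PUBLIC_IP=141.94.69.66\n' > "$demo"
if has_placeholders "$demo"; then
  echo "refusing to run: placeholders remain" >&2
fi
rm -f "$demo"
```

Against the real file the check would be `has_placeholders nodes.env`, which prints nothing once every value is filled in.
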
## Stage the nodes

```bash
./deploy-cluster.sh          # DRY RUN — prints the full plan, touches nothing
./deploy-cluster.sh --yes    # HUMAN: actually rsync + install the unit on all 3 nodes
```

This cross-builds `membershipd` (linux/amd64, `CGO_ENABLED=0`), writes each node's
`cluster.env` (its `NODE_NAME` and the `--routes` to the OTHER two nodes), and
ships the binary, the node's TLS material, the secret, the env file and the unit.
It does **not** start anything.

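To make "routes to the OTHER two nodes" concrete, here is a minimal sketch of how a per-node route list could be derived. Everything specific in it is an assumption for illustration: the helper name `routes_for`, the `nats-route://` URL scheme, port `6250`, and the `192.0.2.x` documentation addresses. The real values come from `nodes.env`, and the real script may build the list differently.

```bash
#!/usr/bin/env bash
# routes_for SELF PORT name=ip ...
# Prints a comma-separated list of route URLs to every node except SELF.
# No userinfo is embedded in the URLs; the shared secret travels separately.
routes_for() {
  local self=$1 port=$2 out="" pair name ip
  shift 2
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first '='
    ip=${pair#*=}      # text after the first '='
    if [ "$name" != "$self" ]; then
      out="${out:+$out,}nats-route://${ip}:${port}"
    fi
  done
  echo "$out"
}

# Hypothetical WireGuard addresses for the three nodes.
routes_for magnus 6250 magnus=192.0.2.1 homer=192.0.2.2 datardos=192.0.2.3
```

For `magnus` this prints the two URLs for `homer` and `datardos`, so each node's `cluster.env` never lists the node itself.
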
## Seed the first admin into the KV (HUMAN — loopback bootstrap)

The empty KV control plane has no users, and under `enforce` no external tool can
write the FIRST admin over NATS (it would need to be an admin already, a
chicken-and-egg problem). The `user` CLI also writes only to a local SQLite file, not the
KV. So the first admin is seeded on the seed node through a **loopback, no-auth
bootstrap** that populates the same JetStream store the cluster unit then reuses:

```bash
ssh root@magnus 'bash -s' <<'SEED'
set -euo pipefail
cd /opt/unibus
# a) Put the first admin into a local SQLite seed file.
./membershipd user add --db ./seed.db --handle root --sign-pub <ADMIN_SIGN_PUB_HEX> --role admin
# b) Bring up a TEMPORARY loopback, no-auth, single-node KV server on the cluster's
#    own JetStream store dir (not exposed; bus-auth off is allowed on 127.0.0.1).
./membershipd --store kv --bus-auth off --bind 127.0.0.1 \
  --nats-store ./local_files/jetstream --db ./seed.db >/tmp/seed-boot.log 2>&1 &
BOOT=$!; sleep 2
# c) Migrate the admin from SQLite into the replicated KV (loopback — no --ca needed).
./membershipd migrate-to-kv --db ./seed.db --nats-url nats://127.0.0.1:4250 --replicas 1
# d) Stop the bootstrap server. The KV buckets persist in ./local_files/jetstream.
kill "$BOOT"; wait "$BOOT" 2>/dev/null || true
rm -f ./seed.db
SEED
```

> The KV written here lives in `./local_files/jetstream`, which the cluster unit
> reuses (`--nats-store` default), so the admin is present when the enforce cluster
> starts. Additional users are added the same loopback way until a
> `user add --store kv` exists (see GAP in report 0009).

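The bootstrap above waits a fixed `sleep 2` for the temporary server to come up. On a slow host a bounded retry loop around a readiness probe may be more robust. The following is a minimal sketch of such a loop; `retry_until` is a hypothetical helper, and the probe to pass it is left to the operator because the source does not document a readiness endpoint for the bootstrap server.

```bash
#!/usr/bin/env bash
# retry_until ATTEMPTS DELAY_SECONDS CMD [ARGS...]
# Runs CMD until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
  local attempts=$1 delay=$2 i=0
  shift 2
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Demo with trivial probes: one that always succeeds, one that always fails.
retry_until 3 0 true  && echo "probe succeeded"
retry_until 2 0 false || echo "gave up after 2 attempts"
```

One possible probe, assuming bash and the loopback port `4250` used in step c, is a plain TCP connect: `retry_until 20 1 bash -c 'exec 3<>/dev/tcp/127.0.0.1/4250'`. If the loop returns non-zero, the script should stop before running `migrate-to-kv`.
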
## Bring up (HUMAN — staggered)

Bring up the seed first, then the replicas one at a time, checking that each one joins.

```bash
# 1. Seed node (after the seed step above).
ssh root@magnus 'systemctl enable --now membershipd-cluster'
ssh root@magnus 'curl -fsS https://127.0.0.1:8470/healthz --cacert /opt/unibus/tls/ca.crt'

# 2. Replicas, one at a time.
ssh root@homer 'systemctl enable --now membershipd-cluster'
ssh root@dd 'systemctl enable --now membershipd-cluster'   # datardos; its SSH host is dd (see Topology)
```

> Initial rollout runs at **R1** (`KV_REPLICAS=1` in `nodes.env`): the buckets live
> on the seed only. This is NOT HA yet — see "Scale to R3".

## Promote an existing single-node (SQLite) deployment (HUMAN, optional)

Instead of seeding fresh, you can migrate an existing single-node `unibus.db` into
the KV — **loopback only** (the allowlist would otherwise travel cleartext; the
command refuses a remote target without `--ca`). Use the same loopback-bootstrap
shape as the seed step (temporary `--bus-auth off` server on 127.0.0.1, then
`migrate-to-kv --db /opt/unibus/local_files/unibus.db`).

## Verify

```bash
# Posture on every node — all must be enforce+acl+tls+cluster, store=kv.
# Hosts are the SSH hosts from the Topology table (dd = datardos).
for h in magnus homer dd; do
  echo "== $h =="
  ssh "root@$h" 'curl -fsS https://127.0.0.1:8470/healthz --cacert /opt/unibus/tls/ca.crt'
done

# Cluster + JetStream meta-group health (needs the `nats` CLI on a node):
ssh root@magnus 'nats --server nats://127.0.0.1:4250 server report jetstream'
ssh root@magnus 'nats --server nats://127.0.0.1:4250 server list'   # 3 servers, routes up
```

A healthy cluster shows 3 routed servers and a JetStream meta-group with a leader.

## Scale to R3 (HUMAN — real HA)

Once all three nodes are up and routed, raise the replication factor of every
control-plane stream from 1 to 3 IN PLACE (no data loss), then flip `KV_REPLICAS=3`
in `nodes.env` so future (re)deploys keep it:

```bash
for s in KV_UNIBUS_users KV_UNIBUS_rooms KV_UNIBUS_members KV_UNIBUS_room_keys \
         KV_UNIBUS_rooms_by_member KV_UNIBUS_nonces; do
  ssh root@magnus "nats --server nats://127.0.0.1:4250 stream update $s --replicas 3 -f"
done
# (also OBJ_UNIBUS_blobs if the object store is in use)
```

Until this is done, R1 means the seed node is a **single point of failure for
authentication**: if it dies, the nonce/KV control plane is unreachable and every
authenticated request fails closed (auth DoS). R1 is a rollout step, not HA.

## Chaos test (HUMAN — requires the 3 live VPS; NOT run here)

Validate quorum tolerance after R3:

```bash
# Kill one node; the cluster keeps serving (quorum 2/3).
ssh root@dd 'systemctl stop membershipd-cluster'    # datardos; its SSH host is dd (see Topology)
# -> clients fail over (multiple seed URLs); reads/writes still succeed.
ssh root@dd 'systemctl start membershipd-cluster'   # rejoins, catches up

# Kill two nodes; quorum is LOST — the control plane should fail CLOSED (deny),
# never fail open. Verify a request is rejected, not silently served.
```

This network-level chaos test (kill 1/3, kill 2/3, partition/split-brain) is part
of the deploy validation (issue 0003f) and runs against the real VPS — it is
deliberately out of scope for the authoring agent.

## Rollback

`membershipd` does not delete data. To revert a node to standalone SQLite, stop
the unit and start it without `--store kv`/`--cluster-name`; the KV buckets remain
for a later retry. To rotate the cluster CA, re-run
`generate-cluster-certs.sh --force` and re-stage (every node must get the new
`cluster-ca.crt` together).