48a3d6be33
Parameterized, NO-VPS-touched material to bring up unibus as a 3-node cluster. The authoring agent ran none of it on a host; every remote-changing step is marked HUMAN and deploy-cluster.sh defaults to a dry run.

deploy/cluster/:
- nodes.env: topology (cluster name, ports, per-node rows). Known public IPs (homer 141.94.69.66, datardos 51.91.100.142) are pre-filled; the magnus public IP and all WireGuard IPs are <PLACEHOLDER> for the human; scripts refuse to run while any remain.
- generate-cluster-certs.sh: mints a SEPARATE cluster route CA + a route cert per node (server+clientAuth, mutual routes) and a data-plane server cert per node signed by the reused client CA (../tls/ca.*); SAN = public + WG + hostname.
- membershipd-cluster.service: one unit, parameterized per node via /opt/unibus/cluster.env: enforce + per-subject ACL + TLS + --store kv, --cluster-pass-file (secret out of argv), Restart=always.
- deploy-cluster.sh: cross-build linux/amd64, generate each node's cluster.env (routes to the other two on the WG mesh, no userinfo), rsync + install (only with --yes); staggered start is manual.
- README.md: runbook covering prerequisites, loopback bootstrap to seed the first admin into the KV (works around the user-CLI/KV chicken-and-egg), staggered bring-up, verify posture+quorum, scale R1->R3 in place, and the chaos test (left to 0003f on the real VPS).
- .gitignore: out/, build/, secrets/, *.key never committed.

bash -n passes on both scripts; go build/test unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Running membershipd as a systemd user service
membershipd is the unibus control plane (rooms, members, sealed keys, blob
store) and, unless you point it at an external NATS with --nats-url, it also
runs the embedded NATS + JetStream data plane. Running it as a systemd user
service keeps it alive across logout/reboot and restarts it if it crashes.
The unit (unibus-membershipd.service) binds both planes to 0.0.0.0:
| Plane | Port | Reachable from |
|---|---|---|
| HTTP control | 8470 | LAN (http://<host-ip>:8470/healthz) |
| NATS data | 4250 | LAN (nats://<host-ip>:4250) |
Install (idempotent)
cd ~/fn_registry/projects/message_bus/apps/unibus
./deploy/install.sh
This builds the binary, symlinks the unit into ~/.config/systemd/user/,
reloads systemd, and enables + starts the service.
Manual steps (what install.sh does)
cd ~/fn_registry/projects/message_bus/apps/unibus
# 1. Build the pure-Go binary (no CGO).
CGO_ENABLED=0 go build -o membershipd ./cmd/membershipd
# 2. Link the unit into the systemd user directory.
mkdir -p ~/.config/systemd/user
ln -sf "$PWD/deploy/unibus-membershipd.service" ~/.config/systemd/user/unibus-membershipd.service
# 3. Reload, enable (start on login) and start now.
systemctl --user daemon-reload
systemctl --user enable --now unibus-membershipd.service
# (optional) survive logout without an active session:
# sudo loginctl enable-linger "$USER"
Operate
systemctl --user status unibus-membershipd.service # is it active?
systemctl --user restart unibus-membershipd.service # after a rebuild
systemctl --user stop unibus-membershipd.service
systemctl --user disable unibus-membershipd.service # stop starting on login
journalctl --user -u unibus-membershipd.service -f # follow logs
# Health (local and from another LAN host):
curl -fsS http://127.0.0.1:8470/healthz
curl -fsS http://<host-lan-ip>:8470/healthz
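Right after a restart the health endpoint may not answer for a moment, so a single `curl` can fail spuriously. The sketch below shows one way to poll until the probe succeeds. `retry` is a hypothetical helper written for this example, not something shipped with unibus; the attempt count and delay are arbitrary.

```shell
#!/usr/bin/env bash
# retry TRIES DELAY CMD...
# Run CMD until it exits 0 or TRIES attempts have been used.
# Hypothetical helper for illustration; not part of unibus.
retry() {
  local tries="$1" delay="$2" i
  shift 2
  for (( i = 1; i <= tries; i++ )); do
    "$@" && return 0                    # success: stop retrying
    (( i < tries )) && sleep "$delay"   # wait only between attempts
  done
  return 1                              # every attempt failed
}

# Example invocation (commented out so the file can be sourced safely):
#   systemctl --user restart unibus-membershipd.service
#   retry 10 1 curl -fsS http://127.0.0.1:8470/healthz
```

Keeping the probe as an argument means the same helper works for the LAN address too; swap in `http://<host-lan-ip>:8470/healthz`.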
Notes
- Writable state (SQLite DB, JetStream store, blobs) lives under `local_files/` relative to `WorkingDirectory`, which the unit sets to the app directory.
- After editing the app code, rebuild (`CGO_ENABLED=0 go build -o membershipd ./cmd/membershipd`) and `systemctl --user restart unibus-membershipd.service`.
- To run against an external NATS instead of the embedded one, append `--nats-url nats://<host>:4222` to `ExecStart` and re-run `daemon-reload` + `restart`.
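The `ExecStart` edit from the last note can be scripted. This is a minimal sketch under two assumptions that the text above does not confirm: the unit's `ExecStart=` sits on a single line (no backslash continuation), and you edit the real file in the repo rather than the symlink. `add_nats_url` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# add_nats_url UNIT_FILE URL
# Append "--nats-url URL" to the ExecStart= line of UNIT_FILE, once.
# Assumption: ExecStart= is a single line. Hypothetical helper.
add_nats_url() {
  local unit="$1" url="$2"
  # Idempotent: leave the file alone if the flag is already there.
  if grep -q -- '--nats-url' "$unit"; then
    return 0
  fi
  # Rewrite through a temp file so a failed sed never truncates the unit.
  sed "s|^\(ExecStart=.*\)\$|\1 --nats-url ${url}|" "$unit" > "$unit.tmp" \
    && mv "$unit.tmp" "$unit"
}

# Example invocation (commented out so the file can be sourced safely):
#   add_nats_url deploy/unibus-membershipd.service 'nats://<host>:4222'
#   systemctl --user daemon-reload
#   systemctl --user restart unibus-membershipd.service
```

Pointing the helper at `deploy/unibus-membershipd.service` keeps the symlink in `~/.config/systemd/user/` intact, because `mv` replaces whatever path it is given.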
Clustering (HA) — see deploy/cluster/
The single-node service above is secure on its own. Running unibus as a
multi-node cluster has extra hardening rules (issues 0006a–0006f); the full
runbook and the generated material live in deploy/cluster/. Key points an
operator must know:
- Homogeneous posture (0006d). Every node MUST run `--bus-auth enforce` (the binary refuses to join a cluster otherwise) and present mutual route TLS on a public bind. `/healthz` publishes each node's `posture` so a monitor can flag a node that is not `enforce+acl+tls`.
- Separate route CA (0006f). The cluster route layer authenticates nodes, not bus users: sign the route certs with a dedicated cluster CA (`--route-tls-ca`), NOT the client data-plane CA (`--tls-cert`'s CA). Keeping the two trust roots separate means a client cert can never be presented to the route port. `deploy/cluster/generate-cluster-certs.sh` builds this CA.
- Secret out of argv (0006f). Pass the route password via `--cluster-pass-file` or the `UNIBUS_CLUSTER_PASS` env var, NOT `--cluster-pass` or a `nats://user:pass@host` in `--routes` (both are visible in `ps`/journald). When the secret comes from a file/env, list peers as bare `--routes nats://<host>:6250` and the binary injects the credentials.
- `migrate-to-kv` confidentiality (0006f). The migration writes the allowlist (handles/roles/sign pubs) into KV. Run it only against a loopback `nats-url`, or pin TLS with `--ca` for a remote target; otherwise that metadata travels in cleartext. The binary refuses a remote target without `--ca`.
- R1 is NOT HA (0006a/N3-DoS). With `--kv-replicas 1` the control plane (including the nonce bucket) is a single point of failure: if the node owning the stream dies, every authenticated request fails closed (auth DoS). Real HA needs R3 (quorum 2/3): raise replicas in place with `nats stream update --replicas 3` once the third node has joined. Do not advertise R1 as HA.
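The R1 versus R3 point comes down to majority arithmetic. As a small worked sketch, assuming the replicated stream needs a strict majority of its replicas to stay available (consistent with the "quorum 2/3" figure above), the helper functions `quorum` and `tolerated` below are illustrative names only:

```shell
#!/usr/bin/env bash
# quorum N: smallest strict majority of N replicas (integer division).
quorum() { echo $(( $1 / 2 + 1 )); }

# tolerated N: replicas that can be lost while a majority still survives.
tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

for r in 1 3; do
  echo "R$r: quorum $(quorum "$r")/$r, tolerates $(tolerated "$r") node failure(s)"
done
# prints:
# R1: quorum 1/1, tolerates 0 node failure(s)
# R3: quorum 2/3, tolerates 1 node failure(s)
```

With one replica the quorum is the single node itself, so losing it takes the control plane down; with three, any one node can fail and the remaining two still form a majority.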