egutierrez 48a3d6be33 docs(0006g): cluster deploy material for magnus+homer+datardos (R3 HA)

Parameterized, NO-VPS-touched material to bring up unibus as a 3-node cluster.
The authoring agent ran none of it on a host; every remote-changing step is
marked HUMAN and deploy-cluster.sh defaults to a dry run.

deploy/cluster/:
- nodes.env — topology (cluster name, ports, per-node rows). Public IPs known
  (homer 141.94.69.66, datardos 51.91.100.142) pre-filled; magnus public IP and
  all WireGuard IPs are <PLACEHOLDER> for the human; scripts refuse to run while
  any remain.
- generate-cluster-certs.sh — mints a SEPARATE cluster route CA + a route cert per
  node (server+clientAuth, mutual routes) and a data-plane server cert per node
  signed by the reused client CA (../tls/ca.*); SAN = public + WG + hostname.
- membershipd-cluster.service — one unit, parameterized per node via
  /opt/unibus/cluster.env: enforce + per-subject ACL + TLS + --store kv,
  --cluster-pass-file (secret out of argv), Restart=always.
- deploy-cluster.sh — cross-build linux/amd64, generate each node's cluster.env
  (routes to the other two on the WG mesh, no userinfo), rsync + install (only
  with --yes); staggered start is manual.
- README.md — runbook: prerequisites, loopback bootstrap to seed the first admin
  into the KV (works around the user-CLI/KV chicken-and-egg), staggered bring-up,
  verify posture+quorum, scale R1->R3 in place, and the chaos test (left to 0003f
  on the real VPS).
- .gitignore — out/, build/, secrets/, *.key never committed.

bash -n passes on both scripts; go build/test unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-07 17:31:13 +02:00

Name	Last commit	Updated
cluster	docs(0006g): cluster deploy material for magnus+homer+datardos (R3 HA)	2026-06-07 17:31:13 +02:00
tls	feat(deploy/tls): self-signed CA + server cert generator	2026-06-07 12:44:13 +02:00
install.sh	feat(deploy): systemd user unit + install script for membershipd	2026-06-06 18:05:53 +02:00
README.md	fix(0006f): cluster secret out of argv, migrate-to-kv TLS guard, R1/CA docs (audit 0008 lows)	2026-06-07 17:24:46 +02:00
unibus-membershipd.service	feat(deploy): systemd user unit + install script for membershipd	2026-06-06 18:05:53 +02:00

README.md

Running membershipd as a systemd user service

membershipd is the unibus control plane (rooms, members, sealed keys, blob store) and, unless you point it at an external NATS with --nats-url, it also runs the embedded NATS + JetStream data plane. Running it as a systemd user service restarts it if it crashes and, once lingering is enabled for the user (the optional loginctl enable-linger step under Manual steps), keeps it alive across logout and reboot.

The unit (unibus-membershipd.service) binds both planes to 0.0.0.0:

Plane	Port	Reachable from
HTTP control	8470	LAN (`http://<host-ip>:8470/healthz`)
NATS data	4250	LAN (`nats://<host-ip>:4250`)

Install (idempotent)

cd ~/fn_registry/projects/message_bus/apps/unibus
./deploy/install.sh

This builds the binary, symlinks the unit into ~/.config/systemd/user/, reloads systemd, and enables + starts the service.

Manual steps (what install.sh does)

cd ~/fn_registry/projects/message_bus/apps/unibus

# 1. Build the pure-Go binary (no CGO).
CGO_ENABLED=0 go build -o membershipd ./cmd/membershipd

# 2. Link the unit into the systemd user directory.
mkdir -p ~/.config/systemd/user
ln -sf "$PWD/deploy/unibus-membershipd.service" ~/.config/systemd/user/unibus-membershipd.service

# 3. Reload, enable (start on login) and start now.
systemctl --user daemon-reload
systemctl --user enable --now unibus-membershipd.service

# (optional) survive logout without an active session:
#   sudo loginctl enable-linger "$USER"
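
Whether the optional linger step has already been applied can be checked before relying on the service surviving a logout. The following is a minimal sketch, assuming a systemd host where loginctl is available; the helper name has_linger is made up for this example and is not part of install.sh.

```shell
# has_linger USER: succeed if systemd lingering is enabled for USER.
# (hypothetical helper, not part of install.sh)
has_linger() {
  # loginctl prints "Linger=yes" or "Linger=no" for the requested property.
  [ "$(loginctl show-user "$1" --property=Linger 2>/dev/null)" = "Linger=yes" ]
}

user="${USER:-$(id -un)}"
if has_linger "$user"; then
  echo "linger enabled: the user service starts at boot without a login"
else
  echo "linger disabled: run 'sudo loginctl enable-linger $user' to enable it"
fi
```

If loginctl is missing or the user is unknown, the helper reports lingering as disabled rather than failing.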

Operate

systemctl --user status  unibus-membershipd.service     # is it active?
systemctl --user restart unibus-membershipd.service     # after a rebuild
systemctl --user stop    unibus-membershipd.service
systemctl --user disable unibus-membershipd.service     # stop starting on login
journalctl --user -u unibus-membershipd.service -f      # follow logs

# Health (local and from another LAN host):
curl -fsS http://127.0.0.1:8470/healthz
curl -fsS http://<host-lan-ip>:8470/healthz
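
Right after a restart the health endpoint may take a moment to answer, so a single curl can fail spuriously. A small polling wrapper around the same curl call is one way to handle that. This is only a sketch: the function name wait_healthy, the attempt count, and the delay are illustrative choices, not something the repository ships.

```shell
# wait_healthy URL [ATTEMPTS] [DELAY_SECONDS]
# Poll URL until curl succeeds or the attempts run out.
# (hypothetical helper; the defaults are arbitrary)
wait_healthy() {
  local url="$1" attempts="${2:-10}" delay="${3:-1}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f turns HTTP errors into a non-zero exit; --max-time bounds each probe.
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # Only pause when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}
```

A typical call after a rebuild would be `systemctl --user restart unibus-membershipd.service && wait_healthy http://127.0.0.1:8470/healthz 15 1`.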

Notes

Writable state (SQLite DB, JetStream store, blobs) lives under local_files/ relative to WorkingDirectory, which the unit sets to the app directory.
After editing the app code, rebuild (CGO_ENABLED=0 go build -o membershipd ./cmd/membershipd) and then run systemctl --user restart unibus-membershipd.service.
To run against an external NATS instead of the embedded one, append --nats-url nats://<host>:4222 to ExecStart and re-run daemon-reload + restart.

Clustering (HA) — see `deploy/cluster/`

The single-node service above is secure on its own. Running unibus as a multi-node cluster has extra hardening rules (issues 0006a–0006f); the full runbook and the generated material live in deploy/cluster/. Key points an operator must know:

Homogeneous posture (0006d). Every node MUST run --bus-auth enforce (the binary refuses to join a cluster otherwise) and present mutual route TLS on a public bind. /healthz publishes each node's posture so a monitor can flag a node that is not enforce+acl+tls.
Separate route CA (0006f). The cluster route layer authenticates nodes, not bus users — sign the route certs with a dedicated cluster CA (--route-tls-ca), NOT the client data-plane CA (--tls-cert's CA). Keeping the two trust roots separate means a client cert can never be presented to the route port. deploy/cluster/generate-cluster-certs.sh builds this CA.
Secret out of argv (0006f). Pass the route password via --cluster-pass-file or the UNIBUS_CLUSTER_PASS env var, NOT --cluster-pass or a nats://user:pass@host in --routes (both are visible in ps/journald). When the secret comes from a file/env, list peers as bare --routes nats://<host>:6250 and the binary injects the credentials.
migrate-to-kv confidentiality (0006f). The migration writes the allowlist (handles/roles/sign pubs) into KV. Run it only against a loopback nats-url, or pin TLS with --ca for a remote target — otherwise that metadata travels in cleartext. The binary refuses a remote target without --ca.
R1 is NOT HA (0006a/N3-DoS). With --kv-replicas 1 the control plane (including the nonce bucket) is a single point of failure: if the node owning the stream dies, every authenticated request fails closed (auth DoS). Real HA needs R3 (quorum 2/3): raise replicas in place with nats stream update --replicas 3 once the third node has joined. Do not advertise R1 as HA.
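
The R1 versus R3 point comes down to majority arithmetic: a replicated stream stays available only while a majority of its replicas is reachable. The sketch below works that out for both replica counts; the function names are illustrative only.

```shell
# quorum N: smallest majority of N replicas, i.e. floor(N/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# tolerated N: replicas that can be lost while a majority still remains.
tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

for r in 1 3; do
  echo "R$r: quorum $(quorum "$r")/$r, tolerates $(tolerated "$r") node failure(s)"
done
# prints:
# R1: quorum 1/1, tolerates 0 node failure(s)
# R3: quorum 2/3, tolerates 1 node failure(s)
```

With R1 the loss of the single replica leaves no majority at all, which is why authenticated requests fail closed; with R3 any one of the three nodes can be down.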
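
For the secret-out-of-argv rule, the two pieces an operator prepares per node are an owner-only password file and a list of bare peer routes. The sketch below illustrates both under stated assumptions: the helper names are made up, the routes are joined with commas in the usual NATS style (not confirmed for this binary), and hostnames stand in for whatever addresses the mesh actually uses. deploy-cluster.sh generates the real values.

```shell
# write_secret FILE: store stdin in FILE, readable by the owner only.
# (hypothetical helper; the real file path is up to the operator)
write_secret() {
  # The subshell keeps the tightened umask from leaking into the caller;
  # chmod also covers a file that already existed with looser permissions.
  ( umask 077 && cat > "$1" && chmod 600 "$1" )
}

# peer_routes SELF PORT HOST...: print bare routes to every host except SELF.
# Assumption: routes are comma-separated and carry no user:pass@ part.
peer_routes() {
  local self="$1" port="$2" host routes=""
  shift 2
  for host in "$@"; do
    if [ "$host" != "$self" ]; then
      routes="${routes:+$routes,}nats://$host:$port"
    fi
  done
  printf '%s\n' "$routes"
}

for node in magnus homer datardos; do
  echo "$node: --routes $(peer_routes "$node" 6250 magnus homer datardos)"
done
```

A password would then be stored with something like `printf '%s' "$pass" | write_secret /path/to/cluster.pass` and handed to the binary through --cluster-pass-file, so neither the secret nor a credentialed URL shows up in ps or journald.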
