1c9325104c
Add a dedicated UNIBUS_NATS_MONITOR=1 toggle that opens the embedded nats-server monitoring HTTP endpoint (127.0.0.1:8222, loopback only) so a local metrics scraper can read /varz, /connz and /jsz for server-level metrics (msgs/s, connections, KV bucket msgs, RAFT leader per stream, restarts).

Previously the monitoring endpoint was only reachable via UNIBUS_NATS_DEBUG=1, which is coupled to the verbose nats-server debug log: enabling the endpoint also wrote routes/RAFT/room subjects to journald in clear, which regresses the hardened posture (issue 0007). The two concerns are now decoupled.

The toggle computation is extracted to a pure function natsLogOpts(debugEnv, monitorEnv) (noLog, debug, trace, monitor): MONITOR=1 opens the endpoint while keeping the log quiet (NoLog true / Debug false). The inverse coupling is preserved for backward compatibility (DEBUG still implies MONITOR). The 127.0.0.1 bind stays hardcoded: the monitoring endpoint has no auth and must never be reachable from the network.

Deploy wiring versioned: additive systemd drop-in membershipd-cluster.service.d/nats-monitor.conf (Environment=UNIBUS_NATS_MONITOR=1) plus a "NATS server metrics" section in the cluster README with the rolling activation runbook (magnus -> homer -> datardos) gated on R3 reconvergence (followers 2/2) between nodes.

Tests: pure decoupling table (monitor on => log NOT debug; debug => monitor; default closed) + a real embedded server with MONITOR=1 asserting /varz answers 200 on loopback:8222, and a server without the flag with the endpoint closed.

100% additive: behavior is identical without the flag. Bump app.md 0.10.0 -> 0.11.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
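The decoupling described in the commit message can be sketched as a small truth table. The function name and signature are the ones quoted above; the body is a guess reconstructed from the description, not the project's actual code. Two points are assumptions: only the literal value "1" enables a toggle, and trace follows debug.

```go
package main

import "fmt"

// natsLogOpts reuses the name and signature quoted in the commit message.
// The body is an assumption, not the real implementation.
func natsLogOpts(debugEnv, monitorEnv string) (noLog, debug, trace, monitor bool) {
	debugOn := debugEnv == "1"     // assumption: only "1" enables the toggle
	monitorOn := monitorEnv == "1" // assumption: only "1" enables the toggle

	noLog = !debugOn               // the log stays quiet unless DEBUG is set
	debug = debugOn                // verbose logging only with DEBUG
	trace = debugOn                // assumption: trace follows debug
	monitor = monitorOn || debugOn // DEBUG still implies MONITOR
	return noLog, debug, trace, monitor
}

func main() {
	cases := []struct{ debugEnv, monitorEnv string }{
		{"", ""}, {"", "1"}, {"1", ""}, {"1", "1"},
	}
	for _, c := range cases {
		noLog, debug, trace, monitor := natsLogOpts(c.debugEnv, c.monitorEnv)
		fmt.Printf("DEBUG=%q MONITOR=%q -> noLog=%t debug=%t trace=%t monitor=%t\n",
			c.debugEnv, c.monitorEnv, noLog, debug, trace, monitor)
	}
}
```

Keeping the function pure (strings in, booleans out) is what makes the decoupling table testable without starting a server.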
Running membershipd as a systemd user service
membershipd is the unibus control plane (rooms, members, sealed keys, blob
store) and, unless you point it at an external NATS with --nats-url, it also
runs the embedded NATS + JetStream data plane. Running it as a systemd user
service keeps it alive across logout/reboot and restarts it if it crashes.
The unit (unibus-membershipd.service) binds both planes to 0.0.0.0:
| Plane | Port | Reachable from |
|---|---|---|
| HTTP control | 8470 | LAN (http://<host-ip>:8470/healthz) |
| NATS data | 4250 | LAN (nats://<host-ip>:4250) |
Install (idempotent)
cd ~/fn_registry/projects/message_bus/apps/unibus
./deploy/install.sh
This builds the binary, symlinks the unit into ~/.config/systemd/user/,
reloads systemd, and enables + starts the service.
Manual steps (what install.sh does)
cd ~/fn_registry/projects/message_bus/apps/unibus
# 1. Build the pure-Go binary (no CGO).
CGO_ENABLED=0 go build -o membershipd ./cmd/membershipd
# 2. Link the unit into the systemd user directory.
mkdir -p ~/.config/systemd/user
ln -sf "$PWD/deploy/unibus-membershipd.service" ~/.config/systemd/user/unibus-membershipd.service
# 3. Reload, enable (start on login) and start now.
systemctl --user daemon-reload
systemctl --user enable --now unibus-membershipd.service
# (optional) survive logout without an active session:
# sudo loginctl enable-linger "$USER"
Operate
systemctl --user status unibus-membershipd.service # is it active?
systemctl --user restart unibus-membershipd.service # after a rebuild
systemctl --user stop unibus-membershipd.service
systemctl --user disable unibus-membershipd.service # stop starting on login
journalctl --user -u unibus-membershipd.service -f # follow logs
# Health (local and from another LAN host):
curl -fsS http://127.0.0.1:8470/healthz
curl -fsS http://<host-lan-ip>:8470/healthz
Notes
- Writable state (SQLite DB, JetStream store, blobs) lives under `local_files/` relative to `WorkingDirectory`, which the unit sets to the app directory.
- After editing the app code, rebuild (`CGO_ENABLED=0 go build -o membershipd ./cmd/membershipd`) and `systemctl --user restart unibus-membershipd.service`.
- To run against an external NATS instead of the embedded one, append `--nats-url nats://<host>:4222` to `ExecStart` and re-run `daemon-reload` + `restart`.
Clustering (HA) — see deploy/cluster/
The single-node service above is secure on its own. Running unibus as a
multi-node cluster has extra hardening rules (issues 0006a–0006f); the full
runbook and the generated material live in deploy/cluster/. Key points an
operator must know:
- Homogeneous posture (0006d). Every node MUST run `--bus-auth enforce` (the binary refuses to join a cluster otherwise) and present mutual route TLS on a public bind. `/healthz` publishes each node's `posture` so a monitor can flag a node that is not `enforce+acl+tls`.
- Separate route CA (0006f). The cluster route layer authenticates nodes, not bus users: sign the route certs with a dedicated cluster CA (`--route-tls-ca`), NOT the client data-plane CA (`--tls-cert`'s CA). Keeping the two trust roots separate means a client cert can never be presented to the route port. `deploy/cluster/generate-cluster-certs.sh` builds this CA.
- Secret out of argv (0006f). Pass the route password via `--cluster-pass-file` or the `UNIBUS_CLUSTER_PASS` env var, NOT `--cluster-pass` or a `nats://user:pass@host` in `--routes` (both are visible in `ps`/journald). When the secret comes from a file/env, list peers as bare `--routes nats://<host>:6250` and the binary injects the credentials.
- `migrate-to-kv` confidentiality (0006f). The migration writes the allowlist (handles/roles/sign pubs) into KV. Run it only against a loopback nats-url, or pin TLS with `--ca` for a remote target; otherwise that metadata travels in cleartext. The binary refuses a remote target without `--ca`.
- R1 is NOT HA (0006a/N3-DoS). With `--kv-replicas 1` the control plane (including the nonce bucket) is a single point of failure: if the node owning the stream dies, every authenticated request fails closed (auth DoS). Real HA needs R3 (quorum 2/3): raise replicas in place with `nats stream update --replicas 3` once the third node has joined. Do not advertise R1 as HA.
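The R1 versus R3 point is majority arithmetic: a replicated stream stays available only while a majority of its replicas is reachable. The sketch below works through the two cases named above; `quorum` and `tolerated` are hypothetical helper names, not part of unibus or NATS.

```go
package main

import "fmt"

// quorum returns the majority needed among the given number of replicas.
func quorum(replicas int) int {
	return replicas/2 + 1
}

// tolerated returns how many replicas can be lost while a majority survives.
func tolerated(replicas int) int {
	return replicas - quorum(replicas)
}

func main() {
	for _, r := range []int{1, 3} {
		fmt.Printf("R%d: quorum %d/%d, tolerates %d node failure(s)\n",
			r, quorum(r), r, tolerated(r))
	}
	// Prints:
	// R1: quorum 1/1, tolerates 0 node failure(s)
	// R3: quorum 2/3, tolerates 1 node failure(s)
}
```

With R1 the tolerated failure count is zero, which is why losing the single owning node takes authentication down with it.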