# Running membershipd as a systemd user service

`membershipd` is the unibus control plane (rooms, members, sealed keys, blob
store) and, unless you point it at an external NATS with `--nats-url`, it also
runs the embedded NATS + JetStream data plane. Running it as a **systemd user
service** restarts it if it crashes and, with lingering enabled (see the
optional step under "Manual steps"), keeps it running across logout and reboot.

The unit (`unibus-membershipd.service`) binds both planes to `0.0.0.0`:

| Plane        | Port  | Reachable from |
|--------------|-------|----------------|
| HTTP control | 8470  | LAN (`http://<host-ip>:8470/healthz`) |
| NATS data    | 4250  | LAN (`nats://<host-ip>:4250`) |

## Install (idempotent)

```bash
cd ~/fn_registry/projects/message_bus/apps/unibus
./deploy/install.sh
```

This builds the binary, symlinks the unit into `~/.config/systemd/user/`,
reloads systemd, and enables + starts the service.

## Manual steps (what install.sh does)

```bash
cd ~/fn_registry/projects/message_bus/apps/unibus

# 1. Build the pure-Go binary (no CGO).
CGO_ENABLED=0 go build -o membershipd ./cmd/membershipd

# 2. Link the unit into the systemd user directory.
mkdir -p ~/.config/systemd/user
ln -sf "$PWD/deploy/unibus-membershipd.service" ~/.config/systemd/user/unibus-membershipd.service

# 3. Reload, enable (start on login) and start now.
systemctl --user daemon-reload
systemctl --user enable --now unibus-membershipd.service

# (optional) survive logout without an active session:
# sudo loginctl enable-linger "$USER"
```

## Operate

```bash
systemctl --user status  unibus-membershipd.service   # is it active?
systemctl --user restart unibus-membershipd.service   # after a rebuild
systemctl --user stop    unibus-membershipd.service
systemctl --user disable unibus-membershipd.service   # stop starting on login
journalctl --user -u unibus-membershipd.service -f    # follow logs

# Health (local and from another LAN host):
curl -fsS http://127.0.0.1:8470/healthz
curl -fsS http://<host-lan-ip>:8470/healthz
```

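
After a restart the service may need a moment before `/healthz` answers. The helper below is a minimal sketch of polling for that, not something shipped in `deploy/`: `wait_healthy` is a hypothetical name, and the only assumption about the service is that `/healthz` returns a 2xx status once the node is up.

```bash
# wait_healthy URL [TIMEOUT_SECONDS]
# Polls URL about once per second until curl receives a 2xx response or the
# timeout (default 30 s) expires. Returns 0 when healthy, 1 on timeout.
wait_healthy() {
  local url=$1
  local timeout=${2:-30}
  local deadline=$(( $(date +%s) + timeout ))

  while :; do
    # -f turns HTTP errors (>= 400) into a non-zero exit status.
    if curl -fsS --max-time 2 -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep 1
  done
}
```

A typical use would be `systemctl --user restart unibus-membershipd.service && wait_healthy http://127.0.0.1:8470/healthz 30`, so a deploy script fails loudly when the node does not come back.
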
## Notes

- Writable state (SQLite DB, JetStream store, blobs) lives under `local_files/`
  relative to `WorkingDirectory`, which the unit sets to the app directory.
- After editing the app code, rebuild
  (`CGO_ENABLED=0 go build -o membershipd ./cmd/membershipd`) and
  `systemctl --user restart unibus-membershipd.service`.
- To run against an external NATS instead of the embedded one, append
  `--nats-url nats://<host>:4222` to `ExecStart` and re-run `daemon-reload` +
  `restart`.

## Clustering (HA): see `deploy/cluster/`

The single-node service above is secure on its own. Running unibus as a
multi-node **cluster** has extra hardening rules (issues 0006a–0006f); the full
runbook and the generated material live in `deploy/cluster/`. Key points an
operator must know:

- **Homogeneous posture (0006d).** Every node MUST run `--bus-auth enforce` (the
  binary refuses to join a cluster otherwise) and present mutual route TLS on a
  public bind. `/healthz` publishes each node's `posture` so a monitor can flag a
  node that is not `enforce`+`acl`+`tls`.
- **Separate route CA (0006f).** The cluster route layer authenticates *nodes*,
  not bus users, so sign the route certs with a **dedicated cluster CA**
  (`--route-tls-ca`), NOT the client data-plane CA (`--tls-cert`'s CA). Keeping
  the two trust roots separate means a client cert can never be presented to the
  route port. `deploy/cluster/generate-cluster-certs.sh` builds this CA.
- **Secret out of argv (0006f).** Pass the route password via
  `--cluster-pass-file` or the `UNIBUS_CLUSTER_PASS` env var, NOT `--cluster-pass`
  or a `nats://user:pass@host` in `--routes` (both are visible in `ps`/journald).
  When the secret comes from a file/env, list peers as bare
  `--routes nats://<host>:6250` and the binary injects the credentials.
- **`migrate-to-kv` confidentiality (0006f).** The migration writes the allowlist
  (handles/roles/sign pubs) into KV. Run it only against a **loopback** nats-url,
  or pin TLS with `--ca` for a remote target; otherwise that metadata travels in
  cleartext. The binary refuses a remote target without `--ca`.
- **R1 is NOT HA (0006a/N3-DoS).** With `--kv-replicas 1` the control plane
  (including the nonce bucket) is a single point of failure: if the node owning
  the stream dies, every authenticated request fails closed (auth DoS). Real HA
  needs **R3** (quorum 2/3): raise replicas in place with
  `nats stream update --replicas 3` once the third node has joined. Do not
  advertise R1 as HA.
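
The quorum figures in the last point follow the usual majority rule for Raft-style replication, which is assumed here to be what `--kv-replicas` maps onto: with N replicas, floor(N/2) + 1 of them must be alive. The sketch below only does that arithmetic; the function names are illustrative and not part of unibus.

```bash
# quorum N: smallest majority of N replicas, i.e. floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: how many replicas can be lost while a majority survives.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 5; do
  echo "R$n: quorum $(quorum "$n")/$n, tolerates $(tolerated_failures "$n") node failure(s)"
done
# R1: quorum 1/1, tolerates 0 node failure(s)
# R3: quorum 2/3, tolerates 1 node failure(s)
# R5: quorum 3/5, tolerates 2 node failure(s)
```

R1 tolerating zero failures is the reason it must not be described as HA.
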
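
For the "secret out of argv" rule, one way to prepare the file consumed by `--cluster-pass-file` is sketched below. This is an assumption-laden example, not the project's tooling: `write_cluster_secret` and the path in the usage comment are hypothetical, and because it is not stated whether the binary trims a trailing newline, the secret is written without one.

```bash
# write_cluster_secret FILE
# Reads the route password from stdin and stores it in FILE, readable and
# writable only by the owning user. The secret is never placed on a command line.
write_cluster_secret() {
  local file=$1
  (
    umask 077                       # files created here get mode 0600
    mkdir -p "$(dirname "$file")"   # directories created here get mode 0700
    cat > "$file"
  )
  chmod 600 "$file"                 # also tighten a pre-existing file
}

# Usage (hypothetical path): generate 32 random bytes as hex, no trailing newline.
#   head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n' \
#     | write_cluster_secret "$HOME/.config/unibus/cluster.pass"
```

Each node would then be started with `--cluster-pass-file` pointing at that file and bare `--routes nats://<host>:6250` entries, so neither `ps` nor journald shows the password.
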