fix(0006f): cluster secret out of argv, migrate-to-kv TLS guard, R1/CA docs (audit 0008 lows)
Low-severity cluster hardening from audit 0008:

- Route secret out of argv (N1-low): --cluster-pass and a nats://user:pass@host
  in --routes are visible in ps/journald. New --cluster-pass-file and the
  UNIBUS_CLUSTER_PASS env var (precedence file > env > flag); the resolved
  secret guards the route layer and is injected into bare --routes entries
  (injectRouteCreds), so peers can be listed as nats://host:6250 with no
  secret in argv. The legacy --cluster-pass stays for dev/compat.
- migrate-to-kv confidentiality (N6): refuse a remote --nats-url without --ca
  (the allowlist would travel cleartext); loopback targets are exempt
  (isLoopbackURL).
- Docs (N1 route CA, N3 DoS): deploy/README gains a Clustering section: use a
  SEPARATE cluster CA for routes (not the client CA), keep the secret out of
  argv, run migrate-to-kv loopback/TLS only, and R1 is a SPOF of auth (not HA);
  R3 quorum is real HA. The generated cert material lives in deploy/cluster/
  (0006g).

Tests:

- TestResolveClusterPass (file > env > flag precedence; missing file errors),
- TestInjectRouteCreds (injects only into userinfo-less routes; preserves
  overrides),
- TestIsLoopbackURL (loopback vs remote vs malformed).

CGO_ENABLED=0 go build/vet/test green; govulncheck 0 reachable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -65,3 +65,34 @@ curl -fsS http://<host-lan-ip>:8470/healthz
- To run against an external NATS instead of the embedded one, append
  `--nats-url nats://<host>:4222` to `ExecStart` and re-run `daemon-reload` +
  `restart`.

## Clustering (HA): see `deploy/cluster/`
The single-node service above is secure on its own. Running unibus as a
multi-node **cluster** has extra hardening rules (issues 0006a–0006f); the full
runbook and the generated material live in `deploy/cluster/`. Key points an
operator must know:

- **Homogeneous posture (0006d).** Every node MUST run `--bus-auth enforce` (the
  binary refuses to join a cluster otherwise) and present mutual route TLS on a
  public bind. `/healthz` publishes each node's `posture` so a monitor can flag a
  node that is not `enforce`+`acl`+`tls`.
- **Separate route CA (0006f).** The cluster route layer authenticates *nodes*,
  not bus users: sign the route certs with a **dedicated cluster CA**
  (`--route-tls-ca`), NOT the client data-plane CA (`--tls-cert`'s CA). Keeping
  the two trust roots separate means a client cert can never be presented to the
  route port. `deploy/cluster/generate-cluster-certs.sh` builds this CA.
- **Secret out of argv (0006f).** Pass the route password via
  `--cluster-pass-file` or the `UNIBUS_CLUSTER_PASS` env var, NOT `--cluster-pass`
  or a `nats://user:pass@host` in `--routes` (both are visible in `ps`/journald).
  When the secret comes from a file/env, list peers as bare
  `--routes nats://<host>:6250` and the binary injects the credentials.
- **`migrate-to-kv` confidentiality (0006f).** The migration writes the allowlist
  (handles/roles/sign pubs) into KV. Run it only against a **loopback** nats-url,
  or pin TLS with `--ca` for a remote target; otherwise that metadata travels in
  cleartext. The binary refuses a remote target without `--ca`.
- **R1 is NOT HA (0006a/N3-DoS).** With `--kv-replicas 1` the control plane
  (including the nonce bucket) is a single point of failure: if the node owning
  the stream dies, every authenticated request fails closed (auth DoS). Real HA
  needs **R3** (quorum 2/3): raise replicas in place with
  `nats stream update --replicas 3` once the third node has joined. Do not
  advertise R1 as HA.
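
The "Secret out of argv" rule above relies on a fixed precedence (file, then env var, then the legacy flag). The following is a minimal sketch of that resolution order, not the binary's actual code: the function signature and the whitespace trimming of the file contents are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveClusterPass is a hypothetical sketch of the documented precedence:
// file > env > flag. A configured but unreadable file is an error rather than
// a silent fallback, so a typo in the path cannot downgrade the secret source.
func resolveClusterPass(file, env, flagVal string) (string, error) {
	if file != "" {
		b, err := os.ReadFile(file)
		if err != nil {
			return "", fmt.Errorf("read cluster pass file: %w", err)
		}
		// Assumption: a trailing newline in the file is not part of the secret.
		return strings.TrimSpace(string(b)), nil
	}
	if env != "" {
		return env, nil
	}
	return flagVal, nil
}

func main() {
	// In a real binary the first and third arguments would come from the
	// --cluster-pass-file and --cluster-pass flags.
	pass, err := resolveClusterPass("", os.Getenv("UNIBUS_CLUSTER_PASS"), "dev-only")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Never print the secret itself; only report that one was resolved.
	fmt.Println("cluster secret configured:", pass != "")
}
```

Failing hard on a missing file is a deliberate choice in this sketch: falling through to the env var or flag would hide a misconfiguration.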
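
Credential injection into bare `--routes` entries, as described in the "Secret out of argv" bullet, could look roughly like the sketch below. It only assumes Go's standard `net/url` package; the function signature and the route user name `cluster` are illustrative assumptions, not taken from the unibus source.

```go
package main

import (
	"fmt"
	"net/url"
)

// injectRouteCreds is a hypothetical sketch: it adds user:pass to every route
// URL that carries no userinfo and leaves routes with explicit credentials
// (per-route overrides) untouched.
func injectRouteCreds(routes []string, user, pass string) ([]string, error) {
	out := make([]string, 0, len(routes))
	for _, r := range routes {
		u, err := url.Parse(r)
		if err != nil {
			return nil, fmt.Errorf("parse route %q: %w", r, err)
		}
		if u.User == nil && pass != "" {
			u.User = url.UserPassword(user, pass)
		}
		out = append(out, u.String())
	}
	return out, nil
}

func main() {
	routes, err := injectRouteCreds(
		[]string{"nats://node-b:6250", "nats://other:pw@node-c:6250"},
		"cluster", "s3cret",
	)
	if err != nil {
		panic(err)
	}
	for _, r := range routes {
		u, err := url.Parse(r)
		if err != nil {
			panic(err)
		}
		// Redacted masks the password so the secret never reaches logs.
		fmt.Println(u.Redacted())
	}
}
```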
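
The `migrate-to-kv` guard above (refuse a remote target without `--ca`, exempt loopback) can be approximated as follows. This is a sketch under stated assumptions: `checkMigrateTarget` is a hypothetical helper name, and treating the literal host `localhost` as loopback is an assumption rather than confirmed behavior.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

// isLoopbackURL reports whether the URL's host is a loopback address.
// Malformed URLs are treated as non-loopback so the guard fails closed.
func isLoopbackURL(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	host := u.Hostname()
	if host == "localhost" { // assumption: the name is accepted without resolving it
		return true
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

// checkMigrateTarget (hypothetical) rejects a remote NATS URL when no CA file
// is given, because the allowlist would otherwise travel in cleartext.
func checkMigrateTarget(natsURL, caFile string) error {
	if caFile == "" && !isLoopbackURL(natsURL) {
		return errors.New("migrate-to-kv: remote --nats-url requires --ca")
	}
	return nil
}

func main() {
	targets := []string{"nats://127.0.0.1:4222", "nats://[::1]:4222", "nats://10.0.0.5:4222"}
	for _, target := range targets {
		fmt.Println(target, "without --ca:", checkMigrateTarget(target, ""))
	}
}
```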