unibus/cmd/membershipd/wiring.go
egutierrez 8b6a01d280 fix(0006a): wire replicated nonce store on clustered nodes (audit 0008 N3)
membershipd never called Server.UseReplicatedNonces, so every node kept a
per-process anti-replay cache and a signed request accepted on node A could be
replayed to node B (200+200). This wires the shared JetStream KV nonce bucket on
any clustered node, closing the cross-node replay hole.
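The replay hole described above can be reproduced in miniature. This is a minimal sketch, not the unibus implementation: `nonceStore`, `newNonceStore`, and `replayAcrossNodes` are hypothetical names, and an in-memory map stands in for both the per-process cache and the shared JetStream KV bucket.

```go
package main

import (
	"fmt"
	"sync"
)

// nonceStore is a hypothetical stand-in for an anti-replay cache.
// It is NOT the unibus or JetStream API.
type nonceStore struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newNonceStore() *nonceStore {
	return &nonceStore{seen: make(map[string]struct{})}
}

// Accept reports true the first time a nonce is seen and false on any repeat.
func (s *nonceStore) Accept(nonce string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, dup := s.seen[nonce]; dup {
		return false
	}
	s.seen[nonce] = struct{}{}
	return true
}

// replayAcrossNodes presents the same nonce to node A's store and then to
// node B's store, and reports whether each one accepted it.
func replayAcrossNodes(a, b *nonceStore, nonce string) (bool, bool) {
	return a.Accept(nonce), b.Accept(nonce)
}

func main() {
	// Per-process caches: node B never saw the nonce, so the replay succeeds.
	okA, okB := replayAcrossNodes(newNonceStore(), newNonceStore(), "n-1")
	fmt.Println("per-process:", okA, okB) // per-process: true true

	// Shared store: both nodes consult the same state, so the replay fails.
	shared := newNonceStore()
	okA, okB = replayAcrossNodes(shared, shared, "n-1")
	fmt.Println("shared:", okA, okB) // shared: true false
}
```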

Bootstrap: under enforce the service needs JetStream on its own embedded server,
but the data plane only accepts allowlisted clients. Resolved with an ephemeral
internal service identity the authenticator recognizes and grants full
permissions (NewNkeyAuthenticatorACLInternal), connected over the in-process
transport (no TLS/CA needed for the self-connection).

Hard rule: --cluster-name != "" means the replicated nonce bucket is mandatory;
if it cannot be created the node refuses to start (wireReplicatedNonces returns a
fatal error) rather than run insecurely. Standalone nodes keep the in-memory
cache unchanged (branch-by-abstraction: no JetStream dependency added).

Changes:
- busauth: NewNkeyAuthenticatorACLInternal + fullPermissions for the internal id.
- cmd/membershipd: connectInternalJS (in-process, privileged) / connectExternalJS;
  wireReplicatedNonces helper; main wires it when clustered; --kv-replicas flag.

Tests (regression of audit 0008 N3):
- TestAttack0008_N3: 2 clustered nodes share the bucket, cross-node replay -> 401.
- TestAttack0008_N3_StandaloneKeepsLocalCache: standalone needs no JetStream,
  same-node replay still 401.
- TestAttack0008_N3_ClusteredRequiresJetStream: clustered + no JetStream -> fatal.
- TestInternalConnPrivilegedUnderEnforce / ...OutsiderRejected: the privileged
  self-connection works under enforce and no other identity can claim it.

CGO_ENABLED=0 go build/vet/test green; govulncheck 0 reachable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 17:02:19 +02:00


package main

import (
	"fmt"

	"github.com/enmanuel/unibus/pkg/membership"
	"github.com/nats-io/nats.go/jetstream"
)

// wireReplicatedNonces applies the cluster anti-replay policy to srv. It is the
// single piece of wiring the binary uses to decide whether a node must share its
// nonce store, extracted so a regression test exercises the EXACT decision the
// running binary makes (issue 0006a, audit 0008 N3).
//
// Policy:
//   - A clustered node (clustered == true) MUST use the shared JetStream KV nonce
//     bucket. Every node sees the same bucket, so a request accepted on one node
//     cannot be replayed to another whose per-process cache never saw the nonce.
//     A missing JetStream context, or a failure to create the bucket, is a FATAL
//     configuration error returned to the caller: a clustered node running with a
//     per-process nonce cache is precisely the replay hole the audit flagged, so
//     it must refuse to start rather than serve insecurely.
//   - A standalone node (clustered == false) keeps the in-memory cache that
//     NewServer installed: there is no second node to replay to, so the shared
//     bucket would only add a JetStream dependency for no security gain.
//
// replicas is the nonce bucket's replication factor (R1..R3). Returns nil when no
// action is required (standalone).
func wireReplicatedNonces(srv *membership.Server, js jetstream.JetStream, clustered bool, replicas int) error {
	if !clustered {
		return nil // standalone: the in-memory nonce cache is sufficient and safe
	}
	if js == nil {
		return fmt.Errorf("clustered node requires JetStream for the shared nonce bucket, but none is available")
	}
	if err := srv.UseReplicatedNonces(js, replicas); err != nil {
		return fmt.Errorf("replicated nonces: %w", err)
	}
	return nil
}