feat(0003e/2): replicated anti-replay nonce store on JetStream KV

The per-process nonce cache breaks anti-replay under multi-node failover
(audit 0004): a request captured on one node can be replayed to a
DIFFERENT node whose local cache never saw the nonce, and that node
accepts it. This commit shares the nonce state across nodes so a replay
is rejected cluster-wide.

pkg/membership:
- nonceStore is now an interface. The in-memory cache is renamed
  memNonceCache (still the default, single-node behavior).
- kvNonceStore (new) claims each nonce with an atomic KV Create on a
  shared bucket: first sight wins (accept), any later sight on any node
  rejects (replay). A backend error fails CLOSED (reject), so a KV outage
  never silently disables anti-replay. The bucket carries a TTL =
  nonceTTL (2*clockSkew) so a key expires exactly when its replay window
  closes; raw base64 nonces are mapped to KV-safe keys via sha256-hex.
- Server.UseReplicatedNonces(js, replicas) swaps the store on a node;
  every node in a cluster calls it. NewServer still defaults to the
  in-memory cache (master behavior unchanged).
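
A minimal sketch of the claim semantics described above, not the code in
this commit. It assumes a hypothetical one-method `creator` interface in
place of the JetStream KV bucket, and an in-memory `fakeKV` that stands in
for the shared bucket (the TTL is not modelled). The names `creator`,
`fakeKV`, `errKeyExists`, `nonceKey` and `claim` are illustrative only.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// errKeyExists is a hypothetical stand-in for the backend's
// "key already exists" error.
var errKeyExists = errors.New("key exists")

// creator is a hypothetical minimal view of a KV bucket: Create is
// assumed to be atomic and to fail when the key is already present.
type creator interface {
	Create(key string, value []byte) error
}

// fakeKV is an in-memory stand-in for the shared bucket.
type fakeKV struct {
	mu   sync.Mutex
	keys map[string]struct{}
	down bool // simulates a backend outage
}

func (f *fakeKV) Create(key string, _ []byte) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.down {
		return errors.New("backend unavailable")
	}
	if _, ok := f.keys[key]; ok {
		return errKeyExists
	}
	f.keys[key] = struct{}{}
	return nil
}

// nonceKey maps a raw base64 nonce (which may contain '+', '/', '=')
// to a KV-safe key: the hex encoding of its SHA-256 digest.
func nonceKey(nonce string) string {
	sum := sha256.Sum256([]byte(nonce))
	return hex.EncodeToString(sum[:])
}

// claim accepts a nonce only if Create succeeds. A duplicate key and a
// backend failure both reject, so an outage fails closed.
func claim(kv creator, nonce string) bool {
	return kv.Create(nonceKey(nonce), nil) == nil
}

func main() {
	shared := &fakeKV{keys: map[string]struct{}{}}
	var nodeA, nodeB creator = shared, shared // two nodes, one bucket

	fmt.Println(claim(nodeA, "abc+/=")) // true: first sight is accepted
	fmt.Println(claim(nodeB, "abc+/=")) // false: replay on another node
	shared.down = true
	fmt.Println(claim(nodeA, "fresh")) // false: backend error fails closed
}
```

Treating every non-nil error as a rejection keeps the accept path to a
single condition, which is what makes the fail-closed behavior hard to
break by accident.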

Test (DoD error path — the issue's cross-node replay case):
- TestReplicatedNonceRejectsCrossNodeReplay: two membershipd nodes share
  one KV bucket; a request accepted (200) on node A, replayed with the
  same ts+nonce to node B, is rejected (401) — and replaying to A again
  is rejected too.
agent
2026-06-07 15:21:45 +02:00
parent c6ad63059f
commit 37c778ca9a
5 changed files with 234 additions and 12 deletions
+2 -2
@@ -11,7 +11,7 @@ import (
// (error), and after the TTL the same nonce is accepted again because its entry
// was pruned (edge).
func TestNonceCacheRememberPrune(t *testing.T) {
-nc := newNonceCache(50*time.Millisecond, 1000)
+nc := newMemNonceCache(50*time.Millisecond, 1000)
base := time.Now()
if !nc.rememberOrReject("a", base) {
@@ -31,7 +31,7 @@ func TestNonceCacheRememberPrune(t *testing.T) {
// from the map.
func TestNonceCacheCapBounded(t *testing.T) {
const capacity = 100
-nc := newNonceCache(time.Hour, capacity)
+nc := newMemNonceCache(time.Hour, capacity)
base := time.Now()
for i := 0; i < 500; i++ {
nc.rememberOrReject("n"+strconv.Itoa(i), base)