0003 built the JetStream KV store (jetstreamStore) but the binary never selected
it: membership.Open (SQLite) was hardcoded and OpenJetStream was only reached by
migrate-to-kv. This completes the wiring so a node actually serves its control
plane from the replicated KV.
- New flag --store kv|sqlite (default sqlite). kv opens the JetStream KV control
plane over the privileged internal connection; sqlite is the unchanged baseline
(branch-by-abstraction: the full suite's SQLite paths are untouched).
- Bootstrap cycle resolved with storeHolder: the authenticator consults the holder
(fail-closed until set), so it can be built before the KV store exists. The KV
store opens after NATS is up and is published into the holder. The only client
that can connect in that window is the internal identity, which bypasses the
store by key. In SQLite mode the store is set before StartServer, so the window
does not exist.
- needJS now covers --store kv as well as --cluster-name; the JetStream client is
shared by the KV store and the replicated nonce bucket.
- feature_flags.json: decentralized wiring documented as complete, realized via
--store kv (opt-in per deploy; default stays sqlite).
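A minimal sketch of the flag selection and the needJS rule described above. The bootConfig type and parseBoot helper are hypothetical names for illustration, not the actual membershipd code; only the --store and --cluster-name flags and the needJS condition come from this change.

```go
package main

import (
	"flag"
	"fmt"
)

// bootConfig is a hypothetical holder for the two flags discussed here.
type bootConfig struct {
	store       string
	clusterName string
}

// parseBoot parses --store and --cluster-name and rejects unknown stores.
func parseBoot(args []string) (bootConfig, error) {
	fs := flag.NewFlagSet("membershipd", flag.ContinueOnError)
	var c bootConfig
	fs.StringVar(&c.store, "store", "sqlite", "control-plane store: kv|sqlite")
	fs.StringVar(&c.clusterName, "cluster-name", "", "cluster name (enables JetStream)")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	if c.store != "kv" && c.store != "sqlite" {
		return c, fmt.Errorf("--store: want kv or sqlite, got %q", c.store)
	}
	return c, nil
}

// needJS reports whether a JetStream client must be created: either the KV
// store or clustering requires it.
func (c bootConfig) needJS() bool {
	return c.store == "kv" || c.clusterName != ""
}

func main() {
	c, err := parseBoot([]string{"--store", "kv"})
	fmt.Println(c.store, c.needJS(), err) // kv true <nil>

	d, _ := parseBoot(nil)
	fmt.Println(d.store, d.needJS()) // sqlite false
}
```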
Fail-closed preserved: jetstreamStore.IsAuthorized already denies on any backend
error; the holder denies while unset.
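The holder pattern can be sketched roughly as follows. This is an illustration under assumptions, not the real storeHolder: the cut-down Store interface, mapStore and brokenStore are hypothetical stand-ins, and the use of atomic.Pointer is one possible way to publish the store safely.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Store is a cut-down, hypothetical stand-in for membership.Store.
type Store interface {
	IsAuthorized(key string) (bool, error)
}

// storeHolder lets the authenticator be constructed before the store exists.
type storeHolder struct {
	p atomic.Pointer[Store]
}

// Set publishes the store; later IsAuthorized calls consult it.
func (h *storeHolder) Set(s Store) { h.p.Store(&s) }

// IsAuthorized fails closed: it denies while no store is set and on any
// backend error.
func (h *storeHolder) IsAuthorized(key string) bool {
	sp := h.p.Load()
	if sp == nil {
		return false
	}
	ok, err := (*sp).IsAuthorized(key)
	return err == nil && ok
}

// mapStore is a toy in-memory allowlist used only for this example.
type mapStore map[string]bool

func (m mapStore) IsAuthorized(key string) (bool, error) { return m[key], nil }

// brokenStore simulates a backend that errors on every call.
type brokenStore struct{}

func (brokenStore) IsAuthorized(string) (bool, error) {
	return true, errors.New("backend unavailable")
}

func main() {
	var h storeHolder
	fmt.Println(h.IsAuthorized("UALICE")) // false: holder is empty

	h.Set(mapStore{"UALICE": true})
	fmt.Println(h.IsAuthorized("UALICE"))   // true
	fmt.Println(h.IsAuthorized("UMALLORY")) // false

	h.Set(brokenStore{})
	fmt.Println(h.IsAuthorized("UALICE")) // false: backend error denies
}
```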
Tests:
- TestStoreHolderFailClosed: empty holder denies; serves after set.
- TestKVStoreBootstrapUnderEnforce: end-to-end decentralized boot: a KV-seeded
user authenticates over nkey under enforce; an outsider is denied.
- TestKVStoreDecentralizedConsistency: a room/user created on one node's KV store
is visible to another's (ends the per-node SQLite divergence, audit 0008 N5).
CGO_ENABLED=0 go build/vet/test are green; govulncheck reports 0 reachable
vulnerabilities.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Branch-by-abstraction for the control-plane store (issue 0003b), so the
membership state can move off process-local SQLite onto replicated
JetStream KV without rewriting callers and without breaking master.
pkg/membership:
- Store is now an interface (rooms/members/keys + user allowlist +
Close). The existing SQLite implementation is renamed sqliteStore and
stays the default: Open(path) still returns it. openSQLite keeps the
concrete type for internal callers (the 0003c migration).
- ErrNotFound is a storage-agnostic "no such record" sentinel; both
backends return it (the SQLite store maps sql.ErrNoRows to it). The
control plane now branches on ErrNotFound instead of sql.ErrNoRows, so
server.go no longer imports database/sql.
- jetstreamStore (new) implements Store over five replicated KV buckets:
rooms, members, rooms_by_member (reverse index for ListRoomsForEndpoint),
room_keys, users. Replication factor is configurable (R1..R5) for the
R1->R3 rollout. Every read is bounded by OpTimeout and IsAuthorized /
HasAdmin FAIL CLOSED on any backend error (a KV quorum loss denies,
never admits), per the audit's requirement for the decentralized store.
dev/feature_flags.json:
- Add the `decentralized` flag (OFF): sqliteStore default while off,
jetstreamStore behind it. The membershipd boot wiring that selects the
KV store is deliberately deferred to 0003e/0003f (the embedded-NATS
authenticator<->store bootstrap is part of the session/deploy redesign);
OFF keeps the single-node SQLite control plane unchanged.
Tests (DoD: golden + edges + error path):
- TestJetStreamStoreRoomsCRUD: encrypted room + owner + invited member
round-trip through every room/member/key method, including latest-epoch
resolution and rekey.
- TestJetStreamStoreUsers: add/get/authorize/list/revoke + admin gate,
with case-insensitive key normalization and duplicate rejection.
- TestJetStreamStoreNotFound: ErrNotFound mapping for misses.
- TestJetStreamStoreIsAuthorizedFailClosed: NATS backend shut down ->
IsAuthorized and HasAdmin both DENY within the bounded timeout.
The full existing suite stays green: sqliteStore behavior is unchanged.
Declares the project's target rollout: bus-auth enforce, bus-tls enabled.
Flags are declarative; the operator activates them at deploy via membershipd
--bus-auth/--tls-cert/--tls-key. CLI defaults stay off so dev and tests run
unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bus-auth carries the off -> soft -> enforce rollout state; bus-tls is a
boolean. Both start disabled so master keeps compiling and passing tests
while the auth/TLS code lands behind them across phases 0001a-0001e.
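One way to model the three-state bus-auth value is a small enum with a strict parser, sketched below. The type and function names are hypothetical, and treating an empty string as off is an assumption made here so that the default stays disabled; the actual flag handling may differ.

```go
package main

import "fmt"

// BusAuthMode is a hypothetical type for the off -> soft -> enforce rollout.
type BusAuthMode int

const (
	BusAuthOff BusAuthMode = iota
	BusAuthSoft
	BusAuthEnforce
)

// ParseBusAuthMode maps a flag value to a mode. Unknown values are an error
// rather than silently falling back. Empty is treated as off (assumption).
func ParseBusAuthMode(s string) (BusAuthMode, error) {
	switch s {
	case "", "off":
		return BusAuthOff, nil
	case "soft":
		return BusAuthSoft, nil
	case "enforce":
		return BusAuthEnforce, nil
	default:
		return BusAuthOff, fmt.Errorf("unknown bus-auth mode %q", s)
	}
}

func main() {
	m, err := ParseBusAuthMode("enforce")
	fmt.Println(m == BusAuthEnforce, err) // true <nil>

	_, err = ParseBusAuthMode("strict")
	fmt.Println(err != nil) // true
}
```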
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>