feat(0003b): membership.Store interface + JetStream KV implementation

Branch-by-abstraction for the control-plane store (issue 0003b), so the
membership state can move off process-local SQLite onto replicated
JetStream KV without rewriting callers and without breaking master.

pkg/membership:
- Store is now an interface (rooms/members/keys + user allowlist +
  Close). The existing SQLite implementation is renamed sqliteStore and
  stays the default: Open(path) still returns it. openSQLite keeps the
  concrete type for internal callers (the 0003c migration).
- ErrNotFound is a storage-agnostic "no such record" sentinel; both
  backends return it (the SQLite store maps sql.ErrNoRows to it). The
  control plane now branches on ErrNotFound instead of sql.ErrNoRows, so
  server.go no longer imports database/sql.
- jetstreamStore (new) implements Store over five replicated KV buckets:
  rooms, members, rooms_by_member (reverse index for ListRoomsForEndpoint),
  room_keys, users. Replication factor is configurable (R1..R5) for the
  R1->R3 rollout. Every read is bounded by OpTimeout and IsAuthorized /
  HasAdmin FAIL CLOSED on any backend error (a KV quorum loss denies,
  never admits), per the audit's requirement for the decentralized store.

dev/feature_flags.json:
- Add the `decentralized` flag (OFF): sqliteStore default while off,
  jetstreamStore behind it. The membershipd boot wiring that selects the
  KV store is deliberately deferred to 0003e/0003f (the embedded-NATS
  authenticator<->store bootstrap is part of the session/deploy redesign);
  OFF keeps the single-node SQLite control plane unchanged.

Tests (DoD: golden + edges + error path):
- TestJetStreamStoreRoomsCRUD: encrypted room + owner + invited member
  round-trip through every room/member/key method, including latest-epoch
  resolution and rekey.
- TestJetStreamStoreUsers: add/get/authorize/list/revoke + admin gate,
  with case-insensitive key normalization and duplicate rejection.
- TestJetStreamStoreNotFound: ErrNotFound mapping for misses.
- TestJetStreamStoreIsAuthorizedFailClosed: NATS backend shut down ->
  IsAuthorized and HasAdmin both DENY within the bounded timeout.

The full existing suite stays green: sqliteStore's behavior is unchanged.
agent
2026-06-07 15:04:52 +02:00
parent 3230b31ade
commit 6b3ace1d39
10 changed files with 883 additions and 33 deletions
+6 -6
@@ -45,7 +45,7 @@ func normalizeSignPub(signPub string) string {
// AddUser inserts a new bus user. role defaults to RoleMember when empty. It
// returns ErrUserExists if the sign_pub is already registered (the caller may
// choose to revoke+re-add or ignore). handle and signPub must be non-empty.
-func (s *Store) AddUser(signPub, handle, role string) error {
+func (s *sqliteStore) AddUser(signPub, handle, role string) error {
signPub = normalizeSignPub(signPub)
if signPub == "" || handle == "" {
return fmt.Errorf("membership: AddUser: sign_pub and handle required")
@@ -74,7 +74,7 @@ func (s *Store) AddUser(signPub, handle, role string) error {
// GetUser returns the user with the given signing public key. It returns
// sql.ErrNoRows (wrapped) when there is no such user.
-func (s *Store) GetUser(signPub string) (User, error) {
+func (s *sqliteStore) GetUser(signPub string) (User, error) {
signPub = normalizeSignPub(signPub)
var u User
var revoked sql.NullString
@@ -90,7 +90,7 @@ func (s *Store) GetUser(signPub string) (User, error) {
}
// ListUsers returns every user ordered by handle then sign_pub (stable output).
-func (s *Store) ListUsers() ([]User, error) {
+func (s *sqliteStore) ListUsers() ([]User, error) {
rows, err := s.db.Query(
`SELECT sign_pub, handle, role, status, created_at, revoked_at FROM users ORDER BY handle, sign_pub`,
)
@@ -116,7 +116,7 @@ func (s *Store) ListUsers() ([]User, error) {
// status flip (not a delete) so the identity stays auditable and IsAuthorized
// immediately denies it on both planes. Revoking an unknown or already-revoked
// user returns an error / is a no-op respectively.
-func (s *Store) RevokeUser(signPub string) error {
+func (s *sqliteStore) RevokeUser(signPub string) error {
signPub = normalizeSignPub(signPub)
res, err := s.db.Exec(
`UPDATE users SET status = ?, revoked_at = ? WHERE sign_pub = ? AND status = ?`,
@@ -140,7 +140,7 @@ func (s *Store) RevokeUser(signPub string) error {
// plane (HTTP request middleware) and the data plane (NATS nkey authenticator),
// so revoking a user denies access on both without restarting anything. An
// unknown key, a revoked key, or any query error all yield false (fail closed).
-func (s *Store) IsAuthorized(signPub string) bool {
+func (s *sqliteStore) IsAuthorized(signPub string) bool {
signPub = normalizeSignPub(signPub)
if signPub == "" {
return false
@@ -155,7 +155,7 @@ func (s *Store) IsAuthorized(signPub string) bool {
// HasAdmin reports whether at least one active admin exists. The control plane
// uses it to gate user-management endpoints: until the host operator seeds the
// first admin via the local CLI, those endpoints stay closed (chicken-egg).
-func (s *Store) HasAdmin() bool {
+func (s *sqliteStore) HasAdmin() bool {
var one int
err := s.db.QueryRow(
`SELECT 1 FROM users WHERE role = ? AND status = ? LIMIT 1`, RoleAdmin, StatusActive,