Branch-by-abstraction for the control-plane store (issue 0003b), so the
membership state can move off process-local SQLite onto replicated
JetStream KV without rewriting callers and without breaking master.
pkg/membership:
- Store is now an interface (rooms/members/keys + user allowlist +
Close). The existing SQLite implementation is renamed sqliteStore and
stays the default: Open(path) still returns it. openSQLite keeps the
concrete type for internal callers (the 0003c migration).
- ErrNotFound is a storage-agnostic "no such record" sentinel; both
backends return it (the SQLite store maps sql.ErrNoRows to it). The
control plane now branches on ErrNotFound instead of sql.ErrNoRows, so
server.go no longer imports database/sql.
- jetstreamStore (new) implements Store over five replicated KV buckets:
rooms, members, rooms_by_member (reverse index for ListRoomsForEndpoint),
room_keys, users. Replication factor is configurable (R1..R5) for the
R1->R3 rollout. Every read is bounded by OpTimeout and IsAuthorized /
HasAdmin FAIL CLOSED on any backend error (a KV quorum loss denies,
never admits), per the audit's requirement for the decentralized store.
dev/feature_flags.json:
- Add the `decentralized` flag (OFF): sqliteStore default while off,
jetstreamStore behind it. The membershipd boot wiring that selects the
KV store is deliberately deferred to 0003e/0003f (the embedded-NATS
authenticator<->store bootstrap is part of the session/deploy redesign);
OFF keeps the single-node SQLite control plane unchanged.
Tests (DoD: golden + edges + error path):
- TestJetStreamStoreRoomsCRUD: encrypted room + owner + invited member
round-trip through every room/member/key method, including latest-epoch
resolution and rekey.
- TestJetStreamStoreUsers: add/get/authorize/list/revoke + admin gate,
with case-insensitive key normalization and duplicate rejection.
- TestJetStreamStoreNotFound: ErrNotFound mapping for misses.
- TestJetStreamStoreIsAuthorizedFailClosed: NATS backend shut down ->
IsAuthorized and HasAdmin both DENY within the bounded timeout.
The full existing suite stays green: sqliteStore is unchanged behavior.
Audit H5 (Alto, public). The control plane was signed but plaintext, so a
network MITM could read all metadata (subjects, endpoints, public keys, sealed
keys, blob hashes, the social graph) and drop requests. Signing gives integrity,
not confidentiality.
- membershipd serves the control plane over TLS (ListenAndServeTLS, MinVersion
1.2) with the same CA-signed cert as the data plane when --tls-cert is set; the
fail-open guard already requires --bus-auth enforce alongside it.
- The client gets a separate Options.CtrlTLS so the HTTP client pins the bus CA,
independent of the NATS data-plane TLS. Connect now sets both planes' TLS from
the one CA and REFUSES a plaintext http:// control-plane URL when a CA is
provided, so metadata is never sent in the clear when TLS is expected.
Connect's signature is unchanged; callers (worker/chat --ca, mobile NewSession)
must pass an https:// control-plane URL when they pass a CA. Documented for the
deploy step.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
pkg/membership TestRequireEncryptedRoomsRejectsCleartext: cleartext create ->
403, encrypted -> 201, flag off -> cleartext allowed again.
pkg/client TestAudit_NoSubjectACL: under the public posture a ModeNATS room is
refused; bob (member) decrypts the secret; eve raw-subscribes to the subject off
the data plane and receives only ciphertext (non-empty AEAD nonce, no plaintext
substring) — closing the auditor's 'eve reads internal: salary numbers'.
TestSecureBusEndToEnd boots the server with control-plane enforce, NATS nkey
auth, and TLS all on; two registered peers connect with nkey+TLS, A creates a
Matrix room, invites B, publishes, and B decrypts — proving the three layers
compose. This is the headline golden of issue 0001.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
client.Connect is the single migration seam: a non-empty caPath connects with
TLS pinned to the bus CA plus nkey auth (matching enforce + bus-tls), an empty
caPath keeps the legacy plaintext dev connection; control-plane requests are
signed either way. worker and chat gain a --ca flag; the gomobile NewSession
gains a caPath parameter so the Android app bundles ca.crt and connects
securely. Every peer now flows through one code path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
client/tls_test: mints a throwaway CA + server cert in-memory; a client
pinning the CA completes the handshake and operates (golden), a client
without the CA fails the handshake (error path). busauth/tls_test: golden
load of a CA PEM and a server keypair, plus error paths (missing file,
non-PEM). Harness body extracted to bootHarness(ctrlMode, natsAuth, natsTLS).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
busauth.LoadCATLSConfig turns a ca.crt path into a *tls.Config trusting only
that private CA (clients must pin it; the system roots would reject a
self-signed server cert). busauth.ServerTLSConfig loads the server keypair.
client.Options gains TLS; NewWithOptions calls nats.Secure when set, so the
data-plane connection is encrypted and the server pinned.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Harness gains newHarnessFull(ctrlMode, natsAuth) wiring the nkey authenticator
to the user allowlist; NATS auth and HTTP auth are independent so each plane
is tested in isolation. TestNatsNkeyAuth: registered peer connects with nkey
and operates (golden); unregistered peer and no-nkey client refused at connect
(error paths); peer revoked at runtime refused on its next connection without
a restart (edge).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
nats.go refuses to connect with an nkey to a server that does not advertise
nkey auth, so the connection cannot blindly always present one. New keeps the
legacy plain connection; NewWithOptions(Options{UseNkey:true}) presents the
peer's identity-derived nkey. NewWithOptions is the single place the data-plane
connection is built, so every peer gets identical behavior from the same
Options (TLS fields arrive in phase 0001d).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
membership/auth_test: golden (signed+registered accepted), error paths
(unregistered 401, replayed nonce 401, clock skew 401, tampered body 401,
missing headers 401), exemptions (healthz, soft allows, off no-op).
client_test: end-to-end with the real client against an enforce server —
registered peer accepted, unregistered rejected, revoked peer denied without
a server restart.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
doJSON, putBlob and getBlob now go through newSignedRequest, which attaches
X-Unibus-Pub/Ts/Nonce/Sig signing membership.CanonicalRequest with the peer's
Ed25519 key. GETs are signed too so the server can authenticate the caller
uniformly under enforce. The payload-level owner signature (invite/rekey)
is unchanged and coexists with this transport-level signature.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A peer invited to an encrypted room needs to find it: the control plane is
pull-based (no server push of invitations), so add a discovery endpoint that
lists every room an endpoint belongs to, with the room's metadata and the
endpoint's role.
- store.ListRoomsForEndpoint: JOIN members+rooms, ordered by room id, empty
slice (not error) for an endpoint in no rooms.
- membershipd: GET /members/{endpoint}/rooms returns {room_id, subject, epoch,
policy, role}[].
- client.ListMyRooms + RoomRef: a bot polls this to discover and then Join +
Subscribe rooms it was invited to.
Tests: store-level (owner in N rooms, member in one, unknown endpoint → []) and
client-level e2e through the embedded harness (B discovers a room A invited it
to, without prior knowledge of the room id; owner sees role=owner).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Chat bots need replies, threads and reactions. Add two optional, omitempty
envelope fields (ThreadID, ReplyTo) plus a REACT frame type. The fields ride the
cleartext envelope (message-id references, not secret content) and are omitted
when unset, so non-threaded frames are byte-for-byte identical on the wire and
their signatures unchanged — a non-breaking, additive change.
Client gains PublishReply (threaded reply) and React (emoji reaction). The
reaction content travels in the payload, so it is sealed like any message and
stays confidential in E2E rooms; receivers dispatch on Frame.Type == REACT and
read Frame.ReplyTo for the target. Publish is refactored to share one
publishFrame path with the new helpers; its behavior is unchanged.
Tests: frame round-trip of a threaded REACT frame (golden), non-threaded
wire/sig back-compat asserting thr/re keys are absent (edge), Unmarshal of
garbage errors (error path), and an end-to-end reply+reaction round-trip in an
encrypted ModeMatrix room.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- membership server returns 403 + human-readable message on missing sealed key (was leaking 'sql: no rows in result set')
- client doJSON unwraps the server's {"error"} field instead of pasting the raw HTTP envelope