client.processFrame verified a frame's signature only when one was present
(`info.Policy.SignMsgs && f.Sig != nil`). In a room whose policy REQUIRES
per-message signatures, an attacker with data-plane access could publish a raw
frame with Sig==nil and a forged Sender, and the receiver accepted it as
authentic because the verification block was skipped (audit N3, report 0006).
On a signed-but-cleartext room any peer that knows the subject could thus
impersonate any sender.
Fix: in a SignMsgs room a missing signature is itself a rejection. processFrame
now drops any frame with Sig==nil before attempting verification:
if info.Policy.SignMsgs {
if f.Sig == nil { return } // signature required but absent: drop
// verify ...
}
Non-signed rooms (ModeNATS) are unaffected: unsigned frames there are still
delivered, so the plain-NATS path is unchanged.
Verification (pkg/client/sig_nil_spoof_test.go, TestReaudit_SigNilSpoof):
- golden: a properly signed frame from a member is delivered.
- error : an unsigned frame with a forged Sender in a SignMsgs room is dropped
(the test fails with "SIG-NIL SPOOF: receiver accepted ..." when the fix is
reverted, confirming it is a real regression guard).
- edge : a non-signed room still delivers an unsigned frame.
- CGO_ENABLED=0 go build ./... && go vet ./... && go test -count=1 ./... green.
Refs: report 0006 N3, issue 0005b.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes the residual the 0004 hardening deferred: the NATS authenticator
can now confine a registered peer to the subjects of the rooms it
belongs to, instead of letting any registered identity sub/pub on any
subject. The dynamic-membership reconnection model the audit named is
provided by client.RefreshSession.
pkg/busauth:
- verifyNkey factors out the shared nkey verification.
- NewNkeyAuthenticatorACL + PermissionsFunc: an authenticator that, after
authorizing, derives and RegisterUser()s per-subject permissions. A
derivation error denies the connection (fail closed).
pkg/membership:
- SubjectACLFor(store) maps a signing pubkey to the subjects it may use:
the subject of every room it belongs to, plus the client infrastructure
subjects (_INBOX.>, $JS.API.> for request/reply and the persisted plane).
pkg/client:
- RefreshSession() rebuilds the data-plane connection so the authenticator
re-derives permissions after a membership change (NATS freezes
permissions at connect time). It retains the seeds/options to reconnect;
active subscriptions are dropped and must be re-made (documented).
Tests (DoD: isolation + refresh):
- TestSubjectACLIsolation: alice (member of room.A) may sub/pub room.A but
is DENIED sub and pub on room.B (permissions violation), and never reads
bob's room.B traffic; bob never receives alice's cross-room publish.
- TestRefreshSessionGainsNewRoom: alice has no permission for room B until
she is added and calls RefreshSession; the reconnect grants the subject
and she then receives room B traffic.
Scope note: the per-subject ACL authenticator is opt-in (NewServer/
membershipd keep the open authenticator by default) and is wired in with
the decentralized boot path; auto-RefreshSession on every membership
change (fully transparent) remains for 0003f. Master behavior unchanged.
The client (issue 0003e, part 1) accepts a LIST of NATS seeds and a LIST
of control-plane URLs so a node loss is transparent.
pkg/client:
- Options.NatsServers: extra NATS seeds beyond the primary. The client
connects to the joined seed list with MaxReconnects(-1) +
RetryOnFailedConnect, so nats.go fails over to a surviving node when the
one a client is attached to dies and rejoins a node that comes back.
- Options.CtrlURLs: extra control-plane endpoints. doJSON/putBlob/getBlob
now try each endpoint in order, falling over on a transport error to the
next (an HTTP response from any node is authoritative — every node
serves the same state under the KV store). newSignedRequest becomes
newSignedRequestTo(base, ...); each failover attempt mints a fresh nonce
(the signature covers method+path+ts+nonce+body, not the host), so a
retried request is never seen as a replay.
- ConnectedServer()/IsConnected(): observability for which node the data
plane is attached to, for ops and failover tests.
- New/Connect/NewWithOptions keep their signatures (a single URL = a
one-element list), so worker/chat/mobile/playground are unchanged.
Test (DoD edge — the issue's "kill node A" case):
- TestClientFailoverAcrossNodes: A seeds two clustered nodes, subscribes,
receives a cross-node message; the node A is attached to is KILLED; A
reconnects to the survivor and still receives messages — session intact.
Audit H5 (Alto, public). The control plane was signed but plaintext, so a
network MITM could read all metadata (subjects, endpoints, public keys, sealed
keys, blob hashes, the social graph) and drop requests. Signing gives integrity,
not confidentiality.
- membershipd serves the control plane over TLS (ListenAndServeTLS, MinVersion
1.2) with the same CA-signed cert as the data plane when --tls-cert is set; the
fail-open guard already requires --bus-auth enforce alongside it.
- The client gets a separate Options.CtrlTLS so the HTTP client pins the bus CA,
independent of the NATS data-plane TLS. Connect now sets both planes' TLS from
the one CA and REFUSES a plaintext http:// control-plane URL when a CA is
provided, so metadata is never sent in the clear when TLS is expected.
Connect's signature is unchanged; callers (worker/chat --ca, mobile NewSession)
must pass an https:// control-plane URL when they pass a CA. Documented for the
deploy step.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
client.Connect is the single migration seam: a non-empty caPath connects with
TLS pinned to the bus CA plus nkey auth (matching enforce + bus-tls), an empty
caPath keeps the legacy plaintext dev connection; control-plane requests are
signed either way. worker and chat gain a --ca flag; the gomobile NewSession
gains a caPath parameter so the Android app bundles ca.crt and connects
securely. Every peer now flows through one code path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
busauth.LoadCATLSConfig turns a ca.crt path into a *tls.Config trusting only
that private CA (clients must pin it; the system roots would reject a
self-signed server cert). busauth.ServerTLSConfig loads the server keypair.
client.Options gains TLS; NewWithOptions calls nats.Secure when set, so the
data-plane connection is encrypted and the server pinned.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
nats.go refuses to connect with an nkey to a server that does not advertise
nkey auth, so the connection cannot blindly always present one. New keeps the
legacy plain connection; NewWithOptions(Options{UseNkey:true}) presents the
peer's identity-derived nkey. NewWithOptions is the single place the data-plane
connection is built, so every peer gets identical behavior from the same
Options (TLS fields arrive in phase 0001d).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
doJSON, putBlob and getBlob now go through newSignedRequest, which attaches
X-Unibus-Pub/Ts/Nonce/Sig signing membership.CanonicalRequest with the peer's
Ed25519 key. GETs are signed too so the server can authenticate the caller
uniformly under enforce. The payload-level owner signature (invite/rekey)
is unchanged and coexists with this transport-level signature.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A peer invited to an encrypted room needs to find it: the control plane is
pull-based (no server push of invitations), so add a discovery endpoint that
lists every room an endpoint belongs to, with the room's metadata and the
endpoint's role.
- store.ListRoomsForEndpoint: JOIN members+rooms, ordered by room id, empty
slice (not error) for an endpoint in no rooms.
- membershipd: GET /members/{endpoint}/rooms returns {room_id, subject, epoch,
policy, role}[].
- client.ListMyRooms + RoomRef: a bot polls this to discover and then Join +
Subscribe rooms it was invited to.
Tests: store-level (owner in N rooms, member in one, unknown endpoint → []) and
client-level e2e through the embedded harness (B discovers a room A invited it
to, without prior knowledge of the room id; owner sees role=owner).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Chat bots need replies, threads and reactions. Add two optional, omitempty
envelope fields (ThreadID, ReplyTo) plus a REACT frame type. The fields ride the
cleartext envelope (message-id references, not secret content) and are omitted
when unset, so non-threaded frames are byte-for-byte identical on the wire and
their signatures unchanged — a non-breaking, additive change.
Client gains PublishReply (threaded reply) and React (emoji reaction). The
reaction content travels in the payload, so it is sealed like any message and
stays confidential in E2E rooms; receivers dispatch on Frame.Type == REACT and
read Frame.ReplyTo for the target. Publish is refactored to share one
publishFrame path with the new helpers; its behavior is unchanged.
Tests: frame round-trip of a threaded REACT frame (golden), non-threaded
wire/sig back-compat asserting thr/re keys are absent (edge), Unmarshal of
garbage errors (error path), and an end-to-end reply+reaction round-trip in an
encrypted ModeMatrix room.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- membership server returns 403 + human-readable message on missing sealed key (was leaking 'sql: no rows in result set')
- client doJSON unwraps the server's {"error"} field instead of pasting the raw HTTP envelope