Merge issue/0004d-dataplane-acl: data-plane content confidentiality (H4)

Public deployments refuse cleartext rooms, forcing E2E so a data-plane sniffer
gets only ciphertext. Per-subject ACL deferred to 0003 (documented).
2026-06-07 14:26:45 +02:00
6 changed files with 278 additions and 1 deletions
+8
@@ -118,6 +118,14 @@ func main() {
}
srv := membership.NewServer(store, blobs, authMode)
// On a public (non-loopback) bind, disable cleartext rooms: the embedded NATS
// has no per-subject ACL, so cleartext content would be readable by any
// registered peer. Forcing E2E keeps message content confidential regardless
// (audit H4 minimum defense; see dev/0004d-dataplane-acl.md).
if !isLoopbackBind(*bind) {
srv.RequireEncryptedRooms = true
log.Printf("cleartext rooms: DISABLED (public bind requires end-to-end encryption)")
}
log.Printf("control-plane auth: %s", authMode)
addr := *bind + ":" + *httpPort
httpSrv := &http.Server{
+80
@@ -0,0 +1,80 @@
# 0004d — Data-plane access control on NATS (audit H4)
## The finding
The NATS authenticator (`pkg/busauth`) decides one thing per connection:
*is this identity registered on the bus?* It does **not** scope what a connected
client may subscribe to or publish. There is a single NATS account with no
`Permissions`, so any registered peer can subscribe to, or publish on, **any**
subject. Concretely:
- A cleartext room (`ModeNATS`) carries its payload in the clear on its subject.
A registered peer that knows or guesses the subject subscribes and reads the
content directly (the auditor's `TestAudit_NoSubjectACL`: eve, never invited,
receives `"internal: salary numbers"`).
- An encrypted room (`ModeMatrix`) keeps its **content** confidential (the
payload is AEAD ciphertext), but the **metadata of traffic** — that a subject
is active, message sizes and timing, who is publishing — is still observable by
any registered peer that subscribes to the subject.
## Why the "complete" fix does not fit here
The preferred fix is per-subject permissions derived from room membership: when a
client connects, the authenticator looks up the rooms it belongs to and grants
`Sub`/`Pub` only on those subjects. NATS supports this — `CustomClientAuthentication`
can register a `*server.User` carrying `Permissions`.
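As a rough illustration of what "permissions derived from room membership" could look like, the sketch below builds allow-lists from a membership lookup. The `SubjectPermission` and `Permissions` types are local stand-ins that only mirror the shape of the nats-server types, and `permissionsFor`, `allows`, and the endpoint-to-subjects map are hypothetical names, not code from this repository. Matching is exact-string only; real NATS permissions also cover wildcards and deny lists, which this sketch ignores.

```go
package main

import (
	"fmt"
	"sort"
)

// SubjectPermission and Permissions are local stand-ins mirroring the shape of
// the nats-server types, declared here so the sketch has no external dependency.
type SubjectPermission struct {
	Allow []string
}

type Permissions struct {
	Publish   *SubjectPermission
	Subscribe *SubjectPermission
}

// permissionsFor (hypothetical) limits an endpoint to the subjects of the rooms
// it belongs to. An endpoint with no rooms gets empty allow-lists.
func permissionsFor(roomsByEndpoint map[string][]string, endpoint string) *Permissions {
	subjects := make([]string, 0, len(roomsByEndpoint[endpoint]))
	subjects = append(subjects, roomsByEndpoint[endpoint]...)
	sort.Strings(subjects)
	return &Permissions{
		Publish:   &SubjectPermission{Allow: subjects},
		Subscribe: &SubjectPermission{Allow: append([]string(nil), subjects...)},
	}
}

// allows models the check for this sketch only: exact match against the allow-list.
func allows(p *SubjectPermission, subject string) bool {
	if p == nil {
		return false
	}
	for _, s := range p.Allow {
		if s == subject {
			return true
		}
	}
	return false
}

func main() {
	rooms := map[string][]string{"alice-ep": {"room.secure"}}
	alice := permissionsFor(rooms, "alice-ep")
	eve := permissionsFor(rooms, "eve-ep")
	fmt.Println("alice may subscribe:", allows(alice.Subscribe, "room.secure"))
	fmt.Println("eve may subscribe:", allows(eve.Subscribe, "room.secure"))
}
```

In a real authenticator the resulting permissions would be attached to the user registered for the connection; how that wiring looks in `pkg/busauth` is not shown in this change.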
The blocker is that **NATS evaluates permissions once, at connect time, and never
re-evaluates them on a live connection.** unibus clients routinely *connect → create
or get invited to a room → publish/subscribe* within the **same** connection
(`TestSecureBusEndToEnd` does exactly this: A connects, then creates `room.secure`,
then publishes to it). Permissions frozen at connect time would not include a room
created or joined afterwards, so the legitimate owner could not publish to the room
it just made. Making per-subject ACLs work would therefore require the client to
**reconnect on every membership change**, an invasive change to the client library
and to every peer (worker, chat, mobile) — and the prompt for this issue scopes the
client changes to the minimum.
That dynamic-membership reconnection model is precisely the redesign that issue
**0003** (decentralization) already has to do: it moves the control-plane state to a
replicated JetStream KV and reworks how nodes and clients (re)establish sessions. Per
the issue's own guidance ("if a complete strategy does not fit, implement the minimum
defense and document the rest"), the full subject ACL is deferred to 0003, where the
session/permission model is being rebuilt anyway.
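The connect-time freeze described above can be modeled in a few lines. This is a toy model, not NATS code: `session`, `connect`, and `canPublish` are hypothetical names, and the snapshot-on-connect behavior is simply the assumption stated in this section.

```go
package main

import "fmt"

// session models a connection whose allow-list is a snapshot taken at connect time.
type session struct {
	allowed map[string]bool
}

// connect copies the subjects the endpoint may use right now. Membership changes
// made after this call are not visible to the returned session.
func connect(membership map[string][]string, endpoint string) *session {
	s := &session{allowed: make(map[string]bool)}
	for _, subject := range membership[endpoint] {
		s.allowed[subject] = true
	}
	return s
}

// canPublish consults only the frozen snapshot.
func (s *session) canPublish(subject string) bool {
	return s.allowed[subject]
}

func main() {
	membership := map[string][]string{}

	a := connect(membership, "A") // A connects before owning any room
	membership["A"] = append(membership["A"], "room.secure") // A creates the room afterwards

	fmt.Println("same connection:", a.canPublish("room.secure"))

	a2 := connect(membership, "A") // only a reconnect picks up the new room
	fmt.Println("after reconnect:", a2.canPublish("room.secure"))
}
```

Under this model the owner is locked out of its own room until it reconnects, which is the cost the section argues is too invasive for this issue.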
## The strategy implemented here: forbid cleartext rooms in public
`Server.RequireEncryptedRooms` (set by `membershipd` on any non-loopback bind)
refuses to create a cleartext (`ModeNATS`) room. Every room on a public deployment
is therefore end-to-end encrypted, so **message content stays confidential even
though the transport offers no subject isolation**: a peer that sniffs another
room's subject receives only AEAD ciphertext it has no key for.
This composes with the 0004c control-plane authorization: a non-member cannot even
learn a room's subject through the control plane (`GET /rooms/{id}` → 403), so to
sniff it an attacker must already know or guess the subject out of band.
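The guard hinges on classifying the bind address. The body of `isLoopbackBind` is not part of this change, so the helper below, `looksLoopback`, is a hypothetical stand-in showing one plausible way to make that decision with the standard library. It assumes the bind value is a bare host (no port) and treats anything it cannot positively identify as loopback as public.

```go
package main

import (
	"fmt"
	"net"
)

// looksLoopback (hypothetical) reports whether a bare bind host is a loopback
// address. Unparseable or empty values are treated as NOT loopback, so the
// encrypted-rooms requirement would fail closed.
func looksLoopback(bind string) bool {
	if bind == "localhost" {
		return true
	}
	ip := net.ParseIP(bind)
	return ip != nil && ip.IsLoopback()
}

func main() {
	for _, bind := range []string{"127.0.0.1", "::1", "localhost", "0.0.0.0", "10.0.0.5", ""} {
		requireEncrypted := !looksLoopback(bind)
		fmt.Printf("bind=%q requireEncryptedRooms=%v\n", bind, requireEncrypted)
	}
}
```

Failing closed on unknown input is a design choice of this sketch: a misclassified public bind would silently re-enable cleartext rooms, whereas a misclassified loopback bind only forces encryption in development.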
## What this does NOT close (residual exposure, by design)
- **Traffic metadata.** A registered peer that already knows a subject can still
subscribe and observe that the subject is active, the ciphertext sizes, and the
timing/cadence of messages. It cannot read content.
- **Cross-room publish.** A registered peer can still *publish* arbitrary bytes on
any subject. In an encrypted room those bytes fail AEAD open and the signature
check (`SignMsgs`), so receivers drop them — it is a nuisance/spam vector, not a
confidentiality or integrity break.
- **WireGuard-only deployments** may still use cleartext rooms (the guard only trips
on a public bind), because the network already restricts who can reach the bus.
Closing the residual metadata exposure requires the per-subject ACL described above,
tracked for issue 0003.
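The cross-room publish case above relies on receivers discarding anything that does not authenticate. A minimal sketch of that drop path follows, using AES-GCM from the standard library purely as a stand-in; this section does not name the AEAD the project uses, and `newAEAD` and `openOrDrop` are hypothetical helpers rather than functions from `pkg/frame` or `pkg/client`.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// newAEAD builds an AES-GCM AEAD from a 16-, 24-, or 32-byte key (stand-in cipher).
func newAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// openOrDrop returns the plaintext and true only when the payload authenticates
// under the room key. Anything else, including a wrong-size nonce, is dropped.
func openOrDrop(aead cipher.AEAD, nonce, ciphertext []byte) ([]byte, bool) {
	if len(nonce) != aead.NonceSize() {
		return nil, false
	}
	plaintext, err := aead.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return nil, false
	}
	return plaintext, true
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	aead, err := newAEAD(key)
	if err != nil {
		panic(err)
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		panic(err)
	}

	// A member's message opens; bytes injected by a non-member do not.
	sealed := aead.Seal(nil, nonce, []byte("internal: salary numbers"), nil)
	_, ok := openOrDrop(aead, nonce, sealed)
	fmt.Println("member frame accepted:", ok)

	_, ok = openOrDrop(aead, nonce, []byte("arbitrary bytes from a non-member"))
	fmt.Println("forged frame accepted:", ok)
}
```

The sketch covers only the AEAD half; the signature check mentioned alongside it (`SignMsgs`) would be a second, independent rejection point.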
## Regression evidence
- `pkg/membership`, `TestRequireEncryptedRoomsRejectsCleartext`: with
`RequireEncryptedRooms` on, `POST /rooms` for a cleartext policy returns 403 while
an encrypted-room create returns 201.
- `pkg/client`, `TestAudit_NoSubjectACL`: under the public posture, creating a
`ModeNATS` room fails; alice creates an encrypted room and publishes; eve (a
registered non-member) raw-subscribes to the subject and receives only ciphertext —
she never recovers the plaintext.
+2 -1
@@ -32,6 +32,7 @@ type testHarness struct {
ns *server.Server
httpts *httptest.Server
store *membership.Store
srv *membership.Server
}
func freePort(t *testing.T) int {
@@ -98,7 +99,7 @@ func bootHarness(t *testing.T, ctrlMode membership.AuthMode, natsAuth bool, nats
srv := membership.NewServer(store, blobs, ctrlMode)
httpts := httptest.NewServer(srv)
- h := &testHarness{natsURL: embeddednats.ClientURL(ns), ctrlURL: httpts.URL, ns: ns, httpts: httpts, store: store}
+ h := &testHarness{natsURL: embeddednats.ClientURL(ns), ctrlURL: httpts.URL, ns: ns, httpts: httpts, store: store, srv: srv}
t.Cleanup(func() {
httpts.Close()
store.Close()
+124
@@ -0,0 +1,124 @@
package client_test
import (
"bytes"
"sync"
"testing"
"time"
"github.com/enmanuel/unibus/pkg/client"
"github.com/enmanuel/unibus/pkg/frame"
"github.com/enmanuel/unibus/pkg/room"
"github.com/nats-io/nats.go"
)
// TestAudit_NoSubjectACL ports the auditor's H4 (High severity) finding under the minimum
// defense chosen for this issue (forbid cleartext rooms in public; see
// dev/0004d-dataplane-acl.md). The NATS data plane still has no per-subject ACL,
// so the guarantee we make is CONTENT confidentiality, proven three ways:
//
// error : a cleartext (ModeNATS) room cannot be created under the public posture;
// golden: a legitimate member (bob) decrypts the secret;
// edge : eve, sniffing the raw subject off the data plane, receives only
// ciphertext — she never recovers the plaintext the auditor's eve did.
func TestAudit_NoSubjectACL(t *testing.T) {
h := newHarness(t)
h.srv.RequireEncryptedRooms = true // the public posture
waitHealth(t, h.ctrlURL)
alice, err := client.New(h.natsURL, h.ctrlURL, mustIdentity(t))
if err != nil {
t.Fatalf("connect alice: %v", err)
}
defer alice.Close()
// Error path: a cleartext room is refused, so no payload ever rides a subject
// in the clear for a sniffer to read (the exact vector the auditor exploited).
if _, err := alice.CreateRoom("secret.subject.payroll", room.ModeNATS); err == nil {
t.Fatalf("cleartext room must be refused on a public deployment")
}
// alice creates an encrypted room and invites bob (the legitimate reader).
const subject = "secret.subject.payroll.e2e"
const secret = "internal: salary numbers"
roomID, err := alice.CreateRoom(subject, room.ModeMatrix)
if err != nil {
t.Fatalf("alice create encrypted room: %v", err)
}
bob, err := client.New(h.natsURL, h.ctrlURL, mustIdentity(t))
if err != nil {
t.Fatalf("connect bob: %v", err)
}
defer bob.Close()
if err := alice.Invite(roomID, bob.Endpoint()); err != nil {
t.Fatalf("alice invite bob: %v", err)
}
if err := bob.Join(roomID); err != nil {
t.Fatalf("bob join: %v", err)
}
// Golden: bob (a member) subscribes and decrypts the secret.
var bmu sync.Mutex
var bobGot []string
bobSub, err := bob.Subscribe(roomID, func(_ frame.Frame, plaintext []byte) {
bmu.Lock()
bobGot = append(bobGot, string(plaintext))
bmu.Unlock()
})
if err != nil {
t.Fatalf("bob subscribe: %v", err)
}
defer bobSub.Unsubscribe()
// Edge: eve sniffs the raw subject directly off NATS (no membership, no key).
rawEve, err := nats.Connect(h.natsURL)
if err != nil {
t.Fatalf("eve raw connect: %v", err)
}
defer rawEve.Close()
eveGot := make(chan []byte, 8)
if _, err := rawEve.Subscribe(subject, func(m *nats.Msg) { eveGot <- m.Data }); err != nil {
t.Fatalf("eve raw subscribe: %v", err)
}
if err := rawEve.Flush(); err != nil {
t.Fatalf("eve flush: %v", err)
}
time.Sleep(200 * time.Millisecond) // let both subscriptions settle
if err := alice.Publish(roomID, []byte(secret)); err != nil {
t.Fatalf("alice publish: %v", err)
}
// bob must decrypt the secret.
if !waitFor(&bmu, &bobGot, func(rs []string) bool {
for _, r := range rs {
if r == secret {
return true
}
}
return false
}, 2*time.Second) {
t.Fatalf("bob (member) should decrypt the secret; got %v", snapshot(&bmu, &bobGot))
}
// eve must receive only ciphertext — never the plaintext.
select {
case data := <-eveGot:
if bytes.Contains(data, []byte(secret)) {
t.Fatalf("eve sniffed the plaintext off the data plane: %q", data)
}
f, err := frame.Unmarshal(data)
if err != nil {
t.Fatalf("eve received an undecodable frame: %v", err)
}
if string(f.Payload) == secret {
t.Fatalf("eve read the secret from the frame payload")
}
if len(f.Nonce) == 0 {
t.Fatalf("expected an AEAD-encrypted payload (non-empty nonce), got cleartext frame")
}
case <-time.After(2 * time.Second):
// eve receiving nothing is also a safe outcome; the assertion is only that
// she never gets the plaintext, which holds vacuously here.
}
}
+46
@@ -0,0 +1,46 @@
package membership
import (
"bytes"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
)
// TestRequireEncryptedRoomsRejectsCleartext is the control-plane half of the
// audit H4 minimum defense: with RequireEncryptedRooms on (the public posture),
// creating a cleartext (ModeNATS) room is refused 403, while an encrypted room is
// created normally. This is what guarantees no message ever rides the un-ACL'd
// NATS subject in the clear on a public deployment.
func TestRequireEncryptedRoomsRejectsCleartext(t *testing.T) {
srv := dosServer(t, AuthOff)
srv.RequireEncryptedRooms = true
create := func(encrypt bool) int {
body, _ := json.Marshal(createRoomReq{
Subject: "payroll.subject",
Policy: policyJSON{Encrypt: encrypt, Persist: encrypt, SignMsgs: encrypt},
Owner: endpointJSON{Endpoint: "owner-ep", SignPub: []byte("sp"), KexPub: []byte("kp")},
SealedKeySelf: []byte("sealed"),
})
rec := httptest.NewRecorder()
srv.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/rooms", bytes.NewReader(body)))
return rec.Code
}
// Error path: a cleartext room is refused.
if code := create(false); code != http.StatusForbidden {
t.Fatalf("cleartext room under RequireEncryptedRooms should be 403, got %d", code)
}
// Golden: an encrypted room is created.
if code := create(true); code != http.StatusCreated {
t.Fatalf("encrypted room should be 201, got %d", code)
}
// Edge: with the flag OFF (loopback/dev), cleartext rooms are allowed again.
srv.RequireEncryptedRooms = false
if code := create(false); code != http.StatusCreated {
t.Fatalf("cleartext room with the flag off should be 201, got %d", code)
}
}
+18
@@ -61,6 +61,16 @@ type Server struct {
authMode AuthMode
nonces *nonceCache
limiter *ipRateLimiter
// RequireEncryptedRooms, when true, refuses to create cleartext (ModeNATS)
// rooms. It is the minimum defense for the data plane (audit H4):
// the embedded NATS has no per-subject ACL, so a cleartext room is readable by
// any registered peer that knows (or guesses) its subject. Forcing every room
// to be end-to-end encrypted keeps message CONTENT confidential even when the
// transport offers no subject isolation. The command sets this on a public
// (non-loopback) bind. See dev/0004d-dataplane-acl.md for the full rationale
// and the residual metadata exposure this does NOT close.
RequireEncryptedRooms bool
}
// NewServer wires the membership store and blob store into an http.Handler. The
@@ -341,6 +351,14 @@ func (s *Server) handleCreateRoom(w http.ResponseWriter, r *http.Request) {
writeErr(w, http.StatusBadRequest, "subject and owner.endpoint required")
return
}
// Data-plane minimum defense (audit H4): on a public deployment cleartext
// rooms are disabled, so no message ever rides the un-ACL'd NATS subject in
// the clear for another registered peer to sniff.
if s.RequireEncryptedRooms && !req.Policy.Encrypt {
writeErr(w, http.StatusForbidden,
"cleartext rooms are disabled on this deployment; create an encrypted (Matrix-policy) room")
return
}
roomID := newULID()
info := RoomInfo{
RoomID: roomID,