Merge issue/0004d-dataplane-acl: data-plane content confidentiality (H4)
Public deployments refuse cleartext rooms, forcing E2E so a data-plane sniffer gets only ciphertext. Per-subject ACL deferred to 0003 (documented).
@@ -118,6 +118,14 @@ func main() {
 	}
 
 	srv := membership.NewServer(store, blobs, authMode)
+	// On a public (non-loopback) bind, disable cleartext rooms: the embedded NATS
+	// has no per-subject ACL, so cleartext content would be readable by any
+	// registered peer. Forcing E2E keeps message content confidential regardless
+	// (audit H4 minimum defense; see dev/0004d-dataplane-acl.md).
+	if !isLoopbackBind(*bind) {
+		srv.RequireEncryptedRooms = true
+		log.Printf("cleartext rooms: DISABLED (public bind requires end-to-end encryption)")
+	}
 	log.Printf("control-plane auth: %s", authMode)
 	addr := *bind + ":" + *httpPort
 	httpSrv := &http.Server{

@@ -0,0 +1,80 @@
# 0004d: Data-plane access control on NATS (audit H4)

## The finding

The NATS authenticator (`pkg/busauth`) decides one thing per connection:
*is this identity registered on the bus?* It does **not** scope what a connected
client may subscribe to or publish. There is a single NATS account with no
`Permissions`, so any registered peer can subscribe to, or publish on, **any**
subject. Concretely:

- A cleartext room (`ModeNATS`) carries its payload in the clear on its subject.
  A registered peer that knows or guesses the subject subscribes and reads the
  content directly (the auditor's `TestAudit_NoSubjectACL`: eve, never invited,
  receives `"internal: salary numbers"`).
- An encrypted room (`ModeMatrix`) keeps its **content** confidential (the
  payload is AEAD ciphertext), but the **traffic metadata** (that a subject
  is active, message sizes and timing, who is publishing) is still observable by
  any registered peer that subscribes to the subject.

## Why the "complete" fix does not fit here

The preferred fix is per-subject permissions derived from room membership: when a
client connects, the authenticator looks up the rooms it belongs to and grants
`Sub`/`Pub` only on those subjects. NATS supports this: `CustomClientAuthentication`
can register a `*server.User` carrying `Permissions`.
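
As a rough illustration of that lookup, the sketch below derives a per-connection allow-list from room membership. `Membership`, `Permissions`, `permissionsFor` and `allowed` are hypothetical names invented for this example, not unibus or NATS APIs, and matching is by exact subject only (NATS wildcards are ignored).

```go
package main

import (
	"fmt"
	"sort"
)

// Membership maps an endpoint to the subjects of the rooms it belongs to.
// Hypothetical shape; the real store is not shown in this change.
type Membership map[string][]string

// Permissions is a stand-in for a per-connection publish/subscribe allow-list.
type Permissions struct {
	Pub []string
	Sub []string
}

// permissionsFor grants Pub/Sub only on the subjects of rooms the endpoint is in.
func permissionsFor(m Membership, endpoint string) Permissions {
	subjects := append([]string(nil), m[endpoint]...) // copy, do not alias the store
	sort.Strings(subjects)
	return Permissions{Pub: subjects, Sub: subjects}
}

// allowed reports whether subject appears in the allow-list (exact match).
func allowed(list []string, subject string) bool {
	for _, s := range list {
		if s == subject {
			return true
		}
	}
	return false
}

func main() {
	m := Membership{"alice": {"room.secure", "room.ops"}}

	alice := permissionsFor(m, "alice")
	eve := permissionsFor(m, "eve") // registered, but in no room

	fmt.Println("alice may subscribe:", allowed(alice.Sub, "room.secure"))
	fmt.Println("eve may subscribe:  ", allowed(eve.Sub, "room.secure"))
}
```

In a real authenticator the resulting lists would be attached to the user registered for the connection; how that wiring looks in `pkg/busauth` is outside what this document shows.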

The blocker is that **NATS evaluates permissions once, at connect time, and never
re-evaluates them on a live connection.** unibus clients routinely *connect → create
or get invited to a room → publish/subscribe* within the **same** connection
(`TestSecureBusEndToEnd` does exactly this: A connects, then creates `room.secure`,
then publishes to it). Permissions frozen at connect time would not include a room
created or joined afterwards, so the legitimate owner could not publish to the room
it just made. Making per-subject ACLs work would therefore require the client to
**reconnect on every membership change**. That is an invasive change to the client
library and to every peer (worker, chat, mobile), and the prompt for this issue
limits client changes to the minimum.
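
The effect of a connect-time snapshot can be simulated without NATS at all. In the sketch below, `registry`, `session`, `connect`, `createRoom` and `canPublish` are hypothetical names; the only behavior modeled is the one described above, namely that the allow-list is copied once when the session is created.

```go
package main

import "fmt"

// registry is a toy control plane: endpoint -> subjects of its rooms.
type registry struct {
	rooms map[string][]string
}

// session holds the allow-list copied at connect time; it is never refreshed.
type session struct {
	allowed map[string]bool
}

// connect snapshots the endpoint's current room subjects into a new session.
func (r *registry) connect(endpoint string) *session {
	snap := make(map[string]bool)
	for _, s := range r.rooms[endpoint] {
		snap[s] = true
	}
	return &session{allowed: snap}
}

// createRoom records a new room for the endpoint; live sessions are not updated.
func (r *registry) createRoom(endpoint, subject string) {
	r.rooms[endpoint] = append(r.rooms[endpoint], subject)
}

// canPublish checks the frozen snapshot only.
func (s *session) canPublish(subject string) bool {
	return s.allowed[subject]
}

func main() {
	reg := &registry{rooms: map[string][]string{}}

	live := reg.connect("alice")           // connect first
	reg.createRoom("alice", "room.secure") // then create a room on the same session

	fmt.Println("same connection:", live.canPublish("room.secure"))
	fmt.Println("after reconnect:", reg.connect("alice").canPublish("room.secure"))
}
```

The owner is locked out of its own room until it reconnects, which is the client-side change this issue avoids.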

That dynamic-membership reconnection model is precisely the redesign that issue
**0003** (decentralization) already has to do: it moves the control-plane state to a
replicated JetStream KV and reworks how nodes and clients (re)establish sessions. Per
the issue's own guidance ("if a complete strategy does not fit, implement the minimum
defense and document the rest"), the full subject ACL is deferred to 0003, where the
session/permission model is being rebuilt anyway.

## The strategy implemented here: forbid cleartext rooms on public deployments

`Server.RequireEncryptedRooms` (set by `membershipd` on any non-loopback bind)
refuses to create a cleartext (`ModeNATS`) room. Every room on a public deployment
is therefore end-to-end encrypted, so **message content stays confidential even
though the transport offers no subject isolation**: a peer that sniffs another
room's subject receives only AEAD ciphertext it has no key for.
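
To make the "ciphertext it has no key for" property concrete, here is a minimal sketch using AES-256-GCM from the Go standard library. The cipher choice is an assumption made for illustration: this document does not say which AEAD unibus uses or how room keys are distributed, and `randomKey`, `seal` and `open` are hypothetical helpers.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// randomKey returns a fresh 32-byte key (AES-256).
func randomKey() ([]byte, error) {
	k := make([]byte, 32)
	if _, err := rand.Read(k); err != nil {
		return nil, err
	}
	return k, nil
}

// newGCM builds an AES-GCM AEAD for the given key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext under key and returns the random nonce and ciphertext.
func seal(key, plaintext []byte) (nonce, ciphertext []byte, err error) {
	aead, err := newGCM(key)
	if err != nil {
		return nil, nil, err
	}
	nonce = make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, nil, err
	}
	return nonce, aead.Seal(nil, nonce, plaintext, nil), nil
}

// open authenticates and decrypts; it fails if the key does not match.
func open(key, nonce, ciphertext []byte) ([]byte, error) {
	aead, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	return aead.Open(nil, nonce, ciphertext, nil)
}

func main() {
	roomKey, err := randomKey()
	if err != nil {
		panic(err)
	}
	eveKey, err := randomKey() // eve is registered on the bus but holds no room key
	if err != nil {
		panic(err)
	}

	nonce, ct, err := seal(roomKey, []byte("internal: salary numbers"))
	if err != nil {
		panic(err)
	}

	pt, err := open(roomKey, nonce, ct)
	fmt.Printf("member reads: %q (err=%v)\n", pt, err)

	_, err = open(eveKey, nonce, ct)
	fmt.Println("sniffer open failed:", err != nil)
}
```

A subscriber on the raw subject sees `nonce` and `ct` but, lacking the room key, cannot pass the authentication check in `open`.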

This composes with the 0004c control-plane authorization: a non-member cannot even
learn a room's subject through the control plane (`GET /rooms/{id}` → 403), so to
sniff it an attacker must already know or guess the subject out of band.
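
The guard is switched on by `isLoopbackBind`, whose body is not part of this change. One plausible shape is sketched below under the name `looksLoopback` to make clear it is an assumption, not the project's implementation. It treats anything it cannot positively identify as loopback (including an empty host or `0.0.0.0`) as public, so the encrypted-only posture is the default.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// looksLoopback reports whether a bind host refers only to the local machine.
// Hypothetical helper: unknown or unparsable hosts are treated as public.
func looksLoopback(host string) bool {
	host = strings.Trim(host, "[]") // tolerate a bracketed IPv6 literal such as "[::1]"
	if strings.EqualFold(host, "localhost") {
		return true
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	for _, h := range []string{"127.0.0.1", "::1", "localhost", "0.0.0.0", "192.168.1.10", ""} {
		fmt.Printf("%-14q loopback=%v\n", h, looksLoopback(h))
	}
}
```

Failing closed here means a misconfigured bind address can only make the deployment stricter, never silently re-enable cleartext rooms.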

## What this does NOT close (residual exposure, by design)

- **Traffic metadata.** A registered peer that already knows a subject can still
  subscribe and observe that the subject is active, the ciphertext sizes, and the
  timing/cadence of messages. It cannot read content.
- **Cross-room publish.** A registered peer can still *publish* arbitrary bytes on
  any subject. In an encrypted room those bytes fail AEAD open and the signature
  check (`SignMsgs`), so receivers drop them: it is a nuisance/spam vector, not a
  confidentiality or integrity break.
- **WireGuard-only deployments** may still use cleartext rooms (the guard only trips
  on a public bind), because the network already restricts who can reach the bus.

Closing the residual metadata exposure requires the per-subject ACL described above,
tracked for issue 0003.

## Regression evidence

- `pkg/membership`, `TestRequireEncryptedRoomsRejectsCleartext`: with
  `RequireEncryptedRooms` on, `POST /rooms` for a cleartext policy returns 403 while
  an encrypted-room create returns 201.
- `pkg/client`, `TestAudit_NoSubjectACL`: under the public posture, creating a
  `ModeNATS` room fails; alice creates an encrypted room and publishes; eve (a
  registered non-member) raw-subscribes to the subject and receives only ciphertext.
  She never recovers the plaintext.

@@ -32,6 +32,7 @@ type testHarness struct {
 	ns *server.Server
 	httpts *httptest.Server
 	store *membership.Store
+	srv *membership.Server
 }
 
 func freePort(t *testing.T) int {
@@ -98,7 +99,7 @@ func bootHarness(t *testing.T, ctrlMode membership.AuthMode, natsAuth bool, nats
 	srv := membership.NewServer(store, blobs, ctrlMode)
 	httpts := httptest.NewServer(srv)
 
-	h := &testHarness{natsURL: embeddednats.ClientURL(ns), ctrlURL: httpts.URL, ns: ns, httpts: httpts, store: store}
+	h := &testHarness{natsURL: embeddednats.ClientURL(ns), ctrlURL: httpts.URL, ns: ns, httpts: httpts, store: store, srv: srv}
 	t.Cleanup(func() {
 		httpts.Close()
 		store.Close()

@@ -0,0 +1,124 @@
package client_test

import (
	"bytes"
	"sync"
	"testing"
	"time"

	"github.com/enmanuel/unibus/pkg/client"
	"github.com/enmanuel/unibus/pkg/frame"
	"github.com/enmanuel/unibus/pkg/room"
	"github.com/nats-io/nats.go"
)

// TestAudit_NoSubjectACL ports the auditor's H4 (High) finding under the minimum
// defense chosen for this issue (forbid cleartext rooms in public; see
// dev/0004d-dataplane-acl.md). The NATS data plane still has no per-subject ACL,
// so the guarantee we make is CONTENT confidentiality, proven three ways:
//
//	error : a cleartext (ModeNATS) room cannot be created under the public posture;
//	golden: a legitimate member (bob) decrypts the secret;
//	edge  : eve, sniffing the raw subject off the data plane, receives only
//	        ciphertext; she never recovers the plaintext the auditor's eve did.
func TestAudit_NoSubjectACL(t *testing.T) {
	h := newHarness(t)
	h.srv.RequireEncryptedRooms = true // the public posture
	waitHealth(t, h.ctrlURL)

	alice, err := client.New(h.natsURL, h.ctrlURL, mustIdentity(t))
	if err != nil {
		t.Fatalf("connect alice: %v", err)
	}
	defer alice.Close()

	// Error path: a cleartext room is refused, so no payload ever rides a subject
	// in the clear for a sniffer to read (the exact vector the auditor exploited).
	if _, err := alice.CreateRoom("secret.subject.payroll", room.ModeNATS); err == nil {
		t.Fatalf("cleartext room must be refused on a public deployment")
	}

	// alice creates an encrypted room and invites bob (the legitimate reader).
	const subject = "secret.subject.payroll.e2e"
	const secret = "internal: salary numbers"
	roomID, err := alice.CreateRoom(subject, room.ModeMatrix)
	if err != nil {
		t.Fatalf("alice create encrypted room: %v", err)
	}
	bob, err := client.New(h.natsURL, h.ctrlURL, mustIdentity(t))
	if err != nil {
		t.Fatalf("connect bob: %v", err)
	}
	defer bob.Close()
	if err := alice.Invite(roomID, bob.Endpoint()); err != nil {
		t.Fatalf("alice invite bob: %v", err)
	}
	if err := bob.Join(roomID); err != nil {
		t.Fatalf("bob join: %v", err)
	}

	// Golden: bob (a member) subscribes and decrypts the secret.
	var bmu sync.Mutex
	var bobGot []string
	bobSub, err := bob.Subscribe(roomID, func(_ frame.Frame, plaintext []byte) {
		bmu.Lock()
		bobGot = append(bobGot, string(plaintext))
		bmu.Unlock()
	})
	if err != nil {
		t.Fatalf("bob subscribe: %v", err)
	}
	defer bobSub.Unsubscribe()

	// Edge: eve sniffs the raw subject directly off NATS (no membership, no key).
	rawEve, err := nats.Connect(h.natsURL)
	if err != nil {
		t.Fatalf("eve raw connect: %v", err)
	}
	defer rawEve.Close()
	eveGot := make(chan []byte, 8)
	if _, err := rawEve.Subscribe(subject, func(m *nats.Msg) { eveGot <- m.Data }); err != nil {
		t.Fatalf("eve raw subscribe: %v", err)
	}
	if err := rawEve.Flush(); err != nil {
		t.Fatalf("eve flush: %v", err)
	}
	time.Sleep(200 * time.Millisecond) // let both subscriptions settle

	if err := alice.Publish(roomID, []byte(secret)); err != nil {
		t.Fatalf("alice publish: %v", err)
	}

	// bob must decrypt the secret.
	if !waitFor(&bmu, &bobGot, func(rs []string) bool {
		for _, r := range rs {
			if r == secret {
				return true
			}
		}
		return false
	}, 2*time.Second) {
		t.Fatalf("bob (member) should decrypt the secret; got %v", snapshot(&bmu, &bobGot))
	}

	// eve must receive only ciphertext, never the plaintext.
	select {
	case data := <-eveGot:
		if bytes.Contains(data, []byte(secret)) {
			t.Fatalf("eve sniffed the plaintext off the data plane: %q", data)
		}
		f, err := frame.Unmarshal(data)
		if err != nil {
			t.Fatalf("eve received an undecodable frame: %v", err)
		}
		if string(f.Payload) == secret {
			t.Fatalf("eve read the secret from the frame payload")
		}
		if len(f.Nonce) == 0 {
			t.Fatalf("expected an AEAD-encrypted payload (non-empty nonce), got cleartext frame")
		}
	case <-time.After(2 * time.Second):
		// eve receiving nothing is also a safe outcome; the assertion is only that
		// she never gets the plaintext, which holds vacuously here.
	}
}

@@ -0,0 +1,46 @@
package membership

import (
	"bytes"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"testing"
)

// TestRequireEncryptedRoomsRejectsCleartext is the control-plane half of the
// audit H4 minimum defense: with RequireEncryptedRooms on (the public posture),
// creating a cleartext (ModeNATS) room is refused 403, while an encrypted room is
// created normally. This is what guarantees no message ever rides the un-ACL'd
// NATS subject in the clear on a public deployment.
func TestRequireEncryptedRoomsRejectsCleartext(t *testing.T) {
	srv := dosServer(t, AuthOff)
	srv.RequireEncryptedRooms = true

	create := func(encrypt bool) int {
		body, _ := json.Marshal(createRoomReq{
			Subject: "payroll.subject",
			Policy: policyJSON{Encrypt: encrypt, Persist: encrypt, SignMsgs: encrypt},
			Owner: endpointJSON{Endpoint: "owner-ep", SignPub: []byte("sp"), KexPub: []byte("kp")},
			SealedKeySelf: []byte("sealed"),
		})
		rec := httptest.NewRecorder()
		srv.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/rooms", bytes.NewReader(body)))
		return rec.Code
	}

	// Error path: a cleartext room is refused.
	if code := create(false); code != http.StatusForbidden {
		t.Fatalf("cleartext room under RequireEncryptedRooms should be 403, got %d", code)
	}
	// Golden: an encrypted room is created.
	if code := create(true); code != http.StatusCreated {
		t.Fatalf("encrypted room should be 201, got %d", code)
	}

	// Edge: with the flag OFF (loopback/dev), cleartext rooms are allowed again.
	srv.RequireEncryptedRooms = false
	if code := create(false); code != http.StatusCreated {
		t.Fatalf("cleartext room with the flag off should be 201, got %d", code)
	}
}

@@ -61,6 +61,16 @@ type Server struct {
 	authMode AuthMode
 	nonces *nonceCache
 	limiter *ipRateLimiter
+
+	// RequireEncryptedRooms, when true, refuses to create cleartext (ModeNATS)
+	// rooms. It is the minimum-defensive control for the data plane (audit H4):
+	// the embedded NATS has no per-subject ACL, so a cleartext room is readable by
+	// any registered peer that knows (or guesses) its subject. Forcing every room
+	// to be end-to-end encrypted keeps message CONTENT confidential even when the
+	// transport offers no subject isolation. The command sets this on a public
+	// (non-loopback) bind. See dev/0004d-dataplane-acl.md for the full rationale
+	// and the residual metadata exposure this does NOT close.
+	RequireEncryptedRooms bool
 }
 
 // NewServer wires the membership store and blob store into an http.Handler. The
@@ -341,6 +351,14 @@ func (s *Server) handleCreateRoom(w http.ResponseWriter, r *http.Request) {
 		writeErr(w, http.StatusBadRequest, "subject and owner.endpoint required")
 		return
 	}
+	// Data-plane minimum defense (audit H4): on a public deployment cleartext
+	// rooms are disabled, so no message ever rides the un-ACL'd NATS subject in
+	// the clear for another registered peer to sniff.
+	if s.RequireEncryptedRooms && !req.Policy.Encrypt {
+		writeErr(w, http.StatusForbidden,
+			"cleartext rooms are disabled on this deployment; create an encrypted (Matrix-policy) room")
+		return
+	}
 	roomID := newULID()
 	info := RoomInfo{
 		RoomID: roomID,