Ports the auditor's TestAudit_DoSBodyLimitNoAuth: an unsigned oversized POST
to /blobs is now rejected 413 without the resident set spiking (measured via
/proc/self/status, delta bounded to <96 MiB vs the attack's 400 MB+). Covers
both a truthful over-ceiling Content-Length (rejected pre-read) and a chunked
unknown-length sender (MaxBytesReader caps the read). Plus golden (normal blob
stored), boundary (exactly at the ceiling accepted), the 1 MiB control-plane
ceiling, and the per-IP rate limit (flood -> 429, distinct IPs not throttled).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pre-auth DoS hardening (audit H1, Critical). The control-plane middleware
read the request body with io.ReadAll before authenticating and with no size
cap, so an unauthenticated peer could force the server to buffer an arbitrary
body in RAM (the auditor sent 400 MB and watched RSS climb to ~898 MB).
- ServeHTTP now caps the buffered body before reading: a per-route ceiling
(1 MiB JSON, 16 MiB /blobs) rejects an over-declared Content-Length outright,
and the body is wrapped in http.MaxBytesReader so a lying/chunked sender trips
at the ceiling instead of being read without bound.
- handlePutBlob maps the MaxBytesReader cutoff to 413 in every auth mode.
- Per-IP token-bucket rate limiter (golang.org/x/time/rate, already in the
module graph) sheds floods before auth or body reads. Loopback dev stacks are
unaffected (burst >> any single client's rate). Kept in-package as transport
glue, not promoted to the registry, mirroring the nonceCache decision in 0003.
- membershipd sets http.Server.MaxHeaderBytes and ReadHeaderTimeout.
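
A minimal sketch of the two-step body cap described above, using only the
standard library. The names ceilingFor, capBody, readAll and statusFor are
hypothetical and the route matching is simplified; the actual ServeHTTP and
handlePutBlob code may differ.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// ceilingFor returns a per-route body ceiling in bytes (assumed routing rule).
func ceilingFor(path string) int64 {
	if strings.HasPrefix(path, "/blobs") {
		return 16 << 20 // 16 MiB for blob uploads
	}
	return 1 << 20 // 1 MiB for JSON control-plane routes
}

// capBody rejects over-declared bodies up front and bounds the actual read.
func capBody(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		limit := ceilingFor(r.URL.Path)
		// A truthful over-ceiling Content-Length is refused before any read.
		if r.ContentLength > limit {
			http.Error(w, "body too large", http.StatusRequestEntityTooLarge)
			return
		}
		// Chunked or lying senders fail once more than limit bytes are read.
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		next.ServeHTTP(w, r)
	})
}

// readAll stands in for a handler that buffers the (now bounded) body.
func readAll(w http.ResponseWriter, r *http.Request) {
	if _, err := io.ReadAll(r.Body); err != nil {
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusNoContent)
}

// statusFor posts n bytes to path; chunked hides the length from the server.
func statusFor(path string, n int, chunked bool) int {
	body := strings.NewReader(strings.Repeat("x", n))
	req := httptest.NewRequest(http.MethodPost, path, body)
	if chunked {
		req.ContentLength = -1 // unknown length, as with chunked encoding
	}
	rec := httptest.NewRecorder()
	capBody(http.HandlerFunc(readAll)).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("small JSON body:", statusFor("/rooms", 64, false))
	fmt.Println("declared over ceiling:", statusFor("/rooms", (1<<20)+1, false))
	fmt.Println("chunked over ceiling:", statusFor("/rooms", (1<<20)+1, true))
	fmt.Println("exactly at ceiling:", statusFor("/rooms", 1<<20, false))
}
```

The Content-Length check costs nothing for honest clients, while
MaxBytesReader is the actual guarantee, since a declared length cannot be
trusted.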
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Specs for the next three bus issues, derived from this session:
- 0002 media v2: chunking, mimetype, object store GC, exposure in clients.
- 0003 decentralization/HA: NATS cluster magnus+homer (R1→R3), control plane
SQLite→JetStream KV, quorum, failover. Third node = homer (141.94.69.66).
- 0004 hardening: closes the findings of the red-team audit (report 0004):
pre-auth DoS, fail-open, membership-based authorization, NATS ACLs, control-plane TLS.
Phase 0001e of issue 0001. client.Connect(caPath) is the single seam every
peer uses: with the bundled ca.crt it connects with TLS + nkey and signs the
control plane (enforce); without it, legacy plaintext dev. worker/chat gain
--ca, the mobile NewSession gains caPath, membershipd gains --tls-cert/--tls-key
and turns on the nkey authenticator under enforce. dev/feature_flags.json
declares the target state (bus-auth enforce, bus-tls on); the gateway and
unibots migrations are documented as notes (dev/0001e-remaining-clients.md).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
TestSecureBusEndToEnd boots the server with control-plane enforce, NATS nkey
auth, and TLS all on; two registered peers connect with nkey+TLS, A creates a
Matrix room, invites B, publishes, and B decrypts — proving the three layers
compose. This is the headline golden of issue 0001.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The web gateway (playground) and unibots (in the agents repo) are not migrated
here: the gateway stays a local dev tool at AuthOff, and the bot transport
lives outside this sub-repo. dev/0001e-remaining-clients.md records exactly
what each needs (client.Connect with ca.crt, identity registered in the
allowlist) and the operator server flags for phase 0001f.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Declares the project's target rollout: bus-auth enforce, bus-tls enabled.
Flags are declarative; the operator activates them at deploy via membershipd
--bus-auth/--tls-cert/--tls-key. CLI defaults stay off so dev and tests run
unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Opens the store before NATS so the authenticator can consult IsAuthorized.
Under --bus-auth enforce the embedded NATS gets the nkey authenticator (only
allowlisted identities connect); --tls-cert/--tls-key make it present the
server certificate and require TLS. External NATS manages its own auth/TLS.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
client.Connect is the single migration seam: a non-empty caPath connects with
TLS pinned to the bus CA plus nkey auth (matching enforce + bus-tls), an empty
caPath keeps the legacy plaintext dev connection; control-plane requests are
signed either way. worker and chat gain a --ca flag; the gomobile NewSession
gains a caPath parameter so the Android app bundles ca.crt and connects
securely. Every peer now flows through one code path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 0001d of issue 0001. embeddednats grows a ServerConfig with an optional
TLS config; the client can pin the bus's self-signed CA via Options.TLS built
from busauth.LoadCATLSConfig. deploy/tls/generate-certs.sh mints the CA and a
server cert (SAN: public IP, WG IP, om, localhost) — only the public ca.crt is
versioned, private keys are gitignored. A client trusting the CA completes the
handshake; one without it fails. TLS stays off until phase 0001e wires it in.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
client/tls_test: mints a throwaway CA + server cert in-memory; a client
pinning the CA completes the handshake and operates (golden), a client
without the CA fails the handshake (error path). busauth/tls_test: golden
load of a CA PEM and a server keypair, plus error paths (missing file,
non-PEM). Harness body extracted to bootHarness(ctrlMode, natsAuth, natsTLS).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
generate-certs.sh mints the bus CA and a NATS server certificate whose SANs
cover the public IP (135.125.201.30), the WireGuard IP (10.42.0.1), the om
hostname, and localhost/127.0.0.1 for on-host smoke tests (all overridable via
env). Only the public ca.crt is committed; ca.key, server.key and server.crt
are gitignored and distributed out of band. README documents generation, use
and rotation.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
busauth.LoadCATLSConfig turns a ca.crt path into a *tls.Config trusting only
that private CA (clients must pin it; the system roots would reject a
self-signed server cert). busauth.ServerTLSConfig loads the server keypair.
client.Options gains TLS; NewWithOptions calls nats.Secure when set, so the
data-plane connection is encrypted and the server pinned.
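
A rough sketch of how a CA-pinning config like this can be built with the
standard library. The function names below are hypothetical stand-ins, not
the busauth API, and throwawayCAPEM exists only so the example can run
without a file on disk.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"os"
	"time"
)

// caTLSConfig builds a client config that trusts only the CA(s) in caPEM.
// Setting RootCAs replaces the system roots for this connection.
func caTLSConfig(caPEM []byte) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificate found in PEM input")
	}
	return &tls.Config{RootCAs: pool, MinVersion: tls.VersionTLS12}, nil
}

// loadCATLSConfig reads a ca.crt path and delegates to caTLSConfig.
func loadCATLSConfig(path string) (*tls.Config, error) {
	pemBytes, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read CA file: %w", err)
	}
	return caTLSConfig(pemBytes)
}

// throwawayCAPEM mints a short-lived self-signed CA purely for the demo.
func throwawayCAPEM() ([]byte, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "throwaway bus CA"},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, pub, priv)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	caPEM, err := throwawayCAPEM()
	if err != nil {
		fmt.Println("mint CA:", err)
		return
	}
	cfg, err := caTLSConfig(caPEM)
	fmt.Println("golden: pool set =", cfg != nil && cfg.RootCAs != nil, "err =", err)
	_, err = caTLSConfig([]byte("not a certificate"))
	fmt.Println("non-PEM:", err)
	_, err = loadCATLSConfig("does-not-exist.crt")
	fmt.Println("missing file:", err)
}
```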
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Collapses Start/StartHost/StartHostAuth onto StartServer(ServerConfig) so
auth and a TLS config can be set without growing the parameter list further.
When TLS is set the server presents the certificate and requires TLS on the
data plane; the wrappers preserve the existing no-auth/no-TLS behavior.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 0001c of issue 0001. The data plane now authenticates with each peer's
Ed25519 identity reused as a NATS nkey: busauth converts the identity to an
nkey and back, embeddednats installs a CustomClientAuthentication that verifies
the nkey signature and checks the user allowlist on every connection (live
revocation, no restart), and the client opts into nkey via NewWithOptions. The
embedded server stays open by default so dev stacks and existing tests are
unaffected.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Harness gains newHarnessFull(ctrlMode, natsAuth) wiring the nkey authenticator
to the user allowlist; NATS auth and HTTP auth are independent so each plane
is tested in isolation. TestNatsNkeyAuth: registered peer connects with nkey
and operates (golden); unregistered peer and no-nkey client refused at connect
(error paths); peer revoked at runtime refused on its next connection without
a restart (edge).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
nats.go refuses to connect with an nkey to a server that does not advertise
nkey auth, so the client cannot unconditionally present one. New keeps the
legacy plain connection; NewWithOptions(Options{UseNkey:true}) presents the
peer's identity-derived nkey. NewWithOptions is the single place the data-plane
connection is built, so every peer gets identical behavior from the same
Options (TLS fields arrive in phase 0001d).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
busauth.NewNkeyAuthenticator verifies a client's nkey signature over the
server nonce (decoding like nats-server: raw-url then std base64), maps the
nkey to its Ed25519 hex, and consults an injected IsAuthorized predicate.
Checking on every connection (rather than a static Options.Nkeys map) means
revoking a user denies its next connection with no restart. embeddednats
gains StartHostAuth(auth) and sets AlwaysEnableNonce so the server advertises
the nonce nkey clients need; Start/StartHost stay open (auth=nil) for dev.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A NATS nkey is an Ed25519 keypair, so the bus reuses each peer's signing
identity for the data plane instead of minting new key material. ClientNkey
derives the user nkey public string and a nonce-signing callback from the
peer's Ed25519 private key (its first 32 bytes are the nkey seed);
SignPubHexFromNkey maps a presented nkey back to the allowlist's hex key;
NkeyPublicFromSignPub is the public-only derivation.
This is NATS-specific transport glue kept in the app, not promoted to the
registry, to avoid pulling nats-io/nkeys into the multi-domain registry
module. The dedicated round-trip test runs first (spec requirement): it
proves the nkey signature equals the identity's raw Ed25519 signature and
that the nkey maps back to the identity's hex.
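
The seed relationship can be checked with the standard library alone. The
sketch below is illustrative: it omits the nkey text encoding (handled by
nats-io/nkeys in the real code) and only shows that a key rebuilt from the
first 32 bytes signs identically. All function names are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"crypto/sha256"
	"fmt"
)

// demoIdentity derives a fixed key so the example is repeatable.
// A real peer identity would come from secure random key generation.
func demoIdentity() ed25519.PrivateKey {
	seed := sha256.Sum256([]byte("demo identity, not a real key"))
	return ed25519.NewKeyFromSeed(seed[:])
}

// seedIsPrefix reports whether the seed equals the first 32 key bytes.
func seedIsPrefix(priv ed25519.PrivateKey) bool {
	return bytes.Equal(priv.Seed(), priv[:ed25519.SeedSize])
}

// sameSignature reports whether a key rebuilt from the seed signs msg
// exactly like the original identity key (Ed25519 is deterministic).
func sameSignature(priv ed25519.PrivateKey, msg []byte) bool {
	rebuilt := ed25519.NewKeyFromSeed(priv.Seed())
	return bytes.Equal(ed25519.Sign(priv, msg), ed25519.Sign(rebuilt, msg))
}

func main() {
	priv := demoIdentity()
	fmt.Println("seed is key prefix:", seedIsPrefix(priv))
	fmt.Println("signatures match:", sameSignature(priv, []byte("server nonce")))
}
```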
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 0001b of issue 0001. The control plane (membershipd HTTP) now supports
the bus-auth rollout off->soft->enforce: clients sign every request with their
Ed25519 identity (headers over method/path/ts/nonce/sha256(body)); the server
verifies the signature, clock skew (+/-30s), nonce replay (60s TTL cache), and
the user allowlist. Revocation denies access on the next request without a
restart. Default stays off so master keeps working.
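
The skew and replay checks can be sketched roughly as follows. The type and
function names are hypothetical, and times are passed as Unix seconds to keep
the example deterministic; the real middleware may structure this differently.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	maxSkewSec  = 30 // accepted clock difference in either direction
	nonceTTLSec = 60 // how long a seen nonce is remembered
)

// withinSkew reports whether tsUnix is within maxSkewSec of nowUnix.
func withinSkew(nowUnix, tsUnix int64) bool {
	d := nowUnix - tsUnix
	if d < 0 {
		d = -d
	}
	return d <= maxSkewSec
}

// nonceCache remembers nonces until they expire.
type nonceCache struct {
	mu     sync.Mutex
	expiry map[string]int64 // nonce -> Unix second at which it is forgotten
}

func newNonceCache() *nonceCache {
	return &nonceCache{expiry: make(map[string]int64)}
}

// firstUse records nonce and returns true, or returns false on a replay.
func (c *nonceCache) firstUse(nonce string, nowUnix int64) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	// Drop expired entries so the map does not grow without bound.
	for n, exp := range c.expiry {
		if nowUnix >= exp {
			delete(c.expiry, n)
		}
	}
	if _, replay := c.expiry[nonce]; replay {
		return false
	}
	c.expiry[nonce] = nowUnix + nonceTTLSec
	return true
}

func main() {
	now := time.Now().Unix()
	cache := newNonceCache()
	fmt.Println("fresh timestamp ok:", withinSkew(now, now-5))
	fmt.Println("stale timestamp ok:", withinSkew(now, now-120))
	fmt.Println("first use of nonce:", cache.firstUse("abc", now))
	fmt.Println("replay of nonce:", cache.firstUse("abc", now+1))
}
```

One plausible reading of the 60s TTL is that it spans the full +/-30s
acceptance window, so a nonce is never forgotten while its timestamp could
still pass the skew check.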
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
membership/auth_test: golden (signed+registered accepted), error paths
(unregistered 401, replayed nonce 401, clock skew 401, tampered body 401,
missing headers 401), exemptions (healthz, soft allows, off no-op).
client_test: end-to-end with the real client against an enforce server —
registered peer accepted, unregistered rejected, revoked peer denied without
a server restart.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The local dev gateway has not adopted signed requests; tracked for phase
0001e. Keeps it working while the NewServer signature gains the auth mode.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Maps off|soft|enforce to membership.AuthMode and wires it into NewServer.
Defaults to off so existing deployments are unaffected until the operator
opts into the rollout.
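
As an illustration, the flag mapping and the per-mode admission rule might
look like the sketch below. AuthOff appears elsewhere in this history; the
other constant names, parseAuthMode and admit are assumptions, not the
membership package's actual API.

```go
package main

import "fmt"

// AuthMode is the rollout state of the control-plane auth middleware.
type AuthMode int

const (
	AuthOff     AuthMode = iota // legacy no-op default
	AuthSoft                    // verify and log, but let requests through
	AuthEnforce                 // reject anything that fails verification
)

// parseAuthMode maps the flag value to a mode, refusing unknown strings.
func parseAuthMode(s string) (AuthMode, error) {
	switch s {
	case "off":
		return AuthOff, nil
	case "soft":
		return AuthSoft, nil
	case "enforce":
		return AuthEnforce, nil
	}
	return AuthOff, fmt.Errorf("unknown bus-auth mode %q (want off|soft|enforce)", s)
}

// admit reports whether a request proceeds and whether a rejection is logged.
func admit(mode AuthMode, verified bool) (allow, logReject bool) {
	switch mode {
	case AuthSoft:
		return true, !verified
	case AuthEnforce:
		return verified, !verified
	default: // AuthOff: no verification result is consulted
		return true, false
	}
}

func main() {
	for _, s := range []string{"off", "soft", "enforce", "strict"} {
		mode, err := parseAuthMode(s)
		if err != nil {
			fmt.Println(err)
			continue
		}
		allow, logged := admit(mode, false)
		fmt.Printf("%s: unverified request allowed=%v logged=%v\n", s, allow, logged)
	}
}
```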
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
doJSON, putBlob and getBlob now go through newSignedRequest, which attaches
X-Unibus-Pub/Ts/Nonce/Sig signing membership.CanonicalRequest with the peer's
Ed25519 key. GETs are signed too so the server can authenticate the caller
uniformly under enforce. The payload-level owner signature (invite/rekey)
is unchanged and coexists with this transport-level signature.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds the bus-auth rollout (off|soft|enforce) to the control plane. The
middleware verifies an Ed25519 request signature over CanonicalRequest
(method, request-URI, ts, nonce, sha256(body)), checks the timestamp is
within +/-30s, rejects replayed nonces via an in-memory TTL cache (60s), and
requires the signer to be an active user in the allowlist. soft logs
rejections but lets requests through so clients can migrate without an
outage; off is the legacy no-op default. /healthz is exempt so health probes
work before any identity exists. CanonicalRequest is exported as the single
source of truth shared with the client.
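
To make the signing scheme concrete, here is a small sketch of a canonical
request plus sign and verify helpers. The field order follows the text above;
the newline separator, hex encodings and all function names are assumptions,
so this is not necessarily how membership.CanonicalRequest serializes.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// canonicalRequest joins the signed fields; the body enters only as a hash.
func canonicalRequest(method, requestURI, ts, nonce string, body []byte) []byte {
	sum := sha256.Sum256(body)
	fields := []string{method, requestURI, ts, nonce, hex.EncodeToString(sum[:])}
	return []byte(strings.Join(fields, "\n"))
}

// sign returns the hex signature a client would attach as a header.
func sign(priv ed25519.PrivateKey, method, uri, ts, nonce string, body []byte) string {
	return hex.EncodeToString(ed25519.Sign(priv, canonicalRequest(method, uri, ts, nonce, body)))
}

// verify rebuilds the canonical bytes server-side and checks the signature.
func verify(pubHex, sigHex, method, uri, ts, nonce string, body []byte) bool {
	pub, err := hex.DecodeString(pubHex)
	if err != nil || len(pub) != ed25519.PublicKeySize {
		return false // malformed key: fail closed
	}
	sig, err := hex.DecodeString(sigHex)
	if err != nil {
		return false
	}
	return ed25519.Verify(ed25519.PublicKey(pub), canonicalRequest(method, uri, ts, nonce, body), sig)
}

// demoKey derives a fixed key so the example is repeatable (demo only).
func demoKey() (pubHex string, priv ed25519.PrivateKey) {
	seed := sha256.Sum256([]byte("demo seed, not a real identity"))
	priv = ed25519.NewKeyFromSeed(seed[:])
	return hex.EncodeToString(priv.Public().(ed25519.PublicKey)), priv
}

func main() {
	pub, priv := demoKey()
	body := []byte(`{"subject":"demo"}`)
	sig := sign(priv, "POST", "/rooms", "1700000000", "nonce-1", body)
	fmt.Println("valid:", verify(pub, sig, "POST", "/rooms", "1700000000", "nonce-1", body))
	fmt.Println("tampered body:", verify(pub, sig, "POST", "/rooms", "1700000000", "nonce-1", []byte("{}")))
}
```

Because client and server call the same canonicalization function, any change
to method, URI, timestamp, nonce or body invalidates the signature.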
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 0001a of issue 0001 (bus auth + TLS). Adds the users table, store CRUD
(AddUser/GetUser/ListUsers/RevokeUser/IsAuthorized/HasAdmin), the local
'membershipd user' admin CLI for seeding the first admin, and the bus-auth /
bus-tls feature flags (both off). No behavior change yet: the allowlist is
not consulted until phase 0001b wires the control-plane middleware.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Golden: add -> get -> IsAuthorized true, admin seeded. Edge: empty role
defaults to member, case-insensitive hex lookup, list ordering, revoke
denies authorization and stamps revoked_at. Error: duplicate key
(ErrUserExists), invalid role, empty sign_pub, unknown user not authorized,
revoke of unknown/already-revoked. Plus users-table migration idempotency.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bus-auth carries the off -> soft -> enforce rollout state; bus-tls is a
boolean. Both start disabled so master keeps compiling and passing tests
while the auth/TLS code lands behind them across phases 0001a-0001e.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Local administration surface for the user allowlist, dispatched before the
server flag set parses os.Args. It opens the SQLite store directly with no
network or auth: running on the bus host is trusted by design, which is how
the first admin is seeded (breaking the chicken-and-egg problem of needing an
admin to add an admin). Validates that sign-pub is a 32-byte Ed25519 key in hex and
tolerates the sign-pub positional appearing before or after --db.
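
The sign-pub validation could be as small as the sketch below.
normalizeSignPub is a hypothetical name, and lowercasing the key is an
assumption made here to match the case-insensitive lookup mentioned in the
store tests.

```go
package main

import (
	"crypto/ed25519"
	"encoding/hex"
	"fmt"
	"strings"
)

// normalizeSignPub checks that s is a hex-encoded 32-byte Ed25519 public key
// and returns it lowercased.
func normalizeSignPub(s string) (string, error) {
	s = strings.ToLower(strings.TrimSpace(s))
	raw, err := hex.DecodeString(s)
	if err != nil {
		return "", fmt.Errorf("sign-pub is not hex: %w", err)
	}
	if len(raw) != ed25519.PublicKeySize {
		return "", fmt.Errorf("sign-pub must be %d bytes, got %d", ed25519.PublicKeySize, len(raw))
	}
	return s, nil
}

func main() {
	for _, in := range []string{strings.Repeat("AB", 32), "abcd", "not-hex"} {
		out, err := normalizeSignPub(in)
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```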
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bus-level user allowlist (issue 0001a): the authoritative directory of
Ed25519 signing identities permitted to use the bus, independent of room
membership. Migration is additive and mirrored byte-for-byte between the
module-root migrations/ and the embedded pkg/membership/migrations/.
Store adds AddUser/GetUser/ListUsers/RevokeUser/IsAuthorized/HasAdmin.
IsAuthorized is the single fail-closed predicate both the control plane and
the NATS data plane will consult, so revocation is a status flip that denies
access on both without a restart.
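
An in-memory stand-in can illustrate the fail-closed predicate. The real
store is SQLite-backed; the allowlist type and its map layout below are
hypothetical and exist only to show the semantics (unknown or revoked keys
are denied, and revocation takes effect on the next check).

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

type userRow struct {
	role    string
	revoked bool
}

// allowlist is a toy replacement for the users table.
type allowlist struct {
	mu    sync.RWMutex
	users map[string]userRow // lowercase hex sign_pub -> row
}

func newAllowlist() *allowlist {
	return &allowlist{users: make(map[string]userRow)}
}

// AddUser registers a key; an empty role defaults to member.
func (a *allowlist) AddUser(signPub, role string) {
	if role == "" {
		role = "member"
	}
	a.mu.Lock()
	defer a.mu.Unlock()
	a.users[strings.ToLower(signPub)] = userRow{role: role}
}

// RevokeUser flips the status; it reports false for unknown or
// already-revoked users.
func (a *allowlist) RevokeUser(signPub string) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	key := strings.ToLower(signPub)
	u, ok := a.users[key]
	if !ok || u.revoked {
		return false
	}
	u.revoked = true
	a.users[key] = u
	return true
}

// IsAuthorized is fail-closed: only a known, non-revoked key passes.
func (a *allowlist) IsAuthorized(signPub string) bool {
	a.mu.RLock()
	defer a.mu.RUnlock()
	u, ok := a.users[strings.ToLower(signPub)]
	return ok && !u.revoked
}

func main() {
	list := newAllowlist()
	list.AddUser("AB12", "admin")
	fmt.Println("registered:", list.IsAuthorized("ab12"))
	fmt.Println("unknown:", list.IsAuthorized("ffff"))
	list.RevokeUser("ab12")
	fmt.Println("after revoke:", list.IsAuthorized("ab12"))
}
```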
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Operator decision: the bus is exposed to the internet, protected by auth+TLS
(WireGuard becomes one more path, not the barrier). ufw on om opens 8470/4250;
the server cert carries SANs for the public IP 135.125.201.30, the WG IP
10.42.0.1, and the hostname; controlled clients embed the project's own ca.crt
(no Let's Encrypt). The human runs deployment phase 0001f; the agent delivers
0001a-0001e.
Design of the bus's three security layers so that WireGuard becomes optional:
a users table (Ed25519 allowlist with roles/revocation), an Ed25519 signature
+ anti-replay middleware on the control plane (generalizing the existing
signRequest/verifyOwnerSig), and hardened NATS with CustomClientAuthentication
(nkey over the peer's identity, dynamic revocation) + TLS with a private CA.
Includes 6 TBD phases behind the bus-auth feature flag (off->soft->enforce),
client migration (pkg/client centralizes the change), a deployment plan for om,
and a test matrix (golden/edge/error).
Basis for Matrix-out of agents_and_robots: a bot discovers by polling the
encrypted rooms it was invited to. Additive, tests green. Bump 0.4.0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A peer invited to an encrypted room needs to find it: the control plane is
pull-based (no server push of invitations), so add a discovery endpoint that
lists every room an endpoint belongs to, with the room's metadata and the
endpoint's role.
- store.ListRoomsForEndpoint: JOIN members+rooms, ordered by room id, empty
slice (not error) for an endpoint in no rooms.
- membershipd: GET /members/{endpoint}/rooms returns {room_id, subject, epoch,
policy, role}[].
- client.ListMyRooms + RoomRef: a bot polls this to discover and then Join +
Subscribe rooms it was invited to.
Tests: store-level (owner in N rooms, member in one, unknown endpoint → []) and
client-level e2e through the embedded harness (B discovers a room A invited it
to, without prior knowledge of the room id; owner sees role=owner).
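
The "empty slice, not error" choice matters on the wire: in Go a nil slice
encodes as JSON null, while an empty non-nil slice encodes as []. The sketch
below uses the JSON keys listed above; the Go field types and the roomsFor
helper (a map lookup standing in for the SQL JOIN) are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// RoomRef mirrors the listed response fields (field types are assumed).
type RoomRef struct {
	RoomID  string `json:"room_id"`
	Subject string `json:"subject"`
	Epoch   int    `json:"epoch"`
	Policy  string `json:"policy"`
	Role    string `json:"role"`
}

// roomsFor always returns a non-nil slice, ordered by room id.
func roomsFor(index map[string][]RoomRef, endpoint string) []RoomRef {
	out := make([]RoomRef, 0, len(index[endpoint]))
	out = append(out, index[endpoint]...)
	sort.Slice(out, func(i, j int) bool { return out[i].RoomID < out[j].RoomID })
	return out
}

// encode renders the slice as the JSON body a client would receive.
func encode(rooms []RoomRef) string {
	b, err := json.Marshal(rooms)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	index := map[string][]RoomRef{
		"bot": {{RoomID: "room-b", Role: "member"}, {RoomID: "room-a", Role: "owner"}},
	}
	fmt.Println("known endpoint:", encode(roomsFor(index, "bot")))
	fmt.Println("unknown endpoint:", encode(roomsFor(index, "nobody")))
	fmt.Println("nil slice for contrast:", encode(nil))
}
```

A polling bot can then iterate over the response unconditionally, without a
special case for null.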
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Chat SPA (web/, React+Vite+Mantine v9) against the gateway, and a native
Android app (android/, Kotlin+Compose) on top of the gomobile binding, with
E2E on the device. Extends the binding (Card/Invite/Kick) and the gateway
(rooms/members + CORS). Verified end-to-end: live encrypted chat between two
web tabs and send/receive on the Pixel_API34 AVD. See reports/0002.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Native mobile client: embeds a real bus peer (unibus.aar), so E2E encryption
and the NATS transport run on the device.
- Connection: Host (control plane) + NATS (data plane) + identity; defaults to
10.0.2.2 for the emulator, configurable (no hardcoded IPs).
- BusViewModel: the binding's network calls run on Dispatchers.IO; incoming
frames (FrameListener.onFrame, NATS thread) are published to a thread-safe
StateFlow that Compose collects on the main thread.
- Chat: create/join room (encryption toggle), send, receive.
- The .aar is a build artifact (gitignored); regenerate it with gomobile bind
(README).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Web client on top of the gateway (REST + SSE). The browser speaks neither NATS
nor crypto: the gateway's Go peer does.
- Connection screen: gateway URL + identity (persisted in localStorage).
- Navbar: create room (with E2E encryption toggle), join by id, room list.
- Center: live messages over SSE, bubbles with author and time, composer.
- Sidebar: members (owner role), invite by connected peer, kick (owner).
- Mantine v9 (createTheme + MantineProvider), @tabler/icons-react, layout with
AppShell/Stack/Group; no Tailwind or hand-written CSS. React 19 (peer dep of v9).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Matures the web gateway to serve an SPA on another origin:
- GET /api/rooms?peer=: rooms a peer knows about (created or joined).
- GET /api/members?room_id=: proxy to the control plane (endpoint + role).
- withCORS: middleware with OPTIONS preflight and permissive headers for the
Vite dev server (same network trust model as the control plane).
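
A permissive CORS wrapper of this kind is typically only a few lines. The
sketch below is a guess at its general shape; the exact header values the
gateway sends are not stated here, so the ones shown are assumptions, and
probe is a hypothetical helper for exercising the middleware.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withCORS adds permissive CORS headers and answers preflight requests
// itself, so OPTIONS never reaches the wrapped handler.
func withCORS(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Access-Control-Allow-Origin", "*")
		h.Set("Access-Control-Allow-Methods", "GET, POST, OPTIONS")
		h.Set("Access-Control-Allow-Headers", "Content-Type")
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// probe sends one request through the middleware and reports the status,
// the Allow-Origin header, and whether the inner handler ran.
func probe(method string) (status int, allowOrigin string, reached bool) {
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		reached = true
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(method, "/api/rooms?peer=alice", nil)
	withCORS(inner).ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get("Access-Control-Allow-Origin"), reached
}

func main() {
	for _, m := range []string{http.MethodOptions, http.MethodGet} {
		status, origin, reached := probe(m)
		fmt.Printf("%s: status=%d allow-origin=%q handler-ran=%v\n", m, status, origin, reached)
	}
}
```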
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds to the flat binding over pkg/client:
- Card(): exports the peer's public identity (id + sign_pub + kex_pub) as
portable JSON, for peer-to-peer exchange (paste/QR) without a gateway.
- Invite(roomID, peerCard): parses a Card and seals the room key to the
invitee (delegates to client.Invite).
- Kick(roomID, endpointID): removes the member and rotates the key (forward
secrecy).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Chat bots need replies, threads and reactions. Add two optional, omitempty
envelope fields (ThreadID, ReplyTo) plus a REACT frame type. The fields ride the
cleartext envelope (message-id references, not secret content) and are omitted
when unset, so non-threaded frames are byte-for-byte identical on the wire and
their signatures unchanged — a non-breaking, additive change.
Client gains PublishReply (threaded reply) and React (emoji reaction). The
reaction content travels in the payload, so it is sealed like any message and
stays confidential in E2E rooms; receivers dispatch on Frame.Type == REACT and
read Frame.ReplyTo for the target. Publish is refactored to share one
publishFrame path with the new helpers; its behavior is unchanged.
Tests: frame round-trip of a threaded REACT frame (golden), non-threaded
wire/sig back-compat asserting thr/re keys are absent (edge), Unmarshal of
garbage errors (error path), and an end-to-end reply+reaction round-trip in an
encrypted ModeMatrix room.
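
The back-compat property follows from omitempty and can be demonstrated in
isolation. In this sketch only the thr and re keys come from the text above;
the other field names, the JSON encoding and both struct definitions are
illustrative assumptions rather than the actual envelope.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// legacyFrame models the envelope before the threading fields existed.
type legacyFrame struct {
	Type    string `json:"t"`
	Payload []byte `json:"p"`
}

// Frame adds the two optional fields; omitempty drops them when unset.
type Frame struct {
	Type     string `json:"t"`
	Payload  []byte `json:"p"`
	ThreadID string `json:"thr,omitempty"`
	ReplyTo  string `json:"re,omitempty"`
}

// wire returns the serialized bytes that would be signed and sent.
func wire(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// hasKey reports whether the encoded object contains the given JSON key.
func hasKey(encoded, key string) bool {
	return strings.Contains(encoded, `"`+key+`":`)
}

func main() {
	plain := wire(Frame{Type: "MSG", Payload: []byte("hi")})
	old := wire(legacyFrame{Type: "MSG", Payload: []byte("hi")})
	fmt.Println("non-threaded identical to legacy:", plain == old)

	react := wire(Frame{Type: "REACT", Payload: []byte("+1"), ThreadID: "t1", ReplyTo: "m1"})
	fmt.Println("threaded frame:", react)
}
```

Since a signature covers the serialized bytes, identical bytes mean
signatures over old-style frames remain valid.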
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
membershipd now ships as a systemd user service (unit unibus-membershipd.service,
restart_policy always, runtime systemd-user). is_local_only flips to false since
--bind 0.0.0.0 makes both planes LAN-reachable. fn doctor services-spec: OK, no
drift.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add deploy/unibus-membershipd.service (Restart=always, binds both planes to
0.0.0.0 for LAN reachability), an idempotent deploy/install.sh that builds the
binary, symlinks the unit, and enables+starts it, plus deploy/README.md with
operate/health instructions.
Restart=always is deliberate: a clean SIGTERM exits 0 and Restart=on-failure
would not restart it, leaving the service silently dead (the sqlite_api gotcha).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a --bind flag (default 127.0.0.1) to membershipd that controls which
network interface both the control-plane HTTP API and the embedded NATS data
plane listen on. Use 0.0.0.0 to expose the stack to the LAN so remote peers
(phones, other PCs) can connect; keep the default for a loopback-only dev stack.
embeddednats gains StartHost(storeDir, host, port) for explicit interface
control; Start stays a backward-compatible wrapper (host "" = nats default
0.0.0.0) so the playground and tests are untouched.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds GET /api/bench (SSE) and a simulator section in index.html: a publisher
floods a room with thousands of messages to N subscribers and a live chart
animates the throughput. The two room policies are exposed as independent flags
(persist=JetStream, encrypt=E2E AEAD+Ed25519) plus payload size, measuring the
cost of each layer with the real client library. The benchmark uses its own
ephemeral peers, without touching the named peers of the manual sandbox.
Verified: all 4 enc x persist combinations with exact fan-out. Bump app v0.2.0.