Commit Graph

14 Commits

Author SHA1 Message Date
Egutierrez 1c9325104c feat(embeddednats): UNIBUS_NATS_MONITOR flag decoupled from debug log
Add a dedicated UNIBUS_NATS_MONITOR=1 toggle that opens the embedded
nats-server monitoring HTTP endpoint (127.0.0.1:8222, loopback only) so a
local metrics scraper can read /varz, /connz and /jsz for server-level
metrics (msgs/s, connections, KV bucket msgs, RAFT leader per stream,
restarts).

Previously the monitoring endpoint was only reachable via UNIBUS_NATS_DEBUG=1,
which is coupled to the verbose nats-server debug log: enabling the endpoint
also wrote routes/RAFT/room subjects to journald in clear, which regresses the
hardened posture (issue 0007). The two concerns are now decoupled.

The toggle computation is extracted to a pure function
natsLogOpts(debugEnv, monitorEnv) (noLog, debug, trace, monitor): MONITOR=1
opens the endpoint while keeping the log quiet (NoLog true / Debug false). The
inverse coupling is preserved for backward compatibility (DEBUG still implies
MONITOR). The 127.0.0.1 bind stays hardcoded — the monitoring endpoint has no
auth and must never be reachable from the network.

Deploy wiring versioned: additive systemd drop-in
membershipd-cluster.service.d/nats-monitor.conf (Environment=UNIBUS_NATS_MONITOR=1)
plus a "NATS server metrics" section in the cluster README with the rolling
activation runbook (magnus -> homer -> datardos) gated on R3 reconvergence
(followers 2/2) between nodes.

Tests: pure decoupling table (monitor on => log NOT debug; debug => monitor;
default closed) + a real embedded server with MONITOR=1 asserting /varz answers
200 on loopback:8222, and a server without the flag with the endpoint closed.
100% additive: behavior is identical without the flag. Bump app.md 0.10.0 ->
0.11.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 20:57:46 +02:00
egutierrez b379730225 docs(app): document users HTTP admin model, bump 0.10.0
Add a gotcha describing the unified-storage model (the server writes
users to the same store/KV as rooms), the admin-only HTTP surface, and
the CLI-seeds-admin-#0 bootstrap. Bump the version 0.9.0 -> 0.10.0 and
add the capability growth log entry for the new HTTP admin users API.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 20:32:05 +02:00
egutierrez e1a7402ff1 chore: bump unibus to 0.9.0 (live user-add + clientcheck)
New capability membershipd user add --store kv against a live cluster plus
cmd/clientcheck end-to-end verification (issue 0011 gaps, report 0012). Adds
the capability growth log entry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 19:41:56 +02:00
egutierrez 926b8e96af chore(0006): bump unibus to 0.8.0, close issue 0006 (cluster hardening + wiring)
All seven phases (0006a–0006g) merged: blockers N3 (replicated nonce) and N2
($JS.API.> KV leak) closed, decentralized KV store wired (--store kv), homogeneous
cluster posture enforced (N1), RefreshSession in all clients (N4), the lows
(secret out of argv, migrate guard, R1/CA docs), and the 3-node deploy material.

Full suite + every audit-0008 attack regression green; govulncheck 0 reachable.
See report 0009.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 17:33:03 +02:00
egutierrez 6976537842 chore(0005): bump unibus to 0.7.0, close issue 0005 (hardening 2)
Hardening 2 (issue 0005, fases 0005a-0005e) cierra los hallazgos nuevos de la
re-auditoría red-team (report 0006): bump de nats-server + toolchain (16 CVEs ->
0 alcanzables), drop de frames sin firma en rooms SignMsgs, limiter global de
bytes en vuelo contra el DoS por concurrencia, TLS obligatorio en bind publico, y
cableado de la ACL por subject que cierra el wildcard metadata leak. Detalle por
fase en el capability growth log del app.md y en el report 0007.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 16:17:41 +02:00
agent d821bc1794 chore(0003): bump unibus to 0.6.0 — decentralization / HA (0003a-0003e)
Cluster NATS routes (auth + mutual TLS), Store/blobstore interfaces with
replicated JetStream KV and Object Store backends, idempotent
migrate-to-kv with backup, client failover over seed/control-plane lists,
replicated nonce store (closes the multi-node replay hole), and the
per-subject membership ACL (audit H4 residual). All behind the
`decentralized` flag (off); single-node SQLite+disk behavior unchanged.
The multi-node deploy (0003f) is the human's; runbook in report 0006.
2026-06-07 15:31:14 +02:00
egutierrez 618f6b61da chore(0004): close issue, bump unibus to 0.5.0, record dataplane-acl decision
Issue 0004 (security hardening) done across 0004a-0004f. app.md version 0.5.0
with the capability growth log entry; dev/0004d-dataplane-acl.md documents the
chosen minimum-defense strategy for the NATS data plane and its residual limit
(per-subject ACL deferred to 0003). Full work report in
projects/message_bus/reports/0005-2026-06-07-unibus-security-hardening.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 14:40:39 +02:00
egutierrez 7fab473bc3 docs(app): bump to 0.4.0 — room discovery endpoint growth log
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 03:08:07 +02:00
egutierrez 69079d17d5 docs(app): bump to 0.3.0 — service hardening + frame threading growth log
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 18:30:55 +02:00
egutierrez b2e6712dd2 docs(app): fill service: block — systemd-user, Restart=always, LAN-reachable
membershipd now ships as a systemd user service (unit unibus-membershipd.service,
restart_policy always, runtime systemd-user). is_local_only flips to false since
--bind 0.0.0.0 makes both planes LAN-reachable. fn doctor services-spec: OK, no
drift.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 18:06:33 +02:00
Egutierrez 6b162deeb0 feat(playground): benchmark de rendimiento con flags JetStream/E2E/payload
Añade GET /api/bench (SSE) y una seccion de simulador en index.html: un publisher
inunda una room con miles de mensajes a N subscribers y una grafica en vivo anima
el throughput. Las dos politicas de room se exponen como flags independientes
(persist=JetStream, encrypt=E2E AEAD+Ed25519) mas tamano de payload, midiendo el
coste de cada capa con la libreria cliente real. El benchmark usa peers efimeros
propios, sin tocar los peers nombrados del sandbox manual.

Verificado: las 4 combinaciones enc x persist con fan-out exacto. Bump app v0.2.0.
2026-06-03 22:33:26 +02:00
agent 8c680bc002 feat: optional per-room JetStream persistence (history + offline replay), gated by RoomPolicy.Persist 2026-06-03 21:48:55 +02:00
agent 888ff75236 fix: default ports 8470 (HTTP) / 4250 (NATS) to avoid clash with registry_api on 8420 2026-06-03 19:54:36 +02:00
agent cd02a52191 feat: initial scaffold of unibus message bus (membership service + client lib + demo peers) 2026-06-03 19:47:32 +02:00