feat(embeddednats): UNIBUS_NATS_MONITOR flag decoupled from debug log

Add a dedicated UNIBUS_NATS_MONITOR=1 toggle that opens the embedded nats-server monitoring HTTP endpoint (127.0.0.1:8222, loopback only) so a local metrics scraper can read /varz, /connz and /jsz for server-level metrics (msgs/s, connections, KV bucket msgs, RAFT leader per stream, restarts).

Previously the monitoring endpoint was only reachable via UNIBUS_NATS_DEBUG=1, which is coupled to the verbose nats-server debug log: enabling the endpoint also wrote routes/RAFT/room subjects to journald in clear, which regresses the hardened posture (issue 0007). The two concerns are now decoupled.

The toggle computation is extracted to a pure function natsLogOpts(debugEnv, monitorEnv) (noLog, debug, trace, monitor): MONITOR=1 opens the endpoint while keeping the log quiet (NoLog true / Debug false). The inverse coupling is preserved for backward compatibility (DEBUG still implies MONITOR). The 127.0.0.1 bind stays hardcoded: the monitoring endpoint has no auth and must never be reachable from the network.

Deploy wiring versioned: additive systemd drop-in membershipd-cluster.service.d/nats-monitor.conf (Environment=UNIBUS_NATS_MONITOR=1) plus a "NATS server metrics" section in the cluster README with the rolling activation runbook (magnus -> homer -> datardos) gated on R3 reconvergence (followers 2/2) between nodes.

Tests: pure decoupling table (monitor on => log NOT debug; debug => monitor; default closed) + a real embedded server with MONITOR=1 asserting /varz answers 200 on loopback:8222, and a server without the flag with the endpoint closed.

100% additive: behavior is identical without the flag. Bump app.md 0.10.0 -> 0.11.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
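The decoupling rules in the message above can be sketched as a pure function. This is a minimal illustration, not the committed code: only the signature and the three stated rules (MONITOR opens the endpoint with a quiet log, DEBUG still implies MONITOR, default closed) come from the commit. Treating exactly "1" as "on" and making trace follow debug are assumptions.

```go
package main

import (
	"fmt"
	"os"
)

// natsLogOpts sketches the toggle computation described in the commit message.
// Assumptions (not confirmed by the source): only the exact value "1" enables
// a flag, and trace is switched on together with debug.
func natsLogOpts(debugEnv, monitorEnv string) (noLog, debug, trace, monitor bool) {
	debug = debugEnv == "1"
	trace = debug // assumption: the verbose log includes trace output
	noLog = !debug
	// DEBUG keeps implying MONITOR for backward compatibility;
	// MONITOR alone opens the endpoint without touching the log.
	monitor = debug || monitorEnv == "1"
	return noLog, debug, trace, monitor
}

func main() {
	noLog, debug, trace, monitor := natsLogOpts(
		os.Getenv("UNIBUS_NATS_DEBUG"),
		os.Getenv("UNIBUS_NATS_MONITOR"),
	)
	fmt.Printf("noLog=%v debug=%v trace=%v monitor=%v\n", noLog, debug, trace, monitor)
}
```

Keeping the function free of environment reads is what makes the table test mentioned in the commit possible: each row is just a pair of input strings and four expected booleans.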
@@ -2,7 +2,7 @@
 name: unibus
 lang: go
 domain: infra
-version: 0.10.0
+version: 0.11.0
 description: "Unified messaging bus over NATS+JetStream with per-room E2E encryption (reduced megolm/olm): membership/key service, client library, and demo peers."
 tags: [service, messaging, nats, e2e]
 uses_functions:
@@ -169,6 +169,26 @@ agent.<name>.{in,out} LLM agent inbox/outbox (agent.scout.in)
 
 ## Capability growth log
 
+- v0.11.0 (2026-06-07): dedicated `UNIBUS_NATS_MONITOR` flag that opens the HTTP
+monitoring endpoint of the embedded nats-server (`127.0.0.1:8222`, loopback only),
+DECOUPLED from the debug log. Previously monitoring was only opened by
+`UNIBUS_NATS_DEBUG=1`, which also turned on the verbose nats-server log
+(routes/RAFT/subjects to journald in clear), incompatible with the hardening
+from issue 0007. The toggle computation is extracted to a pure function
+`natsLogOpts(debugEnv, monitorEnv) (noLog, debug, trace, monitor)`: `MONITOR=1`
+opens the endpoint while keeping the log quiet (`NoLog` true / `Debug` false), and
+the inverse coupling is kept for compatibility (`DEBUG` still implies
+`MONITOR`). The loopback bind `127.0.0.1` stays hardcoded: monitoring is NEVER
+public and has no auth; a local scraper reads it and pushes to VictoriaMetrics
+(dashboard `unibus-nats` in `fleet_monitoring`). The deploy wiring is
+versioned: additive systemd drop-in `membershipd-cluster.service.d/nats-monitor.conf`
+(`Environment=UNIBUS_NATS_MONITOR=1`) + a "NATS server metrics" section in the
+cluster README with the rolling activation runbook (magnus→homer→datardos)
+and an R3 reconvergence gate (`followers 2/2`) between nodes. New tests: pure
+decoupling table (monitor on ⇒ log NOT debug; debug ⇒ monitor; default
+closed) + a real server with `MONITOR=1` that confirms `/varz` 200 on loopback:8222
+and a server without the flag with the endpoint closed. 100% additive changes: without
+the flag the behavior is identical; build/test green.
 - v0.10.0 (2026-06-07): admin-only HTTP API for user management, closing the
 last asymmetry in the control plane: rooms had a signed HTTP surface
 (`POST /rooms`, etc.) but users were only managed via local CLI or access
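For orientation, a local reader of the loopback endpoint might look roughly like the sketch below. It is not the scraper referenced in the changelog: the `varzSnapshot` type and both function names are hypothetical, the push to VictoriaMetrics is omitted, and only a handful of the fields nats-server reports on `/varz` are decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// varzSnapshot is a hypothetical type holding a small subset of /varz fields.
type varzSnapshot struct {
	ServerID    string `json:"server_id"`
	Connections int    `json:"connections"`
	InMsgs      int64  `json:"in_msgs"`
	OutMsgs     int64  `json:"out_msgs"`
}

// parseVarz decodes a /varz JSON payload, ignoring fields it does not know.
func parseVarz(body []byte) (varzSnapshot, error) {
	var v varzSnapshot
	err := json.Unmarshal(body, &v)
	return v, err
}

// fetchVarz reads /varz from the given base URL with a short timeout.
func fetchVarz(baseURL string) (varzSnapshot, error) {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(baseURL + "/varz")
	if err != nil {
		return varzSnapshot{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return varzSnapshot{}, fmt.Errorf("unexpected status %d from /varz", resp.StatusCode)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return varzSnapshot{}, err
	}
	return parseVarz(body)
}

func main() {
	// The endpoint is loopback-only, so this must run on the node itself.
	v, err := fetchVarz("http://127.0.0.1:8222")
	if err != nil {
		// Expected when neither UNIBUS_NATS_MONITOR nor UNIBUS_NATS_DEBUG is set.
		fmt.Println("monitoring endpoint not reachable:", err)
		return
	}
	fmt.Printf("server=%s connections=%d in_msgs=%d out_msgs=%d\n",
		v.ServerID, v.Connections, v.InMsgs, v.OutMsgs)
}
```

The message counters are cumulative, so a msgs/s figure would come from the difference between two samples divided by the interval between them.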