merge: origin/master into local
@@ -0,0 +1,152 @@
---
id: "0128"
title: "agents_and_robots: HTTP API + SSE + apikey + TLS subdomain"
status: pendiente
type: feature
domain:
  - agents
  - infra
  - deploy
scope: app
priority: alta
depends: []
blocks:
  - "0129"
related: []
created: 2026-05-22
updated: 2026-05-22
tags: [agents_and_robots, http, sse, apikey, traefik, systemd]
dod_evidence_schema:
  - id: build_ok
    kind: cmd
    expected: "cd projects/element_agents/apps/agents_and_robots && go build -tags goolm ./cmd/launcher → exit 0"
    required: true
  - id: api_list_authorized
    kind: cmd
    expected: "curl -fsS -H \"Authorization: Bearer $AGENTS_API_KEY\" https://agents.organic-machine.com/agents returns JSON with N>=7 agents"
    required: true
  - id: api_list_unauthorized_401
    kind: cmd
    expected: "curl -s -o /dev/null -w '%{http_code}' https://agents.organic-machine.com/agents == 401"
    required: true
  - id: api_start_stop_roundtrip
    kind: cmd
    expected: "POST /agents/test-bot/stop → POST /agents/test-bot/start: status running confirmed via GET /agents/test-bot after 2s"
    required: true
  - id: sse_logs_streaming
    kind: cmd
    expected: "curl -N -H \"Authorization: Bearer $KEY\" https://agents.organic-machine.com/sse/agents/assistant-bot/logs delivers >=1 line within 5s with an active agent"
    required: true
  - id: sse_status_broadcast
    kind: cmd
    expected: "curl -N /sse/status receives an event {agent_id, old_status, new_status} after a manual stop/start"
    required: true
  - id: systemd_active
    kind: cmd
    expected: "ssh organic-machine.com 'systemctl is-active agents_and_robots.service' == active"
    required: true
  - id: traefik_route
    kind: url
    expected: "agents.organic-machine.com resolves and returns a valid LE cert (curl -vI shows subject CN=agents.organic-machine.com)"
    required: true
  - id: app_md_drift_fixed
    kind: cmd
    expected: "fn doctor services-spec apps/element_agents/apps/agents_and_robots reports OK (no runtime/systemd drift)"
    required: true
---

# 0128: agents_and_robots HTTP API + SSE + apikey + TLS

## Context

Today `agents_and_robots` only exposes control through the local `agentctl` CLI (filesystem-based, `shell/process.Manager`). There is no way to manage agents remotely.

We need a secure HTTP backend so that a local C++ frontend (issue 0129) can list agents, start/stop/restart them, and stream logs and status live.

## Decision

**Embed the HTTP daemon INSIDE `cmd/launcher`** as a goroutine. It shares `process.Manager`, access to `shell/memory/*.db`, and the Matrix clients. A single process, with no drift between daemon and supervisor.

**Auth:** `Authorization: Bearer <AGENTS_API_KEY>`, compared with `subtle.ConstantTimeCompare`. The key is 32 bytes, hex-encoded, stored in `.env` (`AGENTS_API_KEY`). Requests with no header or an invalid key get 401.

**TLS:** Traefik on the organic-machine.com VPS with an automatic LE cert. Subdomain `agents.organic-machine.com` (new DNS A record → VPS IP). Traefik route `agents.organic-machine.com → 127.0.0.1:8487`.

**SSE in-memory pubsub.** NATS stays OFF for now (one local client, so a broker is pure overhead). Document a TODO in app.md to add a bus if a second consumer shows up.

## Scope v0.1 (lean)

| Verb | Path | Wraps |
|---|---|---|
| GET | `/health` | 200 OK without auth (liveness) |
| GET | `/agents` | `Scan` + `StatusAll` + `msg_count_24h` (queries `shell/memory/*.db`) |
| GET | `/agents/{id}` | detail + config + `LogTail(200)` |
| POST | `/agents/{id}/start` | `Manager.Start` |
| POST | `/agents/{id}/stop` | `Manager.Stop` |
| POST | `/agents/{id}/restart` | Stop + Start, waiting for health |
| GET | `/agents/{id}/logs?n=200` | `LogTail` snapshot |

**SSE:**
- `GET /sse/status`: broadcasts status changes (poll every 2s + diff)
- `GET /sse/agents/{id}/logs`: `tail -f` of the logfile, emits line events

**Out of scope for v0.1** (deferred to v0.2):
- POST `/agents/{id}/message` (send Matrix message)
- PUT `/agents/{id}/config` (config edit)
- SSE messages stream

## Tasks

1. **New package `internal/api`** with an HTTP server (stdlib `net/http`, no gin/echo).
   - `api.New(mgr *process.Manager, apiKey string, port int) *Server`
   - `Server.Run(ctx) error` starts the server and blocks until ctx is done.
   - Middleware: log + auth + recover.
2. **REST handlers** on top of `process.Manager`. Unit tests with a mock manager.
3. **In-memory SSE pubsub** (`internal/api/pubsub.go`):
   - `Bus` with `Subscribe(topic) <-chan event` + `Publish(topic, event)`.
   - Poller goroutine that calls `StatusAll` every 2s and publishes diffs.
   - One tail goroutine per logfile (`file_tail_follow`: look it up in the registry or create it).
4. **Integrate into the launcher**: `cmd/launcher/main.go` starts `api.Server` in a goroutine if `--api-port > 0`.
5. **Create systemd unit** `/etc/systemd/system/agents_and_robots.service` with `Restart=always`, `EnvironmentFile=.env`, `ExecStart=.../bin/launcher --log-level info --api-port 8487`.
6. **Traefik route + DNS:**
   - Add `agents.organic-machine.com` in DNS (A record).
   - Add the Traefik config (label in the stack's docker-compose, or file provider) pointing to `127.0.0.1:8487`.
7. **Fix app.md drift**: `runtime: systemd-system` is now true. Verify with `fn doctor services-spec`.
8. **Tests:**
   - Go: package `internal/api` with httptest.
   - e2e: `e2e_checks` in `app.md` with a curl smoke test.
9. **Deploy:**
   - `rsync_deploy_bash_infra`, or a new `deploy_server` target.
   - Generate `AGENTS_API_KEY` with `openssl rand -hex 32` and write the remote `.env`.
   - `systemctl enable --now agents_and_robots.service`.
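
A minimal sketch of the in-memory `Bus` from task 3, for orientation only. It deviates from the signature in the task list on purpose: `Subscribe` here also takes a buffer size and returns a cancel function, because an SSE handler needs a way to unsubscribe when the client disconnects. The `Event` type and the drop-on-full policy are assumptions, not decisions recorded in this issue.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical payload type; real events would carry JSON.
type Event struct {
	Topic string
	Data  string
}

// Bus is a tiny topic-based fan-out with no persistence and no replay.
type Bus struct {
	mu   sync.Mutex
	subs map[string]map[chan Event]struct{}
}

func NewBus() *Bus {
	return &Bus{subs: make(map[string]map[chan Event]struct{})}
}

// Subscribe registers a buffered channel for topic. The returned cancel func
// unsubscribes and closes the channel; it is safe to call more than once.
func (b *Bus) Subscribe(topic string, buf int) (<-chan Event, func()) {
	ch := make(chan Event, buf)
	b.mu.Lock()
	if b.subs[topic] == nil {
		b.subs[topic] = make(map[chan Event]struct{})
	}
	b.subs[topic][ch] = struct{}{}
	b.mu.Unlock()

	var once sync.Once
	cancel := func() {
		once.Do(func() {
			b.mu.Lock()
			delete(b.subs[topic], ch)
			b.mu.Unlock()
			close(ch) // safe: Publish only sends to channels still in the map
		})
	}
	return ch, cancel
}

// Publish never blocks: a subscriber with a full buffer misses the event.
// It returns how many deliveries were dropped.
func (b *Bus) Publish(topic, data string) (dropped int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for ch := range b.subs[topic] {
		select {
		case ch <- Event{Topic: topic, Data: data}:
		default:
			dropped++
		}
	}
	return dropped
}

func main() {
	bus := NewBus()
	events, cancel := bus.Subscribe("status", 8)
	defer cancel()
	bus.Publish("status", `{"agent_id":"test-bot","old_status":"running","new_status":"stopped"}`)
	fmt.Println((<-events).Data)
}
```

Dropping on a full buffer keeps one slow SSE client from stalling the poller; a lossless variant would need per-subscriber queues or backpressure.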

## Registry functions to use / propose

Search before coding:

- `mcp__registry__fn_search query="tail follow file" lang="go"`: does `file_tail_follow_go_infra` exist? If not, delegate to fn-constructor.
- `mcp__registry__fn_search query="http auth bearer" lang="go"`: auth middleware.
- `mcp__registry__fn_search query="sse server" lang="go"`: SSE helper.
- `systemd_generate_unit_go_infra` + `systemd_install_go_infra`: generate/install the unit.

## Acceptance

- [ ] `curl -fsS -H "Authorization: Bearer $KEY" https://agents.organic-machine.com/agents` returns the correct list.
- [ ] No header → 401. Invalid key → 401. Valid key → 200.
- [ ] Start/Stop/Restart change the real process state (verifiable with `ps`).
- [ ] SSE logs deliver lines less than 1s after they appear in the file.
- [ ] SSE status broadcasts after a manual stop/start.
- [ ] systemd unit is active and restarts after `kill -9`.
- [ ] `fn doctor services-spec` reports OK.
- [ ] Go tests pass.

## Human DoD

- **Where:** local terminal → `curl https://agents.organic-machine.com/agents`. SSE verifiable with `curl -N`.
- **Latency:** SSE log lag < 1s. REST list < 200ms.
- **Onboarding:** agents_and_robots README updated with an "HTTP API" section + curl examples.

## Risks

- DNS propagation can take a while (configure a low TTL).
- Traefik on this VPS: check whether it is managed by Coolify or standalone, and add the route in the right place.
- The current `LogTail` only reads a snapshot; SSE needs a real `tail -f`. If none exists in the registry, build it in a preliminary round.
@@ -0,0 +1,180 @@
---
id: "0129"
title: "agents_dashboard: C++ ImGui frontend to manage Matrix agents"
status: pendiente
type: feature
domain:
  - agents
  - tui
scope: app
priority: alta
depends:
  - "0128"
blocks: []
related: []
created: 2026-05-22
updated: 2026-05-22
tags: [cpp, imgui, agents, dashboard, sse, http-client]
dod_evidence_schema:
  - id: scaffold_ok
    kind: cmd
    expected: "ls projects/element_agents/apps/agents_dashboard/{app.md,main.cpp,CMakeLists.txt,.git} all exist"
    required: true
  - id: build_windows
    kind: cmd
    expected: "cmake --build cpp/build/windows --target agents_dashboard -j → exit 0"
    required: true
  - id: appicon_embedded
    kind: cmd
    expected: "x86_64-w64-mingw32-objdump -h cpp/build/windows/apps/agents_dashboard/agents_dashboard.exe | grep .rsrc"
    required: true
  - id: hub_card_visible
    kind: screenshot
    expected: "App Hub shows the agents_dashboard card with a violet robot icon + correct description"
    required: true
  - id: connection_flow
    kind: screenshot
    expected: "Connection panel with base_url + apikey input, green LED after a successful handshake with the backend"
    required: true
  - id: agents_table_populated
    kind: screenshot
    expected: "Agents table shows >=7 rows with id/status/uptime/msg_24h + action buttons"
    required: true
  - id: start_stop_works
    kind: screenshot
    expected: "Clicking stop on test-bot shuts it down (status changes to stopped in under 2s); clicking start brings it back up"
    required: true
  - id: logs_sse_streaming
    kind: screenshot
    expected: "Logs panel streams live lines from assistant-bot (new lines appear without pressing refresh)"
    required: true
  - id: apikey_encrypted_local
    kind: cmd
    expected: "! strings cpp/build/windows/apps/agents_dashboard/local_files/agents_dashboard.db | grep -qF '<plaintext apikey>' (the apikey does not appear in cleartext)"
    required: true
  - id: e2e_self_test
    kind: cmd
    expected: "agents_dashboard.exe --self-test exit 0 (checks subsystems: GL loader, http client, SSE client, local DB)"
    required: true
---

# 0129: agents_dashboard C++ ImGui frontend

## Context

Once 0128 closes, the `agents_and_robots` backend will expose an HTTPS API + SSE at `agents.organic-machine.com`, protected by an apikey. We need a local C++ ImGui frontend that consumes that API and lets us manage agents without SSH or a terminal.

## Decision

C++ ImGui app in `projects/element_agents/apps/agents_dashboard/`. Gitea sub-repo `dataforge/agents_dashboard`. Integrated into the App Hub with its own icon.

Scope v0.1 = what 0128 exposes: list + start/stop/restart + SSE logs. v0.2 adds send-message + config-edit once the backend exposes them.

## Tasks

### 1. Scaffold (RULE: canonical scaffolder, NEVER by hand)

```bash
./fn run init_cpp_app agents_dashboard \
  --project element_agents \
  --desc "C++ ImGui frontend to manage the Matrix agents of agents_and_robots over HTTPS+apikey, with SSE for live logs/status"
```

After scaffolding:
- `git init` inside `projects/element_agents/apps/agents_dashboard/` (rule `apps_subrepo.md`).
- `app.md` trio: `description` + `icon.phosphor: "robot"` + `icon.accent: "#8b5cf6"`.
- `./fn run regenerate_app_icons agents_dashboard`.
- `./fn run refresh_app_hub` so it shows up in the hub.

### 2. Registry functions: search first

| Need | Search in registry | If missing |
|---|---|---|
| C++ HTTP client (sync GET/POST + Bearer + JSON body) | `mcp__registry__fn_search query="http client" lang="cpp"` | Delegate to `fn-constructor`: `http_client_cpp_infra` with libcurl |
| C++ SSE client | `sse_client_cpp_core` (FRESH 7d) | ✓ direct reuse |
| C++ JSON parse/serialize | search for an nlohmann wrapper | If missing, vendor `cpp/vendor/json.hpp` (single-header) |
| Data table | `data_table_cpp_viz` | ✓ reuse |
| Local secret store (DPAPI Windows) | search | If missing: `secret_store_cpp_infra` (DPAPI wrap, base64 fallback on Linux) |
| C++ ring buffer | search | If missing: `ring_buffer_cpp_core` |

Parallel delegation: **a single Agent call with N parallel tool_use blocks** for the missing ones (rule `delegation.md`).

### 3. UI panels

- **Connection**: `base_url` input + apikey input (masked) + "Test" button → GET /health + GET /agents. SSE state LED (gray/yellow/green/red). Credentials are saved in `local_files/agents_dashboard.db`, encrypted via secret_store.
- **Agents**: `data_table_cpp_viz` with columns:
  - id (text)
  - status (colored icon: running=green, stopped=gray, crashed=red)
  - uptime (humanized)
  - msg_24h (number)
  - actions (`▶ ⏹ ↻` buttons per row)
  - Substring filter + sort by column.
- **Logs**: agent selector (combo) + tail viewport (5000-line ring buffer) + autoscroll toggle + "Pause" button. Streams via `/sse/agents/{id}/logs`.
- **Status feed**: collapsible panel with events from `/sse/status` (recent timeline).

### 4. Local persistence

- `<exe_dir>/local_files/agents_dashboard.db` (SQLite via registry functions or sqlite3 directly).
- Schema migrations in `migrations/001_init.sql`:
  ```sql
  CREATE TABLE connections (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    base_url TEXT NOT NULL,
    apikey_encrypted BLOB NOT NULL,
    last_used DATETIME DEFAULT CURRENT_TIMESTAMP
  );
  CREATE TABLE app_state (
    key TEXT PRIMARY KEY,
    value TEXT
  );
  ```
- `app_settings.ini` via `fn_ui::settings_*` (theme, layout).
- apikey encrypted with Windows DPAPI (the key never leaves the machine).

### 5. Build + local deploy

- CMake target `agents_dashboard` in `cpp/CMakeLists.txt` (automatic via the scaffolder).
- Windows build: `cmake --build cpp/build/windows --target agents_dashboard -j`.
- Local deploy: `./fn run redeploy_cpp_app_windows agents_dashboard projects/element_agents/apps/agents_dashboard --build`.
- Icon via windres (handled by `add_imgui_app`).

### 6. Tests + e2e_checks

```yaml
e2e_checks:
  - id: build
    cmd: "cmake --build cpp/build/windows --target agents_dashboard -j"
    timeout_s: 180
  - id: self_test
    cmd: "./cpp/build/windows/apps/agents_dashboard/agents_dashboard.exe --self-test"
    timeout_s: 30
  - id: pytest_mock
    cmd: "cd projects/element_agents/apps/agents_dashboard/tests && python3 -m pytest -x -q"
    timeout_s: 60
```

A pytest mock server emulates 0128 (list/start/stop + SSE) and verifies that the C++ app connects, populates the table, and that start/stop works headless in `--capture` mode.

## Acceptance

- [ ] App starts and shows the Connection panel.
- [ ] After entering a valid apikey → Agents table is populated with real data from the VPS.
- [ ] Stop/Start from the UI changes the real state of the agent on the VPS.
- [ ] Logs stream new lines without polling.
- [ ] Closing and reopening the app → credentials persist (encrypted).
- [ ] No network / invalid apikey → visible error, the app does not crash.
- [ ] `--self-test` exits 0.
- [ ] Visible in the App Hub with the correct icon + description.

## Human DoD

- **Where:** Windows Desktop → App Hub → click "agents_dashboard".
- **Latency:** SSE logs < 1s lag. Agent list < 200ms after the handshake.
- **Onboarding:** first-run wizard asks for base_url + apikey; a tooltip explains where to get the key (the VPS secrets manager).

## Risks

- libcurl on Windows mingw-w64: needs cross-compile setup. If `http_client_cpp_infra` does not exist, spend time on the wrapper before the UI.
- DPAPI is Windows-only: the Linux fallback can be plaintext with 0600 permissions + a visible warning in the UI.
- SSE reconnect logic: exponential backoff + a clear state indicator.
@@ -0,0 +1,235 @@
---
id: "0131"
title: "agents v0.2: per-agent control in unified mode + uptime/msg_24h + data_table_cpp_viz + clear/cache actions"
status: pendiente
type: feature
domain:
  - agents
  - tui
  - infra
scope: app
priority: alta
depends:
  - "0128"
  - "0129"
blocks: []
related: []
created: 2026-05-22
updated: 2026-05-22
tags: [agents_and_robots, agents_dashboard, http, unified-mode, data-table, control]
dod_evidence_schema:
  # Backend: agents_and_robots
  - id: build_backend
    kind: cmd
    expected: "cd projects/element_agents/apps/agents_and_robots && go build -tags goolm ./... → exit 0"
    required: true
  - id: tests_backend
    kind: cmd
    expected: "cd projects/element_agents/apps/agents_and_robots && go test -tags goolm -count=1 ./internal/api/... → exit 0"
    required: true
  - id: stop_unified_works
    kind: cmd
    expected: "POST /agents/test-bot/stop returns {status:stopped}; GET /agents/test-bot → running=false in <2s"
    required: true
  - id: start_unified_works
    kind: cmd
    expected: "POST /agents/test-bot/start after stop returns {status:started}; GET /agents/test-bot → running=true in <5s"
    required: true
  - id: restart_unified_works
    kind: cmd
    expected: "POST /agents/test-bot/restart on a running agent leaves running=true in <8s without error"
    required: true
  - id: clear_memory_endpoint
    kind: cmd
    expected: "POST /agents/test-bot/clear_memory returns {status:cleared, messages_deleted:N}; SELECT COUNT(*) FROM messages WHERE agent_id='test-bot' == 0"
    required: true
  - id: delete_cache_endpoint
    kind: cmd
    expected: "POST /agents/test-bot/delete_cache returns {status:cleared, paths_deleted:[...]}; verify that the crypto.db cache was deleted"
    required: true
  - id: uptime_exposed
    kind: cmd
    expected: "GET /agents includes field uptime_seconds:int >0 for running agents"
    required: true
  - id: msg_24h_exposed
    kind: cmd
    expected: "GET /agents includes field messages_24h:int (may be 0) computed from the messages table"
    required: true
  # Frontend: agents_dashboard
  - id: build_frontend
    kind: cmd
    expected: "cmake --build cpp/build/windows --target agents_dashboard -j → exit 0"
    required: true
  - id: data_table_cpp_viz_used
    kind: cmd
    expected: "grep -E 'BeginTable|EndTable' projects/element_agents/apps/agents_dashboard/main.cpp returns 0 lines (migrated to data_table_cpp_viz); grep data_table_cpp_viz app.md uses_functions = 1"
    required: true
  - id: per_agent_buttons_rendered
    kind: screenshot
    expected: "Agents table shows >=5 buttons per row: Start, Stop, Restart, Clear Memory, Delete Cache (icons + tooltip are acceptable)"
    required: true
  - id: uptime_visible
    kind: screenshot
    expected: "Agents table uptime column shows a humanized value (e.g. 12h, 3d) for running agents"
    required: true
  - id: msg_24h_visible
    kind: screenshot
    expected: "Agents table msg/24h column shows a real counter (not 'instances' as a hack)"
    required: true
  # E2E: pytest
  - id: e2e_tests_pass
    kind: cmd
    expected: "AGENTS_API_KEY=... pytest tests/test_connect_e2e.py → all PASS (>=20 tests)"
    required: true
  - id: e2e_control_roundtrip
    kind: cmd
    expected: "New test_control_roundtrip: stop → poll running=false → start → poll running=true → restart → poll running=true. All within 30s."
    required: true
  - id: e2e_clear_memory
    kind: cmd
    expected: "New test_clear_memory: insert rows into messages → POST /clear_memory → COUNT == 0"
    required: true
---

# 0131: agents v0.2 (full per-agent control + data_table + new buttons)

## Context

v0.1 (issues 0128+0129) delivered:
- HTTP API + apikey + TLS + SSE
- C++ frontend with Connection/Agents/Logs/Status feed
- Agents table with `running` derived from the backend

**Gaps found during real use:**
1. **Per-agent control is broken in unified mode**: `Manager.Start/Stop` expect per-agent PID files; in unified mode these do not exist, so the endpoints return confusing errors ("not running" for an agent that IS running).
2. **No real uptime or msg_24h**: the backend does not expose those fields. The UI shows `instances` as a stand-in for msg_24h.
3. **Management actions are missing**: clear memory (messages in SQLite), delete cache (E2EE crypto), reset state.
4. **Hand-rolled table**: `ImGui::BeginTable` inline in main.cpp. The registry has `data_table_cpp_viz` (the canonical function). Migrate.

## Scope v0.2

### Backend (`projects/element_agents/apps/agents_and_robots/`)

**1. Per-agent control in unified mode**

Today the launcher starts all agents as goroutines under a single PID ("unified" mode). The current `Manager.Start/Stop/Restart` only work in multi-process mode (one PID per agent).

Add a per-agent cancel-context registry to the launcher:
- For each agent started as a goroutine, store its `context.CancelFunc` in `Manager.unifiedCancels map[string]context.CancelFunc`.
- `Manager.StopUnifiedAgent(id)` calls the cancel func of that specific agent.
- `Manager.StartUnifiedAgent(id)` restarts only that agent, without restarting the whole launcher.
- `Manager.RestartUnifiedAgent(id)` = Stop + Start.

The `handleStart/Stop/Restart` handlers autodetect the mode via `IsUnifiedRunning()` and delegate to the new unified variants.

**2. Real uptime**

- `Manager.startedAt map[string]time.Time`, populated when each goroutine starts.
- For `AgentStatus.UptimeSeconds`, compute `time.Since(startedAt[id]).Seconds()` if running, else 0.
- Expose in `agentResponse` as `uptime_seconds: int`.

**3. Messages_24h**

Each agent persists its messages in its own SQLite database (`agents/<id>/data/memory.db`). For each agent, the `handleListAgents` handler must:
- Open the agent's DB read-only
- Run `SELECT COUNT(*) FROM messages WHERE created_at > datetime('now', '-24 hours')`
- Cache the result for 30s to avoid opening the DB on every request

Expose as `messages_24h: int`.

**4. Endpoint `POST /agents/{id}/clear_memory`**

- Stop the agent (if running)
- Open the agent's memory.db
- `DELETE FROM messages` + `DELETE FROM facts`
- Optionally start it again if it was running (optional `?restart=true`)
- Return `{status:"cleared", messages_deleted:N, facts_deleted:M}`

**5. Endpoint `POST /agents/{id}/delete_cache`**

- Stop the agent (if running)
- Delete the `agents/<id>/data/crypto/` directory (E2EE cache; the agent re-initializes on next start)
- Delete `agents/<id>/data/cache/*` if it exists
- Return `{status:"cleared", paths_deleted:[...]}`
- Optionally start it again if it was running (`?restart=true`)

NOTE: delete_cache forces E2EE re-verification. The agent must re-authenticate via the SSSS recovery key on next start. Document this.

### Frontend (`projects/element_agents/apps/agents_dashboard/`)

**1. Migrate to `data_table_cpp_viz`**

Today main.cpp uses `ImGui::BeginTable` inline. Replace it with `data_table::Table` from the registry (function `data_table_cpp_viz`). Add it to `app.md::uses_functions`. Verify via `fn doctor cpp-apps` that the app goes from `CANDIDATE` to clean.

**2. Table columns:**
- id
- status icon (running=green, stopped=gray, disabled=yellow, crashed=red)
- uptime (humanized via `human_duration_secs`)
- msg/24h (real number, NOT instances)
- actions (5 grouped buttons):
  - `▶ Start` (disabled if running)
  - `⏹ Stop` (disabled if !running)
  - `↻ Restart`
  - `🧠 Clear Memory` (confirmation modal)
  - `🗑 Delete Cache` (confirmation modal)

**3. Sort + filter**: keep them, via the data_table_cpp_viz API.

### E2E (`tests/`)

Add 7 new tests:
- `test_control_roundtrip`: stop → poll → start → poll → restart → poll. Uses `test-bot`.
- `test_clear_memory`: POST clear_memory, verifies COUNT(*) FROM messages == 0.
- `test_delete_cache`: POST delete_cache, verifies crypto/ was deleted.
- `test_uptime_field_present`: /agents response includes the uptime_seconds key.
- `test_msg_24h_field_present`: /agents response includes the messages_24h key.
- `test_unified_stop_does_not_kill_launcher`: after stopping 1 agent, the others keep running.
- `test_clear_memory_requires_apikey`: no Bearer → 401.

## Tasks

### Phase A: Backend (agents_and_robots)

1. Add `unifiedCancels map[string]context.CancelFunc` + `startedAt map[string]time.Time` + a mutex to `shell/process.Manager`.
2. Hook into the `launcher` runtime to register/unregister cancels when each agent goroutine starts/stops.
3. Implement `StopUnifiedAgent`, `StartUnifiedAgent`, `RestartUnifiedAgent` (Stop+Start).
4. Refactor the `handleStartAgent/Stop/Restart` handlers to autodetect unified vs multi.
5. Add `uptime_seconds` and `messages_24h` to `AgentResponse`. Implement the 24h query with a 30s cache.
6. Implement the `handleClearMemory` and `handleDeleteCache` handlers.
7. Add the routes in `server.go`.
8. Go unit tests in `internal/api/*_test.go`.

### Phase B: Frontend (agents_dashboard)

1. Change `parse_agents` to read `uptime_seconds` and `messages_24h` from the backend.
2. Migrate the table to `data_table_cpp_viz`. Keep filter + sort.
3. Add 5 buttons per row (Start/Stop/Restart/Clear/Delete).
4. Confirmation modal for Clear/Delete.
5. Update app.md::uses_functions with `data_table_cpp_viz`.

### Phase C: E2E + verify

1. Add the 7 pytest tests.
2. Run all e2e tests from the registry venv. >=20 tests pass.
3. Rebuild the .exe + redeploy to Windows.
4. Visual confirmation: buttons, uptime, msg_24h.

## Acceptance

- [ ] All 17 DoD items green (cmd + screenshots).
- [ ] >=20 e2e tests passing.
- [ ] C++ app deployed to Windows Desktop, buttons visible + working roundtrip.
- [ ] Backend unit tests pass.
- [ ] No regression: existing 0128 + 0129 functionality intact (the v0.1 curl smoke test is still green).

## Human DoD

- **Where**: Windows Desktop → agents_dashboard.exe → Agents table.
- **Latency**: stop → running=false reflected in the UI within 2s (via SSE status diff). A msg/24h refresh every 30s is fine.
- **Onboarding**: the tooltip on the "Clear Memory" button explains that it deletes messages; "Delete Cache" explains that the agent will have to re-authenticate via SSSS on its next start.

## Risks

- The unified-mode Manager refactor touches the launcher lifecycle (roughly step 7 of the create_agent pipeline). Existing tests must keep passing.
- delete_cache removes the crypto store; the agent must be able to re-verify via the env var `SSSS_RECOVERY_KEY_<NORM>`. If that env var is missing, the agent is left in a degraded state. Validate before deleting.
- data_table_cpp_viz may have API limits that inline ImGui does not (custom sort, alignment). Verify before migrating.
@@ -0,0 +1,115 @@
---
id: "0166"
title: "Deploy TURN for LiveKit (coturn or integrated)"
status: done
type: infra
domain:
  - matrix
scope: app:element_matrix_chat
priority: alta
depends: []
blocks: []
related: ["0167", "0168"]
created: 2026-05-24
updated: 2026-05-24
tags: [matrix, livekit, webrtc, turn, nat]
---
# 0166: Deploy TURN for LiveKit (coturn or integrated)

**Status:** done
**Created:** 2026-05-24
**Type:** infra
**Priority:** alta
**Domain:** matrix
**Scope:** app:element_matrix_chat
**Depends:** none
**Blocks:** none

## Problem

LiveKit runs without TURN (`turn.enabled: false` in `configs/livekit/livekit.yaml`). Users behind symmetric NAT (mobile 4G/5G CGNAT, corporate networks with strict firewalls, hotel WiFi) CANNOT establish a call, because WebRTC ICE direct/reflexive candidates fail. Calls fail silently for roughly 10-20% of users.

## Goal

Calls work on any network. Element X mobile on 4G CGNAT completes the handshake.

## Plan

1. Decide: standalone coturn vs LiveKit's integrated TURN (recommended: integrated, fewer moving parts).
2. Add subdomain `turn.organic-machine.com` with a Let's Encrypt cert (Traefik).
3. Enable the `turn:` block in `livekit.yaml`:
   ```yaml
   turn:
     enabled: true
     domain: "turn.organic-machine.com"
     tls_port: 5349
     udp_port: 443
     external_tls: true
   ```
4. Open ports on the VPS firewall: TCP+UDP 443 (best practice, gets through corporate firewalls), TCP 5349.
5. Rotate the TURN shared secret.
6. Test: browser on a corporate network with the `force-tcp` flag → call established.

## Acceptance

- [ ] `nc -vz turn.organic-machine.com 443` OK over both UDP and TCP (the UDP check needs `nc -vzu`).
- [ ] Test call from Element Web behind symmetric NAT (mobile hotspot tethering) → audio/video flows.
- [ ] LiveKit logs show `TURN allocation` requests being served.
- [ ] `.well-known/matrix/client` still points to the correct JWT `livekit_service_url`.

## Definition of Done

- [ ] Repeatability: 5 consecutive calls from 5 different networks (including CGNAT) without failure.
- [ ] Observability: the LiveKit dashboard shows the TURN vs direct ratio.
- [ ] User-facing: a mobile 4G user starts a call → connects in < 3s.

## Notes

UDP 443 is a well-known trick: most corporate firewalls only allow 443 (HTTPS), so TURN over UDP 443 gets through without needing a TCP relay, which would add latency.

Alternative: standalone coturn if the integrated LiveKit TURN has management gaps: `docker run -d coturn/coturn` + config shared with LiveKit's shared secret.
|
||||
|
||||
## Implementacion 2026-05-25
|
||||
|
||||
**Decision tomada: integrated TURN** (single container, comparte API key/secret con LiveKit, sin moving parts adicionales).
|
||||
|
||||
**Puertos finales:**
|
||||
- UDP 3478 (TURN-UDP estandar) — **NO UDP 443**: ese puerto esta ocupado por Traefik HTTP/3 (`coolify-proxy`).
|
||||
- TCP 5349 (TURN-TLS estandar) — libre.
|
||||
- Cert TLS: wildcard `*.organic-machine.com` extraido de Traefik `acme.json` (DNS-01 LE).
|
||||
|
||||
**Subdomain:** `turn-matrix-rtc-320bd4.organic-machine.com` (cubierto por wildcard DNS + wildcard cert; no requiere DNS manual).
|
||||
|
||||
**Changes:**
- VPS repo `egutierrez/element_matrix_chat`, commit `f7f5303`: `docker-compose.livekit.yml` exposes the TURN ports and mounts the certs.
- `configs/livekit/livekit.yaml` (gitignored): `turn:` block with `enabled: true`, `external_tls: false`, and `cert_file`/`key_file` pointing to `/etc/livekit/certs/`.
- `configs/livekit/certs/{turn-cert.pem,turn-key.pem}` (gitignored): extracted from `/data/coolify/proxy/acme.json` via `jq | base64 -d`.
- UFW: `3478/udp` + `5349/tcp` ALLOW.
|
||||
|
||||
**Verification:**
- `nc -vz organic-machine.com 5349` -> succeeded
- `nc -vzu organic-machine.com 3478` -> succeeded
- `openssl s_client -connect turn-matrix-rtc-320bd4.organic-machine.com:5349` -> Verify return code: 0 (ok), wildcard cert served
- `docker logs livekit` -> `Starting TURN server {portTLS: 5349, portUDP: 3478, externalTLS: false}`
|
||||
|
||||
**Operator TODO (follow-up, does not block closing):**

1. **Cert rotation**: Traefik renews the wildcard automatically, but the PEM files extracted to `configs/livekit/certs/` go stale. Add a (monthly) cron job or a post-renew hook that re-extracts them from `acme.json` and runs `docker compose restart livekit`. Suggested script:
|
||||
```bash
#!/bin/bash
# Re-extract the wildcard cert/key from Traefik's acme.json and restart LiveKit.
set -euo pipefail
ACME=/data/coolify/proxy/acme.json
REPO=/home/ubuntu/CodeProyects/element_matrix_chat
DEST="$REPO/configs/livekit/certs"
# Certificates[0] assumes the wildcard cert is the first entry in acme.json.
sudo jq -r '.letsencrypt.Certificates[0].certificate' "$ACME" | base64 -d > "$DEST/turn-cert.pem"
umask 077  # if the key file is new, create it private from the start
sudo jq -r '.letsencrypt.Certificates[0].key' "$ACME" | base64 -d > "$DEST/turn-key.pem"
chmod 644 "$DEST/turn-cert.pem" && chmod 600 "$DEST/turn-key.pem"
docker compose -f "$REPO/docker-compose.yml" -f "$REPO/docker-compose.livekit.yml" restart livekit
```
|
||||
|
||||
2. **Real-usage DoD** (layer 3 of DoD Quality): still pending a test from mobile CGNAT plus 5 different networks. Acceptance items 1-2 can only be verified with real calls. Item 3 (TURN allocation logs) can be verified after the first call from a client behind symmetric NAT.

3. **No separate TURN shared secret**: the integrated LiveKit TURN reuses `LIVEKIT_API_KEY`/`LIVEKIT_API_SECRET` (HMAC-SHA1 with time-based credentials). It needs no rotation beyond that of the API key. To separate them, add a `turn_servers:` block with explicit credentials in `livekit.yaml`.

4. **Relay UDP range 30000-40000**: LiveKit advertises this range at startup (`turn.relay_range_start/end`). It is currently NOT exposed in docker-compose. It works because LiveKit, under bridge networking, reuses the existing ICE range (50000-50500) via SO_REUSEPORT for relayed traffic. If relays misbehave, expose 30000-40000/udp.
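
Item 3 mentions HMAC-SHA1 time-based credentials. As a rough illustration of how such credentials are commonly derived, the sketch below follows the generic TURN REST convention (the one coturn implements as `use-auth-secret`): the username carries an expiry timestamp, and the password is the base64 HMAC-SHA1 of that username under the shared secret. Whether LiveKit's integrated TURN derives its credentials exactly this way is not confirmed by this issue; the function names and the `demo-user` label are hypothetical.

```bash
# Sketch of TURN REST-style ephemeral credentials (generic convention;
# NOT confirmed to be LiveKit's internal scheme).

# turn_password SECRET USERNAME -> base64(HMAC-SHA1(SECRET, USERNAME))
turn_password() {
  local secret=$1 username=$2
  printf '%s' "$username" | openssl dgst -sha1 -hmac "$secret" -binary | base64
}

# turn_username TTL_SECONDS LABEL -> "<unix expiry>:<label>"
turn_username() {
  local ttl=$1 label=$2
  printf '%s:%s\n' "$(( $(date +%s) + ttl ))" "$label"
}

# Example (hypothetical values):
#   user=$(turn_username 3600 demo-user)
#   pass=$(turn_password "$LIVEKIT_API_SECRET" "$user")
```

Because the password is a pure function of the secret and the username, rotating the shared secret invalidates every outstanding credential at once, which matches the note that no rotation beyond the API key is needed.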
|
||||
|
||||
**Backups:** `configs/livekit/livekit.yaml.bak.20260524_224254` + `docker-compose.livekit.yml.bak.20260524_224254` on the VPS.
|
||||
@@ -0,0 +1,62 @@
|
||||
---
|
||||
id: "0167"
|
||||
title: "Remove STUN leak to Google in LiveKit (hardcode external_ip)"
|
||||
status: pendiente
|
||||
type: infra
|
||||
domain:
|
||||
- matrix
|
||||
scope: app:element_matrix_chat
|
||||
priority: baja
|
||||
depends: []
|
||||
blocks: []
|
||||
related: ["0166"]
|
||||
created: 2026-05-24
|
||||
updated: 2026-05-24
|
||||
tags: [matrix, livekit, privacy, stun]
|
||||
---
|
||||
# 0167 - Remove STUN leak to Google in LiveKit (hardcode external_ip)
|
||||
|
||||
**Status:** pendiente
|
||||
**Created:** 2026-05-24
|
||||
**Type:** infra
|
||||
**Priority:** baja
|
||||
**Domain:** matrix
|
||||
**Scope:** app:element_matrix_chat
|
||||
**Depends:** —
|
||||
**Blocks:** —
|
||||
|
||||
## Problem

With `rtc.use_external_ip: true` and an empty `external_ip`, LiveKit sends a STUN query to `stun.l.google.com:19302` on every startup to discover its public IP. This leaks server metadata (the VPS IP) to Google and contradicts the "self-host, privacy first" premise.

## Goal

LiveKit knows its public IP without contacting external STUN servers.
|
||||
|
||||
## Plan
|
||||
|
||||
1. Determine the VPS public IP: `curl -s ifconfig.me`.
2. Edit `configs/livekit/livekit.yaml`:
   ```yaml
   rtc:
     use_external_ip: false
     node_ip: "<PUBLIC_IP>"
   ```
3. If a self-hosted TURN is deployed (issue 0166), use coturn as a self-hosted STUN server.
4. Restart `element_matrix_chat-livekit-1`.
5. Test: calls work as before.
6. Audit: `docker logs element_matrix_chat-livekit-1 2>&1 | grep -i stun` shows no queries to Google.
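
Step 6 can be wrapped in a small check that fails loudly when the logs still mention Google's STUN servers. This is a minimal sketch, assuming the container name from step 4. The function name `count_google_stun` is hypothetical. The exact wording of LiveKit's STUN log lines is not assumed, only that they contain the hostname.

```bash
# count_google_stun: read log text on stdin and print how many lines mention a
# Google STUN host (case-insensitive). Prints 0 when there are none.
count_google_stun() {
  grep -ciE 'stun[0-9]*\.l\.google\.com' || true
}

# Usage against the running container (docker logs writes to stdout and stderr):
#   n=$(docker logs element_matrix_chat-livekit-1 2>&1 | count_google_stun)
#   [ "$n" -eq 0 ] || { echo "STUN leak: $n log lines" >&2; exit 1; }
```

A clean log is weaker evidence than a clean packet capture, so this complements the `tcpdump` acceptance check rather than replacing it.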
|
||||
|
||||
## Acceptance
|
||||
|
||||
- [ ] `tcpdump -i eth0 dst stun.l.google.com` captures no packets after restart.
- [ ] Element Call calls keep working, both 1:1 and group.
|
||||
|
||||
## Definition of Done
|
||||
|
||||
- [ ] Repeatability: after a VPS reboot, 0 packets to stun.l.google.com.
- [ ] Observability: the LiveKit log confirms the hardcoded IP.
|
||||
|
||||
## Notes

Low operational impact, but high consistency with the self-host doctrine. If the VPS IP changes (rare with a static VPS), update the config manually or automate it with a healthcheck script.
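
Such a healthcheck could look roughly like this. It is a sketch under assumptions: `node_ip` sits on its own line in `livekit.yaml` as in the plan, the helper names are hypothetical, and the current public IP is passed in by the caller (how to obtain it without another third-party lookup is left open).

```bash
# configured_node_ip FILE -> value of the first `node_ip:` line, quotes stripped.
configured_node_ip() {
  sed -n 's/^[[:space:]]*node_ip:[[:space:]]*"\{0,1\}\([^"#[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# check_node_ip FILE CURRENT_IP -> exit 0 if they match, 1 (with a message) otherwise.
check_node_ip() {
  local configured
  configured=$(configured_node_ip "$1")
  if [ "$configured" = "$2" ]; then
    echo "ok: node_ip=$configured"
  else
    echo "mismatch: config has '$configured', current IP is '$2'" >&2
    return 1
  fi
}
```

Run from cron, a non-zero exit is enough to trigger an alert; the script deliberately does not rewrite the config on its own.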
|
||||
@@ -0,0 +1,58 @@
|
||||
---
|
||||
id: "0168"
|
||||
title: "Expand LiveKit UDP range from 200 to 500 ports"
|
||||
status: pendiente
|
||||
type: infra
|
||||
domain:
|
||||
- matrix
|
||||
scope: app:element_matrix_chat
|
||||
priority: baja
|
||||
depends: []
|
||||
blocks: []
|
||||
related: ["0166"]
|
||||
created: 2026-05-24
|
||||
updated: 2026-05-24
|
||||
tags: [matrix, livekit, scaling, webrtc]
|
||||
---
|
||||
# 0168 - Expand LiveKit UDP range from 200 to 500 ports
|
||||
|
||||
**Status:** pendiente
|
||||
**Created:** 2026-05-24
|
||||
**Type:** infra
|
||||
**Priority:** baja
|
||||
**Domain:** matrix
|
||||
**Scope:** app:element_matrix_chat
|
||||
**Depends:** —
|
||||
**Blocks:** —
|
||||
|
||||
## Problem

LiveKit is configured with `port_range_start: 50000`, `port_range_end: 50200` (200 UDP ports). Each participant uses ~2 ports, which caps the server at **~100 concurrent participants** across ALL calls. That is fine for today's personal use, but tight if simultaneous groups or meetings with more than 10 people are added.

## Goal

Sustain at least 250 concurrent participants without port exhaustion.
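
The capacity figures in the problem and goal follow from simple arithmetic, sketched here as a helper. It takes this issue's own estimate of ~2 UDP ports per participant as given and counts the range as `end - start`, matching the 200 and 500 figures used here. `participant_capacity` is a hypothetical name.

```bash
# participant_capacity START END PORTS_PER_PARTICIPANT
# Approximate number of concurrent participants a UDP port range can serve.
participant_capacity() {
  local start=$1 end=$2 per=$3
  echo $(( (end - start) / per ))
}

participant_capacity 50000 50200 2   # current range  -> 100
participant_capacity 50000 50500 2   # proposed range -> 250
```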
|
||||
|
||||
## Plan
|
||||
|
||||
1. Edit `configs/livekit/livekit.yaml`: `port_range_end: 50500`.
2. Update `docker-compose.yml` to expose the expanded range (300 additional UDP ports).
3. Open the range in the VPS firewall (UFW/iptables).
4. Restart the LiveKit stack.
5. Smoke test: a call works.
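
Steps 2 and 3 amount to one compose port mapping and one firewall rule. The dry-run sketch below only prints them, so nothing changes until the output has been reviewed. The function names are hypothetical, and the UFW form is shown on the assumption that UFW (rather than raw iptables) manages the firewall on this VPS.

```bash
# compose_udp_range START END -> docker-compose `ports:` entry for the range.
compose_udp_range() {
  printf -- '- "%d-%d:%d-%d/udp"\n' "$1" "$2" "$1" "$2"
}

# ufw_udp_range START END -> the ufw command that would open the range
# (ufw writes port ranges with a colon).
ufw_udp_range() {
  printf 'ufw allow %d:%d/udp\n' "$1" "$2"
}

compose_udp_range 50000 50500
ufw_udp_range 50000 50500
```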
|
||||
|
||||
## Acceptance
|
||||
|
||||
- [ ] `docker port element_matrix_chat-livekit-1` shows 50000-50500 UDP.
- [ ] `ss -lun | grep -c "0.0.0.0:50"` >= 500 after restart.
- [ ] Test call OK.
|
||||
|
||||
## Definition of Done
|
||||
|
||||
- [ ] Repeatability: the stack restarts cleanly.
|
||||
|
||||
## Notes

`docker-compose.yml` currently lists the 200 ports one by one (verbose but explicit). Consider the range syntax `"50000-50500:50000-50500/udp"` for readability.

Do NOT go above 1000 ports without measuring LiveKit's memory use: each assigned port has minimal overhead, but it adds up.
|
||||
@@ -0,0 +1,60 @@
|
||||
---
|
||||
id: "0169"
|
||||
title: "Rotate LIVEKIT_SECRET (exposed during audit session)"
|
||||
status: pendiente
|
||||
type: bugfix
|
||||
domain:
|
||||
- matrix
|
||||
scope: app:element_matrix_chat
|
||||
priority: alta
|
||||
depends: []
|
||||
blocks: []
|
||||
related: []
|
||||
created: 2026-05-24
|
||||
updated: 2026-05-24
|
||||
tags: [matrix, livekit, security, secret-rotation]
|
||||
---
|
||||
# 0169 - Rotate LIVEKIT_SECRET (exposed during audit session)
|
||||
|
||||
**Status:** pendiente
|
||||
**Created:** 2026-05-24
|
||||
**Type:** bugfix
|
||||
**Priority:** alta
|
||||
**Domain:** matrix
|
||||
**Scope:** app:element_matrix_chat
|
||||
**Depends:** —
|
||||
**Blocks:** —
|
||||
|
||||
## Problem

During the 2026-05-24 audit (Claude session), `docker inspect element_matrix_chat-livekit-jwt-1` dumped `LIVEKIT_SECRET=b00e98f70722bc...` in cleartext to the session's stdout. Although the session belongs to the operator, the secret ended up in the conversation log and potentially in backups of that log and in transcripts. Rotation is needed as a matter of hygiene.

## Goal

A new 32-byte hex secret, the same `api_key` (or regenerate both), and a stack restart without losing sessions.
|
||||
|
||||
## Plan
|
||||
|
||||
1. Generate a new secret: `openssl rand -hex 32`.
2. Edit `configs/livekit/livekit.yaml` → `keys:` block with the new value.
3. Edit the docker-compose `.env` (variable `LIVEKIT_SECRET`, consumed by `livekit-jwt`).
4. Restart `element_matrix_chat-livekit-1` and `element_matrix_chat-livekit-jwt-1`, in that order.
5. Test an Element Call call → JWT handshake OK.
6. Store the old and new secrets in `pass` with the rotation timestamp.
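
Steps 1 and 3 lend themselves to scripting. The sketch below generates the secret and rewrites a single variable in an env file. It does not touch `livekit.yaml`, restart containers, or write to `pass`. The function names are hypothetical, and the `/dev/urandom` fallback is an addition for hosts without `openssl`.

```bash
# generate_secret -> 32 random bytes as 64 hex characters.
generate_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    od -An -N32 -tx1 /dev/urandom | tr -d ' \n'; echo
  fi
}

# set_env_var FILE NAME VALUE: drop any existing NAME= line from FILE and
# append NAME=VALUE. Works on a temp copy so a failure never leaves a
# half-written file; the result is left with mode 600.
set_env_var() {
  local file=$1 name=$2 value=$3 tmp
  tmp=$(mktemp)
  grep -v "^${name}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$name" "$value" >> "$tmp"
  chmod 600 "$tmp"
  mv "$tmp" "$file"
}

# Usage (the path is an assumption):
#   set_env_var /path/to/element_matrix_chat/.env LIVEKIT_SECRET "$(generate_secret)"
```

Passing the secret through a command substitution keeps it out of shell history, though it is still briefly visible in the process arguments of `printf`-free callers such as `docker inspect`, which is how the original leak happened.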
|
||||
|
||||
## Acceptance
|
||||
|
||||
- [ ] `docker inspect ... --format "{{.Config.Env}}"` shows the new secret.
- [ ] Element Call starts a call without an "invalid token" error.
- [ ] The `pass matrix/livekit-secret` entry is updated.
|
||||
|
||||
## Definition of Done
|
||||
|
||||
- [ ] Repeatability: rotation documented as a registry function (candidate `livekit_secret_rotate_bash_infra`).
- [ ] Observability: rotation log with timestamp.
|
||||
|
||||
## Notes

Consider promoting the procedure to a registry function: `livekit_secret_rotate_bash_infra(ssh_host, compose_dir)`, which automates steps 1-5 and stores the result in pass via `gpg_pass_write`.

A similar pattern applies to the stack's other secrets (Synapse macaroon, MAS encryption key, postgres passwords) → new capability group `secret-rotation`.
|
||||
@@ -0,0 +1,55 @@
|
||||
---
|
||||
id: "0170"
|
||||
title: "Rename livekit.example.yaml -> livekit.yaml in bind mount"
|
||||
status: pendiente
|
||||
type: chore
|
||||
domain:
|
||||
- matrix
|
||||
scope: app:element_matrix_chat
|
||||
priority: baja
|
||||
depends: []
|
||||
blocks: []
|
||||
related: []
|
||||
created: 2026-05-24
|
||||
updated: 2026-05-24
|
||||
tags: [matrix, livekit, hygiene]
|
||||
---
|
||||
# 0170 - Rename livekit.example.yaml -> livekit.yaml in bind mount
|
||||
|
||||
**Status:** pendiente
|
||||
**Created:** 2026-05-24
|
||||
**Type:** chore
|
||||
**Priority:** baja
|
||||
**Domain:** matrix
|
||||
**Scope:** app:element_matrix_chat
|
||||
**Depends:** —
|
||||
**Blocks:** —
|
||||
|
||||
## Problem

`configs/livekit/livekit.yaml` still carries the "Copy this file..." comments from the original template. It works but is confusing: it looks like an unfinished config. The bind mount points directly at this file, so the fix is to split out a clean template file and keep `livekit.yaml` itself clean for maintenance.

## Goal

A clean `livekit.yaml` with no "example" comments, and a separate `livekit.example.yaml` kept in the repo as the initial template reference.
|
||||
|
||||
## Plan
|
||||
|
||||
1. Create `configs/livekit/livekit.example.yaml` with a clean template (placeholders).
2. Remove the "Copy this file..." comments from the current `livekit.yaml`.
3. Verify that `.gitignore` covers the real `livekit.yaml` but not `livekit.example.yaml`.
4. Commit to `egutierrez/element_matrix_chat`.
|
||||
|
||||
## Acceptance
|
||||
|
||||
- [ ] `head -3 configs/livekit/livekit.yaml` does NOT mention "example".
- [ ] `configs/livekit/livekit.example.yaml` is versioned.
- [ ] Stack restarts with no functional changes.
|
||||
|
||||
## Definition of Done
|
||||
|
||||
- [ ] PR merged in `dataforge/element_matrix_chat`.
|
||||
|
||||
## Notes

Pure hygiene task. Zero runtime impact. Improves future onboarding if another operator clones the repo.
|
||||