id: 0131
title: agents v0.2: per-agent control in unified mode + uptime/msg_24h + data_table_cpp_viz + clear/cache actions
status: pendiente
type: feature
domain: agents, tui, infra
scope: app
priority: alta
depends: 0128, 0129
blocks:
related:
created: 2026-05-22
updated: 2026-05-22
tags: agents_and_robots, agents_dashboard, http, unified-mode, data-table, control

dod_evidence_schema (format: id [kind, required]: expected)

  • build_backend [cmd, required=true]: cd projects/element_agents/apps/agents_and_robots && go build -tags goolm ./... → exit 0
  • tests_backend [cmd, required=true]: cd projects/element_agents/apps/agents_and_robots && go test -tags goolm -count=1 ./internal/api/... → exit 0
  • stop_unified_works [cmd, required=true]: POST /agents/test-bot/stop returns {status:stopped}; GET /agents/test-bot → running=false in <2s
  • start_unified_works [cmd, required=true]: POST /agents/test-bot/start after a stop returns {status:started}; GET /agents/test-bot → running=true in <5s
  • restart_unified_works [cmd, required=true]: POST /agents/test-bot/restart on a running agent leaves running=true in <8s with no error
  • clear_memory_endpoint [cmd, required=true]: POST /agents/test-bot/clear_memory returns {status:cleared, messages_deleted:N}; SELECT COUNT(*) FROM messages WHERE agent_id='test-bot' == 0
  • delete_cache_endpoint [cmd, required=true]: POST /agents/test-bot/delete_cache returns {status:cleared, paths_deleted:[...]}; verify that the crypto.db cache was deleted
  • uptime_exposed [cmd, required=true]: GET /agents includes field uptime_seconds:int >0 for running agents
  • msg_24h_exposed [cmd, required=true]: GET /agents includes field messages_24h:int (may be 0), computed from the messages table
  • build_frontend [cmd, required=true]: cmake --build cpp/build/windows --target agents_dashboard -j → exit 0
  • data_table_cpp_viz_used [cmd, required=true]: grep -E 'BeginTable|EndTable' projects/element_agents/apps/agents_dashboard/main.cpp returns 0 lines (migrated to data_table_cpp_viz); grep data_table_cpp_viz app.md uses_functions = 1
  • per_agent_buttons_rendered [screenshot, required=true]: Agents table shows >=5 buttons per row: Start, Stop, Restart, Clear Memory, Delete Cache (icons + tooltip are acceptable)
  • uptime_visible [screenshot, required=true]: Agents table uptime column shows a humanized value (e.g. 12h, 3d) for running agents
  • msg_24h_visible [screenshot, required=true]: Agents table msg/24h column shows a real counter (not 'instances' as a hack)
  • e2e_tests_pass [cmd, required=true]: AGENTS_API_KEY=... pytest tests/test_connect_e2e.py → all PASS (>=20 tests)
  • e2e_control_roundtrip [cmd, required=true]: New test_control_roundtrip: stop → poll running=false → start → poll running=true → restart → poll running=true. All within 30s.
  • e2e_clear_memory [cmd, required=true]: New test_clear_memory: insert rows into messages → POST /clear_memory → COUNT == 0

0131: agents v0.2, full per-agent control + data_table + new buttons

Context

v0.1 (issues 0128+0129) delivered:

  • HTTP API + apikey + TLS + SSE
  • C++ frontend with Connection/Agents/Logs/Status feed
  • Agents table with running derived from the backend

Gaps found during real-world use:

  1. Per-agent control is broken in unified mode: Manager.Start/Stop expect per-agent PID files; in unified mode those do not exist, so the endpoints return confusing errors ("not running" for an agent that IS running).
  2. No real uptime or msg_24h: the backend does not expose those fields. The UI shows instances as a stand-in for msg_24h.
  3. Management actions are missing: clear memory (messages in SQLite), delete cache (E2EE crypto), reset state.
  4. Hand-rolled table: ImGui::BeginTable inline in main.cpp. The registry has data_table_cpp_viz (the canonical function). Migrate.

Scope v0.2

Backend (projects/element_agents/apps/agents_and_robots/)

1. Per-agent control in unified mode

Today the launcher starts all agents as goroutines under a single PID via mode "unified". The current Manager.Start/Stop/Restart only work in multi-process mode (one PID per agent).

Add a per-agent cancel-context registry to the launcher:

  • For each agent started as a goroutine, store its context.CancelFunc in Manager.unifiedCancels map[string]context.CancelFunc.
  • Manager.StopUnifiedAgent(id) calls the cancel function of that specific agent.
  • Manager.StartUnifiedAgent(id) restarts only that agent, without restarting the whole launcher.
  • Manager.RestartUnifiedAgent(id) = Stop + Start.

The handleStart/Stop/Restart handlers auto-detect the mode via IsUnifiedRunning() and delegate to the new unified variants.
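
The registry described above could look roughly like the following. This is a minimal sketch, not the real shell/process.Manager: AgentFunc, unifiedEntry, NewManager, IsRunning, idleAgent and the two error values are hypothetical names, and the sketch stores a small struct per agent (cancel function plus a done channel) instead of the bare map[string]context.CancelFunc, so that Stop can wait for the goroutine to exit before a following Start.

```go
package main

import (
	"context"
	"errors"
	"sync"
)

// AgentFunc is a hypothetical stand-in for one agent's main loop.
// It must return once ctx is cancelled.
type AgentFunc func(ctx context.Context, id string)

// unifiedEntry tracks one agent goroutine (assumption: not in the issue).
type unifiedEntry struct {
	cancel context.CancelFunc
	done   chan struct{} // closed when the goroutine has returned
}

// Manager is a reduced sketch of the unified-mode bookkeeping only.
type Manager struct {
	mu      sync.Mutex
	run     AgentFunc
	unified map[string]*unifiedEntry
}

var (
	ErrAlreadyRunning = errors.New("agent already running")
	ErrNotRunning     = errors.New("agent not running")
)

func NewManager(run AgentFunc) *Manager {
	return &Manager{run: run, unified: make(map[string]*unifiedEntry)}
}

// StartUnifiedAgent launches a single agent goroutine and registers its cancel.
func (m *Manager) StartUnifiedAgent(id string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.unified[id]; ok {
		return ErrAlreadyRunning
	}
	ctx, cancel := context.WithCancel(context.Background())
	e := &unifiedEntry{cancel: cancel, done: make(chan struct{})}
	m.unified[id] = e
	go func() {
		defer close(e.done)
		m.run(ctx, id)
	}()
	return nil
}

// StopUnifiedAgent cancels one agent and waits for its goroutine to return.
// Other agents and the launcher itself are untouched.
func (m *Manager) StopUnifiedAgent(id string) error {
	m.mu.Lock()
	e, ok := m.unified[id]
	if ok {
		delete(m.unified, id)
	}
	m.mu.Unlock()
	if !ok {
		return ErrNotRunning
	}
	e.cancel()
	<-e.done // blocks forever if the agent ignores ctx; a real version needs a timeout
	return nil
}

// RestartUnifiedAgent is Stop + Start; restarting a stopped agent just starts it.
func (m *Manager) RestartUnifiedAgent(id string) error {
	if err := m.StopUnifiedAgent(id); err != nil && !errors.Is(err, ErrNotRunning) {
		return err
	}
	return m.StartUnifiedAgent(id)
}

// IsRunning reports whether the agent is currently registered.
func (m *Manager) IsRunning(id string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	_, ok := m.unified[id]
	return ok
}

// idleAgent is a placeholder agent loop that blocks until cancelled.
func idleAgent(ctx context.Context, id string) { <-ctx.Done() }
```

With NewManager(idleAgent), starting "test-bot" and a second agent and then stopping "test-bot" leaves the second agent registered. The sketch does not handle an agent goroutine that exits on its own (crash); the entry would stay registered until Stop is called.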

2. Real uptime

  • Manager.startedAt map[string]time.Time, populated when each goroutine starts.
  • In AgentStatus.UptimeSeconds, compute time.Since(startedAt[id]).Seconds() (converted to int) if running, else 0.
  • Expose it in agentResponse as uptime_seconds: int.

3. Messages_24h

Each agent persists its messages in its own SQLite database (agents/<id>/data/memory.db). The handleListAgents handler must aggregate per agent:

  • Open the agent's DB read-only
  • SELECT COUNT(*) FROM messages WHERE created_at > datetime('now', '-24 hours')
  • Cache the result for 30s to avoid opening the DB on every request

Expose it as messages_24h: int.

4. Endpoint POST /agents/{id}/clear_memory

  • Stop the agent (if running)
  • Open the agent's memory.db
  • DELETE FROM messages + DELETE FROM facts
  • Optionally start it again if it was running (optional ?restart=true)
  • Return {status:"cleared", messages_deleted:N, facts_deleted:M}

5. Endpoint POST /agents/{id}/delete_cache

  • Stop the agent (if running)
  • Delete the agents/<id>/data/crypto/ directory (E2EE cache; the agent re-initializes on next start)
  • Delete agents/<id>/data/cache/* if it exists
  • Optionally start it again if it was running (?restart=true)
  • Return {status:"cleared", paths_deleted:[...]}

NOTE: delete_cache forces E2EE re-verification. The agent must re-authenticate via its SSSS recovery key on next start. Document this.

Frontend (projects/element_agents/apps/agents_dashboard/)

1. Migrate to data_table_cpp_viz

Today main.cpp uses ImGui::BeginTable inline. Replace it with data_table::Table from the registry (function data_table_cpp_viz). Add it to app.md::uses_functions. Verify via fn doctor cpp-apps that the app goes from CANDIDATE to clean.

2. Table columns:

  • id
  • status icon (running=green, stopped=gray, disabled=yellow, crashed=red)
  • uptime (humanized via human_duration_secs)
  • msg/24h (real number, NOT instances)
  • actions (5 grouped buttons):
    • ▶ Start (disabled if running)
    • ⏹ Stop (disabled if !running)
    • ↻ Restart
    • 🧠 Clear Memory (confirmation modal)
    • 🗑 Delete Cache (confirmation modal)

3. Keep sort + filter via the data_table_cpp_viz API.

E2E (tests/)

Add 7 new tests:

  • test_control_roundtrip: stop → poll → start → poll → restart → poll. Uses test-bot.
  • test_clear_memory: POST clear_memory, verify COUNT(*) FROM messages == 0.
  • test_delete_cache: POST delete_cache, verify crypto/ is deleted.
  • test_uptime_field_present: /agents response includes the uptime_seconds key
  • test_msg_24h_field_present: /agents response includes the messages_24h key
  • test_unified_stop_does_not_kill_launcher: after stopping 1 agent, the others keep running.
  • test_clear_memory_requires_apikey: without Bearer → 401

Tasks

Phase A: Backend (agents_and_robots)

  1. Add unifiedCancels map[string]context.CancelFunc + startedAt map[string]time.Time + mutex to shell/process.Manager.
  2. Hook into the launcher runtime to register/unregister cancels when each agent goroutine starts/stops.
  3. Implement StopUnifiedAgent, StartUnifiedAgent, RestartUnifiedAgent (Stop+Start).
  4. Refactor the handleStartAgent/Stop/Restart handlers to auto-detect unified vs multi.
  5. Add uptime_seconds and messages_24h to AgentResponse. Implement the 24h query with a 30s cache.
  6. Implement the handleClearMemory and handleDeleteCache handlers.
  7. Add the routes in server.go.
  8. Go unit tests in internal/api/*_test.go.

Phase B: Frontend (agents_dashboard)

  1. Change parse_agents to read uptime_seconds and messages_24h from the backend.
  2. Migrate the table to data_table_cpp_viz. Keep filter + sort.
  3. Add 5 buttons per row (Start/Stop/Restart/Clear/Delete).
  4. Confirmation modal for Clear/Delete.
  5. Update app.md::uses_functions with data_table_cpp_viz.

Phase C: E2E + verify

  1. Add 7 pytest tests.
  2. Run all e2e tests from the registry venv. >=20 tests pass.
  3. Rebuild the .exe + redeploy to Windows.
  4. Visual confirmation: buttons, uptime, msg_24h.

Acceptance

  • All 17 DoD items green (cmd + screenshots).
  • >=20 e2e tests passing.
  • App C++ deployed to Windows Desktop, visible buttons + working roundtrip.
  • Backend unit tests pass.
  • No regression: existing 0128 + 0129 functionality intact (the v0.1 curl smoke test stays green).

Human DoD

  • Where: Windows Desktop → agents_dashboard.exe → Agents table.
  • Latency: stop → running=false reflected in the UI within 2s (via SSE status diff). A 30s refresh for msg/24h is acceptable.
  • Onboarding: the tooltip on the "Clear Memory" button explains that it deletes messages; the "Delete Cache" tooltip explains that the agent will have to re-authenticate via SSSS when it starts again.

Risks

  • The unified-mode Manager refactor touches the launcher lifecycle (step ~7 of the create_agent pipeline). Existing tests must pass.
  • delete_cache deletes the crypto store; the agent must be able to re-verify via the env var SSSS_RECOVERY_KEY_<NORM>. If that env var is not set, the agent is left in a degraded state. Validate before deleting.
  • data_table_cpp_viz may have API limits that inline ImGui does not (custom sort, alignment). Verify before migrating.