feat(issues): 0131 agents v0.2 — unified control + uptime/msg_24h + data_table + clear/cache
@@ -0,0 +1,235 @@
---
id: "0131"
title: "agents v0.2: control per-agent unified mode + uptime/msg_24h + data_table_cpp_viz + clear/cache actions"
status: pendiente
type: feature
domain:
  - agents
  - tui
  - infra
scope: app
priority: alta
depends:
  - "0128"
  - "0129"
blocks: []
related: []
created: 2026-05-22
updated: 2026-05-22
tags: [agents_and_robots, agents_dashboard, http, unified-mode, data-table, control]
dod_evidence_schema:
  # Backend: agents_and_robots
  - id: build_backend
    kind: cmd
    expected: "cd projects/element_agents/apps/agents_and_robots && go build -tags goolm ./... → exit 0"
    required: true
  - id: tests_backend
    kind: cmd
    expected: "cd projects/element_agents/apps/agents_and_robots && go test -tags goolm -count=1 ./internal/api/... → exit 0"
    required: true
  - id: stop_unified_works
    kind: cmd
    expected: "POST /agents/test-bot/stop returns {status:stopped}; GET /agents/test-bot → running=false within <2s"
    required: true
  - id: start_unified_works
    kind: cmd
    expected: "POST /agents/test-bot/start after a stop returns {status:started}; GET /agents/test-bot → running=true within <5s"
    required: true
  - id: restart_unified_works
    kind: cmd
    expected: "POST /agents/test-bot/restart on a running agent leaves running=true within <8s with no error"
    required: true
  - id: clear_memory_endpoint
    kind: cmd
    expected: "POST /agents/test-bot/clear_memory returns {status:cleared, messages_deleted:N}; SELECT COUNT(*) FROM messages WHERE agent_id='test-bot' == 0"
    required: true
  - id: delete_cache_endpoint
    kind: cmd
    expected: "POST /agents/test-bot/delete_cache returns {status:cleared, paths_deleted:[...]}; verify that the crypto.db cache was deleted"
    required: true
  - id: uptime_exposed
    kind: cmd
    expected: "GET /agents includes field uptime_seconds:int >0 for running agents"
    required: true
  - id: msg_24h_exposed
    kind: cmd
    expected: "GET /agents includes field messages_24h:int (may be 0) computed from the messages table"
    required: true
  # Frontend: agents_dashboard
  - id: build_frontend
    kind: cmd
    expected: "cmake --build cpp/build/windows --target agents_dashboard -j → exit 0"
    required: true
  - id: data_table_cpp_viz_used
    kind: cmd
    expected: "grep -E 'BeginTable|EndTable' projects/element_agents/apps/agents_dashboard/main.cpp returns 0 lines (migrated to data_table_cpp_viz); grep data_table_cpp_viz app.md uses_functions = 1"
    required: true
  - id: per_agent_buttons_rendered
    kind: screenshot
    expected: "Agents table shows >=5 buttons per row: Start, Stop, Restart, Clear Memory, Delete Cache (icons + tooltip allowed)"
    required: true
  - id: uptime_visible
    kind: screenshot
    expected: "Agents table uptime column shows a humanized value (e.g. 12h, 3d) for running agents"
    required: true
  - id: msg_24h_visible
    kind: screenshot
    expected: "Agents table msg/24h column shows the real counter (not 'instances' as a hack)"
    required: true
  # E2E: pytest
  - id: e2e_tests_pass
    kind: cmd
    expected: "AGENTS_API_KEY=... pytest tests/test_connect_e2e.py → all PASS (>=20 tests)"
    required: true
  - id: e2e_control_roundtrip
    kind: cmd
    expected: "New test_control_roundtrip: stop → poll running=false → start → poll running=true → restart → poll running=true. All within 30s."
    required: true
  - id: e2e_clear_memory
    kind: cmd
    expected: "New test_clear_memory: insert rows into messages → POST /clear_memory → COUNT == 0"
    required: true
---

# 0131 - agents v0.2: full per-agent control + data_table + new buttons

## Context

v0.1 (issues 0128 + 0129) delivered:
- HTTP API + apikey + TLS + SSE
- C++ frontend with Connection/Agents/Logs/Status feed
- Agents table with `running` derived from the backend

**Gaps found during real use:**
1. **Per-agent control is broken in unified mode.** `Manager.Start/Stop` expect per-agent PID files; in unified mode those files do not exist, so the endpoints return confusing errors ("not running" for an agent that IS running).
2. **No real uptime or msg_24h.** The backend does not expose those fields. The UI shows `instances` as a stand-in hack for msg_24h.
3. **Management actions are missing:** clear memory (messages in SQLite), delete cache (E2EE crypto), reset state.
4. **Hand-rolled table.** `ImGui::BeginTable` is used inline in main.cpp. The registry already has `data_table_cpp_viz` (the canonical function). Migrate to it.

## Scope v0.2

### Backend (`projects/element_agents/apps/agents_and_robots/`)

**1. Per-agent control in unified mode**

Today the launcher starts all agents as goroutines under a single PID via mode "unified". The current `Manager.Start/Stop/Restart` only work in multi-process mode (one PID per agent).

Add a per-agent cancel-context registry to the launcher:
- For each agent started as a goroutine, store its `context.CancelFunc` in `Manager.unifiedCancels map[string]context.CancelFunc`.
- `Manager.StopUnifiedAgent(id)` calls the cancel function of that specific agent.
- `Manager.StartUnifiedAgent(id)` restarts only that agent, without restarting the whole launcher.
- `Manager.RestartUnifiedAgent(id)` = Stop + Start.

The `handleStart/Stop/Restart` handlers autodetect the mode via `IsUnifiedRunning()` and delegate to the new unified variants.

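The cancel-context registry described above could be sketched roughly as follows. This is a minimal illustration, not the real `shell/process.Manager`: the `RunFunc` type, the `done` map, `NewManager`, `IsRunning` and `blockUntilCancelled` are all assumptions made for the example. The sketch waits for the goroutine to exit inside Stop so that a restart never overlaps with the previous instance, and it does not handle an agent that exits on its own (its entry would stay registered).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// RunFunc is a hypothetical agent entry point; it must return once ctx is cancelled.
type RunFunc func(ctx context.Context, id string)

// Manager is an illustrative stand-in for shell/process.Manager.
type Manager struct {
	mu             sync.Mutex
	unifiedCancels map[string]context.CancelFunc
	startedAt      map[string]time.Time
	done           map[string]chan struct{} // assumed helper: closed when the goroutine exits
	run            RunFunc
}

func NewManager(run RunFunc) *Manager {
	return &Manager{
		unifiedCancels: make(map[string]context.CancelFunc),
		startedAt:      make(map[string]time.Time),
		done:           make(map[string]chan struct{}),
		run:            run,
	}
}

// StartUnifiedAgent launches one agent goroutine and registers its cancel func.
func (m *Manager) StartUnifiedAgent(id string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.unifiedCancels[id]; ok {
		return errors.New("agent already running: " + id)
	}
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	m.unifiedCancels[id] = cancel
	m.startedAt[id] = time.Now()
	m.done[id] = done
	go func() {
		defer close(done)
		m.run(ctx, id)
	}()
	return nil
}

// StopUnifiedAgent cancels only the given agent and waits for its goroutine to exit.
func (m *Manager) StopUnifiedAgent(id string) error {
	m.mu.Lock()
	cancel, ok := m.unifiedCancels[id]
	done := m.done[id]
	delete(m.unifiedCancels, id)
	delete(m.startedAt, id)
	delete(m.done, id)
	m.mu.Unlock()
	if !ok {
		return errors.New("agent not running: " + id)
	}
	cancel()
	<-done
	return nil
}

// RestartUnifiedAgent is Stop followed by Start.
func (m *Manager) RestartUnifiedAgent(id string) error {
	if err := m.StopUnifiedAgent(id); err != nil {
		return err
	}
	return m.StartUnifiedAgent(id)
}

// IsRunning reports whether a cancel func is registered for the agent.
func (m *Manager) IsRunning(id string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	_, ok := m.unifiedCancels[id]
	return ok
}

// blockUntilCancelled is a fake agent body used only for the demo.
func blockUntilCancelled(ctx context.Context, id string) { <-ctx.Done() }

func main() {
	m := NewManager(blockUntilCancelled)
	_ = m.StartUnifiedAgent("a")
	_ = m.StartUnifiedAgent("b")
	_ = m.StopUnifiedAgent("a")
	fmt.Println(m.IsRunning("a"), m.IsRunning("b")) // false true
}
```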
**2. Real uptime**

- `Manager.startedAt map[string]time.Time`, populated when each goroutine starts.
- In `AgentStatus.UptimeSeconds`, compute `int(time.Since(startedAt[id]).Seconds())` if running, else 0 (`Seconds()` returns a `float64`, so the conversion is needed for an int field).
- Expose it in `agentResponse` as `uptime_seconds: int`.

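A small sketch of the uptime calculation, assuming a pure helper that takes the current time as a parameter so it can be tested without sleeping. The name `uptimeSeconds` and the `demoStart` value are illustrative only. Note that truncating to whole seconds yields 0 during the agent's first second, which may matter for the `uptime_seconds > 0` DoD check if the agent is queried immediately after start.

```go
package main

import (
	"fmt"
	"time"
)

// demoStart is an arbitrary fixed start time used only by the example.
var demoStart = time.Date(2026, 5, 22, 10, 0, 0, 0, time.UTC)

// uptimeSeconds returns whole seconds elapsed since startedAt,
// or 0 when the agent is not running or the timestamps are unusable.
func uptimeSeconds(running bool, startedAt, now time.Time) int {
	if !running || startedAt.IsZero() || now.Before(startedAt) {
		return 0
	}
	return int(now.Sub(startedAt).Seconds())
}

func main() {
	now := demoStart.Add(90 * time.Second)
	fmt.Println(uptimeSeconds(true, demoStart, now))  // 90
	fmt.Println(uptimeSeconds(false, demoStart, now)) // 0
}
```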
**3. Messages_24h**

Each agent persists its messages in its own SQLite database (`agents/<id>/data/memory.db`). For every agent, the `handleListAgents` handler must:
- Open the agent's DB read-only
- Run `SELECT COUNT(*) FROM messages WHERE created_at > datetime('now', '-24 hours')`
- Cache the result for 30s to avoid opening the DB on every request

Expose it as `messages_24h: int`.

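The 30s cache could look something like the sketch below. The SQLite query is abstracted behind a hypothetical `CountFunc` so the example stays free of driver dependencies; `Msg24hCache` and its fields are assumed names. The sketch holds one lock across the query, which serializes lookups for all agents; a real implementation might prefer per-agent locking. Separately, the `created_at > datetime('now', '-24 hours')` comparison only behaves as intended if `created_at` is stored as UTC text in the `YYYY-MM-DD HH:MM:SS` form that `datetime()` returns, which is worth confirming against the actual schema.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// CountFunc stands in for the per-agent SQLite COUNT(*) query (hypothetical).
type CountFunc func(agentID string) (int, error)

type cachedCount struct {
	value   int
	fetched time.Time
}

// Msg24hCache memoizes per-agent counts for a fixed TTL.
type Msg24hCache struct {
	mu    sync.Mutex
	ttl   time.Duration
	now   func() time.Time
	count CountFunc
	items map[string]cachedCount
}

func NewMsg24hCache(ttl time.Duration, count CountFunc) *Msg24hCache {
	return &Msg24hCache{
		ttl:   ttl,
		now:   time.Now,
		count: count,
		items: make(map[string]cachedCount),
	}
}

// Get returns the cached count while it is fresh, otherwise re-runs the query.
func (c *Msg24hCache) Get(agentID string) (int, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.items[agentID]; ok && c.now().Sub(e.fetched) < c.ttl {
		return e.value, nil
	}
	n, err := c.count(agentID)
	if err != nil {
		return 0, err // errors are not cached, so the next call retries
	}
	c.items[agentID] = cachedCount{value: n, fetched: c.now()}
	return n, nil
}

func main() {
	calls := 0
	cache := NewMsg24hCache(30*time.Second, func(agentID string) (int, error) {
		calls++
		return 7, nil
	})
	n1, _ := cache.Get("test-bot")
	n2, _ := cache.Get("test-bot")
	fmt.Println(n1, n2, calls) // 7 7 1
}
```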
**4. Endpoint `POST /agents/{id}/clear_memory`**

- Stop the agent (if running)
- Open the agent's memory.db
- `DELETE FROM messages` + `DELETE FROM facts`
- Optionally start it again if it was running (via the optional `?restart=true` query parameter)
- Return `{status:"cleared", messages_deleted:N, facts_deleted:M}`

**5. Endpoint `POST /agents/{id}/delete_cache`**

- Stop the agent (if running)
- Delete the `agents/<id>/data/crypto/` directory (E2EE cache; the agent re-initializes it on next start)
- Delete `agents/<id>/data/cache/*` if it exists
- Return `{status:"cleared", paths_deleted:[...]}`
- Optionally start it again if it was running (`?restart=true`)

NOTE: delete_cache forces E2EE re-verification. The agent must re-authenticate via the SSSS recovery key on next start. Document this.

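The filesystem part of delete_cache might be sketched as below, including a pre-flight check that refuses to delete the crypto store when the recovery-key environment variable is missing. The function name `deleteAgentCache` is an assumption, and the env var name is passed in as a parameter because the issue does not spell out how the `<NORM>` suffix is derived; `DEMO_RECOVERY_KEY` and `demoDeleteCache` exist only for the demo.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// deleteAgentCache removes <dataDir>/crypto and every entry under <dataDir>/cache.
// It returns the paths it actually deleted, for the paths_deleted response field.
func deleteAgentCache(dataDir, recoveryKeyEnv string) ([]string, error) {
	// Pre-flight: without the recovery key the agent could not re-verify afterwards.
	if os.Getenv(recoveryKeyEnv) == "" {
		return nil, fmt.Errorf("refusing to delete crypto store: %s is not set", recoveryKeyEnv)
	}
	var deleted []string
	cryptoDir := filepath.Join(dataDir, "crypto")
	if _, err := os.Stat(cryptoDir); err == nil {
		if err := os.RemoveAll(cryptoDir); err != nil {
			return deleted, err
		}
		deleted = append(deleted, cryptoDir)
	}
	cacheDir := filepath.Join(dataDir, "cache")
	entries, err := os.ReadDir(cacheDir)
	if err != nil && !errors.Is(err, os.ErrNotExist) {
		return deleted, err // a missing cache dir is fine, anything else is not
	}
	for _, e := range entries {
		p := filepath.Join(cacheDir, e.Name())
		if err := os.RemoveAll(p); err != nil {
			return deleted, err
		}
		deleted = append(deleted, p)
	}
	return deleted, nil
}

// demoDeleteCache builds a throwaway data dir and runs deleteAgentCache on it.
func demoDeleteCache(keySet bool) (int, error) {
	const envName = "DEMO_RECOVERY_KEY" // hypothetical variable name
	if keySet {
		os.Setenv(envName, "placeholder")
	} else {
		os.Unsetenv(envName)
	}
	dir, err := os.MkdirTemp("", "agent-data-")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	if err := os.MkdirAll(filepath.Join(dir, "crypto"), 0o755); err != nil {
		return 0, err
	}
	if err := os.MkdirAll(filepath.Join(dir, "cache"), 0o755); err != nil {
		return 0, err
	}
	if err := os.WriteFile(filepath.Join(dir, "cache", "blob"), []byte("x"), 0o644); err != nil {
		return 0, err
	}
	deleted, err := deleteAgentCache(dir, envName)
	return len(deleted), err
}

func main() {
	n, err := demoDeleteCache(true)
	fmt.Println(n, err) // 2 <nil>
}
```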
### Frontend (`projects/element_agents/apps/agents_dashboard/`)

**1. Migrate to `data_table_cpp_viz`**

Today main.cpp uses `ImGui::BeginTable` inline. Replace it with `data_table::Table` from the registry (function `data_table_cpp_viz`). Add it to `app.md::uses_functions`. Verify via `fn doctor cpp-apps` that the app goes from `CANDIDATE` to clean.

**2. Table columns:**
- id
- status icon (running=green, stopped=gray, disabled=yellow, crashed=red)
- uptime (humanized via `human_duration_secs`)
- msg/24h (real number, NOT instances)
- actions (5 grouped buttons):
  - `▶ Start` (disabled if running)
  - `⏹ Stop` (disabled if !running)
  - `↻ Restart`
  - `🧠 Clear Memory` (confirmation modal)
  - `🗑 Delete Cache` (confirmation modal)

**3. Sort + filter:** keep both via the data_table_cpp_viz API.

### E2E (`tests/`)

Add 7 new tests:
- `test_control_roundtrip`: stop → poll → start → poll → restart → poll. Uses `test-bot`.
- `test_clear_memory`: POST clear_memory, then verify COUNT(*) FROM messages == 0.
- `test_delete_cache`: POST delete_cache, then verify crypto/ was deleted.
- `test_uptime_field_present`: the /agents response includes the uptime_seconds key.
- `test_msg_24h_field_present`: the /agents response includes the messages_24h key.
- `test_unified_stop_does_not_kill_launcher`: after stopping one agent, the others keep running.
- `test_clear_memory_requires_apikey`: no Bearer token → 401.

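The "poll" steps in `test_control_roundtrip` amount to retrying a status check until it succeeds or a deadline passes. The real tests are pytest; the sketch below shows the same logic in Go only to stay consistent with the backend examples. `pollUntil` and the fake `stopped` check are hypothetical; an actual test would issue `GET /agents/test-bot` inside the check and compare `running`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pollUntil calls check every interval until it reports true,
// returns an error, or the timeout elapses.
func pollUntil(timeout, interval time.Duration, check func() (bool, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		ok, err := check()
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("condition not met before timeout")
		}
		time.Sleep(interval)
	}
}

func main() {
	// Fake status source: the condition becomes true on the third check.
	checks := 0
	stopped := func() (bool, error) {
		checks++
		return checks >= 3, nil
	}
	err := pollUntil(2*time.Second, 10*time.Millisecond, stopped)
	fmt.Println(err, checks) // <nil> 3
}
```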
## Tasks

### Phase A: Backend (agents_and_robots)

1. Add `unifiedCancels map[string]context.CancelFunc` + `startedAt map[string]time.Time` + a mutex to `shell/process.Manager`.
2. Hook into the `launcher` runtime to register/unregister cancels when each agent goroutine starts/stops.
3. Implement `StopUnifiedAgent`, `StartUnifiedAgent`, `RestartUnifiedAgent` (Stop+Start).
4. Refactor the `handleStartAgent/Stop/Restart` handlers to autodetect unified vs multi.
5. Add `uptime_seconds` and `messages_24h` to `AgentResponse`. Implement the 24h query with a 30s cache.
6. Implement the `handleClearMemory` and `handleDeleteCache` handlers.
7. Add the routes in `server.go`.
8. Go unit tests in `internal/api/*_test.go`.

### Phase B: Frontend (agents_dashboard)

1. Change `parse_agents` to read `uptime_seconds` and `messages_24h` from the backend.
2. Migrate the table to `data_table_cpp_viz`. Keep filter + sort.
3. Add 5 buttons per row (Start/Stop/Restart/Clear/Delete).
4. Confirmation modal for Clear/Delete.
5. Update app.md::uses_functions with `data_table_cpp_viz`.

### Phase C: E2E + verify

1. Add 7 pytest tests.
2. Run all e2e tests from the registry venv. >=20 tests pass.
3. Rebuild the .exe + redeploy to Windows.
4. Visual confirmation: buttons, uptime, msg_24h.

## Acceptance

- [ ] All 17 DoD items green (cmd + screenshots).
- [ ] >=20 e2e tests passing.
- [ ] C++ app deployed to Windows Desktop, with visible buttons + working roundtrip.
- [ ] Backend unit tests pass.
- [ ] No regression: existing 0128 + 0129 functionality intact (the v0.1 curl smoke test stays green).

## Human DoD

- **Where**: Windows Desktop → agents_dashboard.exe → Agents table.
- **Latency**: stop → running=false reflected in the UI within 2s (via SSE status diff). A 30s refresh for msg/24h is acceptable.
- **Onboarding**: the tooltip on the "Clear Memory" button explains that it deletes messages; the "Delete Cache" tooltip explains that the agent will have to re-authenticate via SSSS when it starts again.

## Risks

- The unified-mode Manager refactor touches the launcher lifecycle (step ~7 of the create_agent pipeline). Existing tests must keep passing.
- delete_cache deletes the crypto store; the agent must be able to re-verify via the env var `SSSS_RECOVERY_KEY_<NORM>`. If that env var is not set, the agent is left in a degraded state. Validate before deleting.
- data_table_cpp_viz may have API limits that inline ImGui does not have (custom sort, alignment). Verify before migrating.