fn_registry/dev/issues/completed/0128-agents-and-robots-http-api-sse.md
egutierrez cc1e88fe55 done: 0128 + 0129 — agents_and_robots HTTP API + agents_dashboard C++ ImGui
Both issues delivered end-to-end:

0128 (backend, merged via dataforge/agents_and_robots/pulls/1):
- HTTP daemon in cmd/launcher with apikey Bearer auth + SSE
- LIVE at https://agents.organic-machine.com via Coolify Traefik + LE cert
- systemd Restart=always
- Unified status autodetect fix applied

0129 (frontend, merged via dataforge/agents_dashboard/pulls/1):
- C++ ImGui app in projects/element_agents/apps/agents_dashboard
- 4 panels: Connection / Agents / Logs / Status
- secret_store_cpp_infra new function (DPAPI Windows / XOR Linux)
- Deployed to Windows Desktop, App Hub card visible

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 21:58:05 +02:00


---
id: "0128"
title: "agents_and_robots: HTTP API + SSE + apikey + TLS subdomain"
status: pendiente
type: feature
domain:
- agents
- infra
- deploy
scope: app
priority: alta
depends: []
blocks:
- "0129"
related: []
created: 2026-05-22
updated: 2026-05-22
tags: [agents_and_robots, http, sse, apikey, traefik, systemd]
dod_evidence_schema:
- id: build_ok
  kind: cmd
  expected: "cd projects/element_agents/apps/agents_and_robots && go build -tags goolm ./cmd/launcher → exit 0"
  required: true
- id: api_list_authorized
  kind: cmd
  expected: 'curl -fsS -H "Authorization: Bearer $AGENTS_API_KEY" https://agents.organic-machine.com/agents returns JSON with N>=7 agents'
  required: true
- id: api_list_unauthorized_401
  kind: cmd
  expected: "curl -s -o /dev/null -w '%{http_code}' https://agents.organic-machine.com/agents == 401"
  required: true
- id: api_start_stop_roundtrip
  kind: cmd
  expected: "POST /agents/test-bot/stop → POST /agents/test-bot/start: status running confirmed via GET /agents/test-bot after 2s"
  required: true
- id: sse_logs_streaming
  kind: cmd
  expected: 'curl -N -H "Authorization: Bearer $KEY" https://agents.organic-machine.com/sse/agents/assistant-bot/logs delivers >=1 line within 5s while the agent is active'
  required: true
- id: sse_status_broadcast
  kind: cmd
  expected: "curl -N /sse/status receives an event {agent_id, old_status, new_status} after a manual stop/start"
  required: true
- id: systemd_active
  kind: cmd
  expected: "ssh organic-machine.com 'systemctl is-active agents_and_robots.service' == active"
  required: true
- id: traefik_route
  kind: url
  expected: "agents.organic-machine.com resolves and returns a valid LE cert (curl -vI shows subject CN=agents.organic-machine.com)"
  required: true
- id: app_md_drift_fixed
  kind: cmd
  expected: "fn doctor services-spec apps/element_agents/apps/agents_and_robots reports OK (no runtime/systemd drift)"
  required: true
---
# 0128 — agents_and_robots HTTP API + SSE + apikey + TLS
## Context
Today `agents_and_robots` exposes control only through the local `agentctl` CLI (filesystem-based, `shell/process.Manager`). There is no way to manage agents remotely.
We need a secure HTTP backend so that a local C++ frontend (issue 0129) can list agents, start/stop/restart them, and stream logs and status live.
## Decision
**Integrate the HTTP daemon INSIDE `cmd/launcher`** as a goroutine. It shares `process.Manager`, access to `shell/memory/*.db`, and the Matrix clients. A single process, with no drift between daemon and supervisor.
**Auth:** `Authorization: Bearer <AGENTS_API_KEY>`, checked with `subtle.ConstantTimeCompare`. The key is 32 random bytes, hex-encoded, stored in `.env` (`AGENTS_API_KEY`). A missing header or an invalid key returns 401.
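A minimal sketch of the Bearer check described above, using only the standard library. The names `requireBearer`, `newDemoHandler`, and `statusFor` are hypothetical and do not come from the issue. Hashing both values before comparing is an added assumption: `subtle.ConstantTimeCompare` returns early when the lengths differ, so comparing fixed-size digests avoids leaking the key length.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requireBearer (hypothetical name) rejects any request whose Authorization
// header does not carry "Bearer <apiKey>". An empty apiKey rejects everything.
func requireBearer(apiKey string, next http.Handler) http.Handler {
	const prefix = "Bearer "
	want := sha256.Sum256([]byte(apiKey))
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		header := r.Header.Get("Authorization")
		got := sha256.Sum256([]byte(strings.TrimPrefix(header, prefix)))
		if apiKey == "" || !strings.HasPrefix(header, prefix) ||
			subtle.ConstantTimeCompare(got[:], want[:]) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newDemoHandler wraps a stand-in for the real /agents handler.
func newDemoHandler(apiKey string) http.Handler {
	agents := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "[]")
	})
	return requireBearer(apiKey, agents)
}

// statusFor sends one in-process request and returns the HTTP status code.
func statusFor(h http.Handler, authHeader string) int {
	req := httptest.NewRequest(http.MethodGet, "/agents", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := newDemoHandler("s3cret")
	fmt.Println(statusFor(h, ""), statusFor(h, "Bearer nope"), statusFor(h, "Bearer s3cret"))
}
```

In a real server, `/health` would be registered outside this wrapper so that liveness checks work without a key.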
**TLS:** Traefik on the organic-machine.com VPS with an automatically issued LE cert. Subdomain `agents.organic-machine.com` (new DNS A record → VPS IP). Traefik route `agents.organic-machine.com → 127.0.0.1:8487`.
**SSE with in-memory pub/sub.** NATS is OFF for now (with 1 local client, a broker is pure overhead). Document a TODO in app.md to add a bus if a 2nd consumer arrives.
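One possible shape for the in-memory pub/sub, sketched under assumptions: the `Event` fields, the buffer size of 16, and the drop-on-full policy are illustrative choices, and `Publish` returns a delivery count here only to make the behavior observable. Unsubscribe is omitted for brevity, although a real SSE server would need it to release channels when clients disconnect.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical payload type for this sketch.
type Event struct {
	Topic string
	Data  string
}

// Bus fans events out to subscribers, keyed by topic.
type Bus struct {
	mu   sync.Mutex
	subs map[string][]chan Event
}

// NewBus returns an empty bus ready for use.
func NewBus() *Bus {
	return &Bus{subs: make(map[string][]chan Event)}
}

// Subscribe registers a buffered channel that receives events for topic.
func (b *Bus) Subscribe(topic string) <-chan Event {
	ch := make(chan Event, 16)
	b.mu.Lock()
	b.subs[topic] = append(b.subs[topic], ch)
	b.mu.Unlock()
	return ch
}

// Publish sends ev to every subscriber of topic and reports how many
// received it. A subscriber with a full buffer misses the event, so one
// slow client cannot block the publisher.
func (b *Bus) Publish(topic string, ev Event) int {
	b.mu.Lock()
	defer b.mu.Unlock()
	delivered := 0
	for _, ch := range b.subs[topic] {
		select {
		case ch <- ev:
			delivered++
		default: // buffer full: drop for this subscriber
		}
	}
	return delivered
}

func main() {
	bus := NewBus()
	status := bus.Subscribe("status")
	n := bus.Publish("status", Event{Topic: "status", Data: "test-bot: running -> stopped"})
	fmt.Println(n, (<-status).Data)
}
```

Dropping on a full buffer favors publisher liveness over delivery guarantees, which seems acceptable for a single local dashboard but would need revisiting if a second consumer appears.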
## Scope v0.1 (lean)
| Verb | Path | Wrap |
|---|---|---|
| GET | `/health` | 200 OK without auth (liveness) |
| GET | `/agents` | `Scan` + `StatusAll` + `msg_count_24h` (query `shell/memory/*.db`) |
| GET | `/agents/{id}` | detail + config + `LogTail(200)` |
| POST | `/agents/{id}/start` | `Manager.Start` |
| POST | `/agents/{id}/stop` | `Manager.Stop` |
| POST | `/agents/{id}/restart` | Stop+Start, waiting for health |
| GET | `/agents/{id}/logs?n=200` | `LogTail` snapshot |
**SSE:**
- `GET /sse/status`: broadcasts status changes (poll every 2s + diff)
- `GET /sse/agents/{id}/logs`: `tail -f` of the logfile, emits line events
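For orientation, an SSE endpoint in stdlib `net/http` comes down to setting `Content-Type: text/event-stream`, writing `data:` frames, and flushing after each one. The sketch below assumes the log lines arrive on a channel. `serveSSE` and `renderSSE` are hypothetical names, and each line is assumed to contain no embedded newline.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// serveSSE streams every string received on lines as one SSE "data:" event
// until the channel is closed or the client disconnects.
func serveSSE(w http.ResponseWriter, r *http.Request, lines <-chan string) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	for {
		select {
		case <-r.Context().Done(): // client went away
			return
		case line, open := <-lines:
			if !open {
				return
			}
			// A blank line terminates each event frame.
			fmt.Fprintf(w, "data: %s\n\n", line)
			flusher.Flush()
		}
	}
}

// renderSSE runs serveSSE in-process and returns the raw response body.
func renderSSE(in ...string) string {
	ch := make(chan string, len(in))
	for _, s := range in {
		ch <- s
	}
	close(ch)
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/sse/agents/test-bot/logs", nil)
	serveSSE(rec, req, ch)
	return rec.Body.String()
}

func main() {
	fmt.Print(renderSSE("agent started", "connected"))
}
```

Flushing after every event is what keeps log lag low. Without it, the response may sit in a buffer until enough data accumulates.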
**Out of scope for v0.1** (deferred to v0.2):
- POST `/agents/{id}/message` (send Matrix message)
- PUT `/agents/{id}/config` (config edit)
- SSE messages stream
## Tasks
1. **New package `internal/api`** with an HTTP server (stdlib `net/http`, no gin/echo).
   - `api.New(mgr *process.Manager, apiKey string, port int) *Server`
   - `Server.Run(ctx) error` starts the server and blocks until ctx is done.
   - Middleware: log + auth + recover.
2. **REST handlers** on top of `process.Manager`. Unit tests with a mock manager.
3. **In-memory SSE pub/sub** (`internal/api/pubsub.go`):
   - `Bus` with `Subscribe(topic) <-chan event` + `Publish(topic, event)`.
   - Poller goroutine that calls `StatusAll` every 2s and publishes diffs.
   - One tail goroutine per logfile (`file_tail_follow`: search the registry or create it).
4. **Integrate into the launcher:** `cmd/launcher/main.go` starts `api.Server` in a goroutine if `--api-port > 0`.
5. **Create systemd unit** `/etc/systemd/system/agents_and_robots.service` with `Restart=always`, `EnvironmentFile=.env`, `ExecStart=.../bin/launcher --log-level info --api-port 8487`.
6. **Traefik route + DNS:**
   - Add `agents.organic-machine.com` to DNS (A record).
   - Add the Traefik config (label in the stack's docker-compose or file provider) pointing to `127.0.0.1:8487`.
7. **Fix app.md drift:** `runtime: systemd-system` is now true. Verify with `fn doctor services-spec`.
8. **Tests:**
   - Go: package `internal/api` with httptest.
   - e2e: `e2e_checks` in `app.md` with a curl smoke test.
9. **Deploy:**
   - `rsync_deploy_bash_infra` or a new `deploy_server` target.
   - Generate `AGENTS_API_KEY` with `openssl rand -hex 32` and write it to the remote `.env`.
   - `systemctl enable --now agents_and_robots.service`.
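The status poller in task 3 amounts to comparing two consecutive snapshots and publishing one event per change. The sketch below assumes a snapshot can be reduced to a map from agent ID to status string. The real return type of `StatusAll` is not shown in this issue, and `diffStatuses` is a hypothetical name. The event fields follow the `{agent_id, old_status, new_status}` shape from the DoD.

```go
package main

import (
	"fmt"
	"sort"
)

// StatusEvent mirrors the {agent_id, old_status, new_status} event shape.
type StatusEvent struct {
	AgentID   string `json:"agent_id"`
	OldStatus string `json:"old_status"`
	NewStatus string `json:"new_status"`
}

// diffStatuses returns one event per agent whose status changed between
// prev and curr. New agents have an empty OldStatus, vanished agents an
// empty NewStatus. Events are sorted by agent ID for stable output.
func diffStatuses(prev, curr map[string]string) []StatusEvent {
	var events []StatusEvent
	for id, now := range curr {
		if old, seen := prev[id]; !seen || old != now {
			events = append(events, StatusEvent{AgentID: id, OldStatus: old, NewStatus: now})
		}
	}
	for id, old := range prev {
		if _, still := curr[id]; !still {
			events = append(events, StatusEvent{AgentID: id, OldStatus: old})
		}
	}
	sort.Slice(events, func(i, j int) bool { return events[i].AgentID < events[j].AgentID })
	return events
}

func main() {
	prev := map[string]string{"assistant-bot": "running", "test-bot": "running"}
	curr := map[string]string{"assistant-bot": "running", "test-bot": "stopped"}
	for _, ev := range diffStatuses(prev, curr) {
		fmt.Printf("%+v\n", ev)
	}
}
```

A poller goroutine would call this on a 2s ticker, keep `curr` as the next `prev`, and publish each returned event to the status topic.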
## Registry functions to use / propose
Search before coding:
- `mcp__registry__fn_search query="tail follow file" lang="go"`: does `file_tail_follow_go_infra` exist? If not, delegate to fn-constructor.
- `mcp__registry__fn_search query="http auth bearer" lang="go"`: auth middleware.
- `mcp__registry__fn_search query="sse server" lang="go"`: SSE helper.
- `systemd_generate_unit_go_infra` + `systemd_install_go_infra`: generate/install the unit.
## Acceptance
- [ ] `curl -fsS -H "Authorization: Bearer $KEY" https://agents.organic-machine.com/agents` returns the correct list.
- [ ] No header → 401. Invalid key → 401. Valid key → 200.
- [ ] Start/Stop/Restart change the real process state (verifiable with `ps`).
- [ ] SSE logs deliver lines within 1s of their appearing in the file.
- [ ] SSE status broadcasts after a manual stop/start.
- [ ] systemd unit is active and restarts after `kill -9`.
- [ ] `fn doctor services-spec` reports OK.
- [ ] Go tests pass.
## Human DoD
- **Where:** local terminal → `curl https://agents.organic-machine.com/agents`. SSE verifiable with `curl -N`.
- **Latency:** SSE log lag < 1s. REST list < 200ms.
- **Onboarding:** agents_and_robots README updated with an "HTTP API" section + curl examples.
## Risks
- DNS propagation can take a while (configure a low TTL).
- Traefik on this VPS: check whether it is managed by Coolify or runs standalone, and add the route in the right place.
- The current `LogTail` only reads a snapshot; SSE needs a real `tail -f`. If none exists in the registry, build it in a preliminary round.
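If the registry turns out to have no follow helper, a polling approach is one simple option. The sketch below is not the registry's `file_tail_follow`. `readNewLines` and `demoTail` are hypothetical names, and the design assumes the caller invokes it on a short ticker and keeps the returned offset between calls. A trailing line without a newline is held back until it is complete, and a file shorter than the saved offset is treated as truncated and reread from the start.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// readNewLines returns the complete lines appended to path since offset,
// plus the offset to resume from on the next call.
func readNewLines(path string, offset int64) ([]string, int64, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, offset, err
	}
	defer f.Close()
	// If the file shrank (truncation or rotation), start over.
	if st, err := f.Stat(); err == nil && st.Size() < offset {
		offset = 0
	}
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		return nil, offset, err
	}
	r := bufio.NewReader(f)
	var lines []string
	for {
		s, err := r.ReadString('\n')
		if err == io.EOF {
			// Any partial trailing line is left for the next call.
			return lines, offset, nil
		}
		if err != nil {
			return lines, offset, err
		}
		offset += int64(len(s))
		lines = append(lines, strings.TrimRight(s, "\r\n"))
	}
}

// demoTail writes to a temp file in two steps and tails it after each.
func demoTail() ([]string, error) {
	f, err := os.CreateTemp("", "agent-*.log")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	if _, err := f.WriteString("one\ntwo\npar"); err != nil {
		return nil, err
	}
	first, off, err := readNewLines(f.Name(), 0)
	if err != nil {
		return nil, err
	}
	if _, err := f.WriteString("tial\n"); err != nil {
		return nil, err
	}
	second, _, err := readNewLines(f.Name(), off)
	return append(first, second...), err
}

func main() {
	lines, err := demoTail()
	fmt.Println(lines, err)
}
```

Polling every few hundred milliseconds would stay inside the 1s lag target. An inotify-based watcher would lower latency further at the cost of a dependency or platform-specific code.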