fn_registry/dev/issues/0113-agent-runner-api.md
egutierrez b9716a7cd6 chore: snapshot previous WIP + flow 0008 + 7 sub-issues (0112-0119)
Snapshot of WIP accumulated in previous sessions, taken before merging wave 1
of flow 0008 (kanban_cpp + agent_runner_api + DoD schema).

Includes:
- dev/flows/0008-kanban-cpp-and-agent-workflows.md
- dev/issues/0112-0119*.md (7 sub-issues)
- Previous WIP in cmd/fn/doctor.go, registry/*, modules/, cpp/, etc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 18:17:08 +02:00


id: 0113
title: Service agent_runner_api: workflow orchestrator with worktrees + DoD
status: pendiente
type: app
domain: agents, workflows, apps-infra
scope: app
priority: alta
depends / blocks / related: 0115, 0116, 0117, 0118, 0008, 0069
created: 2026-05-18
updated: 2026-05-18
tags: agents, service, worktrees, dod, claude-headless
flow: 0008

0113 — Service agent_runner_api

Problem

Today Claude is launched from three places:

  1. apps/skill_tree/main.cpp::spawn_claude_terminal: opens wt.exe with claude --dangerously-skip-permissions. It ends with no traceability.
  2. parallel-fix-issues skill: parallel worktrees, but stateless.
  3. fn-orquestador (issue 0069): autonomous loop inside Claude Code.

None of them persists runs, evidence, or DoD. There is no way to know which workflows are alive across apps.

Decision

New Go service at apps/agent_runner_api/, port :8486, tagged service. It is the single source of truth for:

  • declared workflows (prompt templates + DoD schema)
  • active runs (worktree + Claude subprocess + status)
  • DoD evidence (path/url/log/cmd output + validated_by)
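
As an illustration of how DoD status could gate a run, here is a minimal Go sketch. The status values are the ones listed for dod_items in the schema; the DoDItem struct, the helper names, and the rule that only required items block a merge are assumptions for this example, not decisions recorded in the issue.

```go
package main

import "fmt"

// DoDStatus mirrors the status values listed for dod_items.
type DoDStatus string

const (
	StatusPending   DoDStatus = "pending"
	StatusDone      DoDStatus = "done"
	StatusValidated DoDStatus = "validated"
	StatusFailed    DoDStatus = "failed"
)

// DoDItem is a hypothetical in-memory view of one dod_items row.
type DoDItem struct {
	Key      string
	Required bool
	Status   DoDStatus
}

// blockingItems returns the keys of required items that are not yet validated.
// Assumption: optional items never block a merge.
func blockingItems(items []DoDItem) []string {
	var blocked []string
	for _, it := range items {
		if it.Required && it.Status != StatusValidated {
			blocked = append(blocked, it.Key)
		}
	}
	return blocked
}

// canMerge reports whether no required item is still blocking.
func canMerge(items []DoDItem) bool {
	return len(blockingItems(items)) == 0
}

func main() {
	items := []DoDItem{
		{Key: "tests_pass", Required: true, Status: StatusValidated},
		{Key: "screenshot", Required: true, Status: StatusDone},
		{Key: "notes", Required: false, Status: StatusPending},
	}
	fmt.Println(canMerge(items), blockingItems(items)) // prints: false [screenshot]
}
```

An item in status done has evidence attached but still waits for human validation, which is why it blocks in this sketch.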

Minimum endpoints:

  • POST /api/runs: creates a worktree and launches headless claude. Body: {issue_id|card_id, mode, kanban_app}.
  • GET /api/runs: lists runs (filters: status/app/since).
  • GET /api/runs/:id: run detail.
  • GET /api/runs/:id/sse: progress stream.
  • POST /api/runs/:id/evidence: the agent attaches evidence.
  • POST /api/runs/:id/evidence/:eid/validate: a human approves.
  • POST /api/runs/:id/merge: TBD merge (all items validated).
  • POST /api/runs/:id/abort: kills the subprocess and removes the worktree.
  • GET /api/health: 200 OK.

Schema agent_runs.db

Migrations in apps/agent_runner_api/migrations/:

  • 001_workflows.sql: templates: id, name, prompt_template, dod_schema_json, created_at.
  • 002_runs.sql: id, workflow_id, issue_id, card_id, kanban_app, branch, worktree_path, status, started_at, finished_at, agent_pid, agent_log_path.
  • 003_worktrees.sql: id, run_id, path, branch, created_at, removed_at.
  • 004_dod_items.sql: one row per declared item: id, run_id, item_key, kind, expected, required, status (pending|done|validated|failed).
  • 005_dod_evidence.sql: one row per attached piece of evidence: id, dod_item_id, kind, payload_path, payload_url, payload_text, attached_at, validated_at, validated_by.

Applied via embed.FS + applyMigrations() at startup.

Frontmatter app.md (service)

tags: [service, agents, go]
service:
  port: 8486
  health_endpoint: /api/health
  health_timeout_s: 3
  systemd_unit: agent_runner_api.service
  systemd_scope: user
  restart_policy: always
  runtime: systemd-user
  pc_targets:
    - aurgi-pc
    - home-wsl
  is_local_only: true

Acceptance criteria

  • apps/agent_runner_api/ Go scaffold (main.go, db.go, handlers.go, sse.go, agent_spawn.go).
  • Migrations 001-005 versioned and applied at startup (idempotent).
  • The endpoints above implemented, with tests in *_test.go.
  • systemd unit agent_runner_api.service with Restart=always.
  • app.md with trio + complete service: block (issue 0105).
  • fn doctor services-spec validates the block.
  • Smoke test: POST /api/runs with a dummy issue creates a real worktree in /tmp/wt-test-<id>, persists a row in agent_runs, and launches an echo subprocess (no real claude in tests).
  • Cleanup on abort: subprocess killed + git worktree remove --force + row marked aborted.
  • e2e_checks: build, migration apply, health, dummy smoke run, cleanup.
  • Documented in docs/capabilities/agents.md (capability group agents, see 0114).
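
The smoke test and abort criteria can be pictured with the sketch below: the subprocess is started, control returns at once, and a goroutine records how it ended. The Run type, startRun, and the status names other than aborted are assumptions for illustration. The example runs echo, as the smoke test does, and assumes echo is on PATH.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"sync"
)

// Run is a hypothetical in-memory handle for one agent subprocess.
type Run struct {
	ID     string
	cancel context.CancelFunc
	done   chan struct{}
	mu     sync.Mutex
	status string
}

// Status returns the current status under the lock.
func (r *Run) Status() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.status
}

// Wait blocks until the monitoring goroutine has recorded the final status.
func (r *Run) Wait() { <-r.done }

// Abort kills the subprocess through the context and waits for the monitor.
// A real service would then run git worktree remove --force and update the row.
func (r *Run) Abort() {
	r.cancel()
	<-r.done
}

// startRun starts the command and returns immediately, so an HTTP handler
// could answer with the run ID while the process keeps running.
func startRun(id, name string, args ...string) (*Run, error) {
	ctx, cancel := context.WithCancel(context.Background())
	cmd := exec.CommandContext(ctx, name, args...)
	if err := cmd.Start(); err != nil {
		cancel()
		return nil, err
	}
	r := &Run{ID: id, cancel: cancel, done: make(chan struct{}), status: "running"}
	go func() {
		defer close(r.done)
		defer cancel() // release context resources after a normal exit
		err := cmd.Wait()
		r.mu.Lock()
		defer r.mu.Unlock()
		switch {
		case ctx.Err() != nil:
			r.status = "aborted" // the context was cancelled before the process exited
		case err != nil:
			r.status = "failed"
		default:
			r.status = "finished"
		}
	}()
	return r, nil
}

func main() {
	run, err := startRun("demo", "echo", "hello")
	if err != nil {
		fmt.Println("spawn failed:", err)
		return
	}
	run.Wait()
	fmt.Println(run.ID, run.Status())
}
```

Cancelling the context makes exec.CommandContext kill the process, so Abort returns only after the monitor goroutine has seen the exit and set the status.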

Gotchas

  • git worktree add fails if the branch already exists. Hard-reset first (same pattern as autonomous_loop.md).
  • The worktree and the main repo share .git/hooks/. A pre-commit hook can block commits. Allow --no-verify ONLY if events_json[].decision="skip_hook" is documented.
  • claude --headless needs the correct PATH (~/.local/bin). The systemd service runs with the user env: verify Environment=PATH=....
  • The Claude subprocess can run for hours. Do NOT block the HTTP handler: spawn asynchronously, return run_id immediately, and monitor the PID in a goroutine.
  • SSE: ImGui clients (kanban_cpp, skill_tree) must reconnect with Last-Event-ID.
  • Protected paths (dev/autonomous_protected_paths.json) apply here as well. Reuse the fn-orquestador logic.
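
For the SSE gotcha, the server side of a Last-Event-ID resume might look like the sketch below. The Event type, the in-memory history slice, and the replay helper are assumptions; a real handler would read progress from agent_runs.db and keep the connection open for new events.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// Event is a hypothetical progress event with a monotonically increasing ID.
type Event struct {
	ID   int
	Data string
}

// sseHandler replays every stored event whose ID is greater than the
// Last-Event-ID header sent by a reconnecting client.
func sseHandler(history []Event) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		// A missing or invalid header parses as 0, which replays everything.
		last, _ := strconv.Atoi(r.Header.Get("Last-Event-ID"))
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		for _, ev := range history {
			if ev.ID > last {
				fmt.Fprintf(w, "id: %d\ndata: %s\n\n", ev.ID, ev.Data)
			}
		}
		if f, ok := w.(http.Flusher); ok {
			f.Flush() // push buffered events to the client right away
		}
		// A live stream would stay open here and write new events as they arrive.
	}
}

// replay performs one in-memory request and returns the raw SSE body.
func replay(history []Event, lastEventID string) string {
	req := httptest.NewRequest("GET", "/api/runs/1/sse", nil)
	if lastEventID != "" {
		req.Header.Set("Last-Event-ID", lastEventID)
	}
	rec := httptest.NewRecorder()
	sseHandler(history)(rec, req)
	return rec.Body.String()
}

func main() {
	history := []Event{{1, "worktree created"}, {2, "agent started"}, {3, "evidence attached"}}
	fmt.Print(replay(history, "2")) // only event 3 is replayed
}
```

Browser EventSource clients send this header automatically on reconnect. The ImGui clients named above would have to store the last id: value they saw and set the header themselves.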

Out of scope

  • A UI of its own for the service (it is a pure backend; the UIs are kanban_cpp + skill_tree).
  • Auth / auth tokens (local-only for now; add in a separate issue if the service is exposed beyond localhost).
  • Gitea webhook to auto-trigger from commits.
  • Cron schedule for recurring workflows.