| id | title | status | type | domain | scope | priority | depends | blocks | related | created | updated | tags |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0105 | Standardize the `service:` block in app.md + indexer + `fn doctor services-spec` | in-progress | feature | | multi-app | alta | | | | 2026-05-17 | 2026-05-17 | |
0105: Standardize `service:` in app.md
Problem
Diagnosis (2026-05-17): sqlite_api was down for 20 h with no alert. Cause: nobody monitors it. Root cause: there is no uniform way to know "this app MUST be running on this PC, on this port, with this health endpoint".
Today:
- 10 apps with `tag: service` in `registry.db`.
- 8/10 report `systemctl active=inactive` according to `fn doctor services` (some because they only live on remote hosts, others because they genuinely died).
- `port` is discovered from `--port` in the `ExecStart` of a unit file that may or may not exist locally.
- `health_endpoint` is declared only in `deploy_server/operations.db`, and only for 1 target (registry_api).
- `systemd_unit` is assumed to be `<name>.service`; this is not documented.
- `pc_targets` (which PCs the app MUST run on) does not exist anywhere.
Consequence: it is impossible to write a monitor that reconciles "expected vs actual" without hardcoding each app.
Decision
Add a `service:` block to the app.md frontmatter. It is optional in general and mandatory for apps with `tag: service`. The indexer parses and persists it. `fn doctor services-spec` audits it.
Block schema

    service:
      # HTTP endpoints (optional; stdio/daemon apps leave these null or omit them)
      port: 8484
      health_endpoint: /api/health     # GET path, 200 == healthy
      health_timeout_s: 3
      # systemd identity (when applicable)
      systemd_unit: sqlite_api.service # exact name
      systemd_scope: user              # user|system|null (docker-compose)
      restart_policy: always           # always|on-failure|none
      # Runtime strategy (extends systemd_scope to non-systemd cases)
      runtime: systemd-user            # systemd-user|systemd-system|docker-compose|stdio|manual
      # Where it MUST run; references pc_locations.pc_id
      pc_targets:
        - aurgi-pc
        - home-wsl
      # Flags
      is_local_only: false             # true => never monitored over SSH; always via a local mechanism

Rules:
- `port` is null if the app does not expose HTTP (stdio MCP, daemons without an API).
- `health_endpoint` is null if there is no HTTP; the monitor falls back to a process check (systemd active + port listening).
- `pc_targets` is a LIST of `pc_id` values from `pc_locations`. Empty => the app is not monitored.
- `runtime: docker-compose` => the monitor checks containers via `docker compose ps` over SSH to the target PC.
- `is_local_only: true` => the monitor only runs on the PC where the daemon runs (no SSH to the host itself is attempted).
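
The rules above can be read as a small decision function for the future monitor. The following is a minimal sketch, not the actual implementation: `ServiceSpec`, `Check`, `PlanChecks` and the `Remote` flag are hypothetical names, and it assumes that an `is_local_only` app is simply skipped by a monitor running on a different PC.

```go
package main

import "fmt"

// ServiceSpec mirrors part of the proposed service: block. Pointer fields
// model YAML null. These Go names are illustrative, not the indexer's types.
type ServiceSpec struct {
	Port           *int
	HealthEndpoint *string
	SystemdUnit    *string
	Runtime        string
	PCTargets      []string
	IsLocalOnly    bool
}

// Check describes one probe the monitor would run for one PC target.
type Check struct {
	PC     string
	Kind   string // "http", "process" or "compose"
	Remote bool   // true when the target is not the PC running the monitor
	Detail string
}

// PlanChecks applies the rules: empty pc_targets means no checks,
// docker-compose means a container check, a non-null port plus
// health_endpoint means an HTTP GET, anything else is a process check.
func PlanChecks(s ServiceSpec, localPC string) []Check {
	var checks []Check
	for _, pc := range s.PCTargets {
		if s.IsLocalOnly && pc != localPC {
			continue // assumption: only the monitor on that PC checks it
		}
		c := Check{PC: pc, Remote: pc != localPC}
		switch {
		case s.Runtime == "docker-compose":
			c.Kind, c.Detail = "compose", "docker compose ps"
		case s.Port != nil && s.HealthEndpoint != nil:
			c.Kind = "http"
			c.Detail = fmt.Sprintf("GET :%d%s", *s.Port, *s.HealthEndpoint)
		default:
			c.Kind, c.Detail = "process", "systemd active"
			if s.SystemdUnit != nil {
				c.Detail = "systemctl is-active " + *s.SystemdUnit
			}
		}
		checks = append(checks, c)
	}
	return checks
}

func main() {
	port, ep, unit := 8484, "/api/health", "sqlite_api.service"
	spec := ServiceSpec{
		Port: &port, HealthEndpoint: &ep, SystemdUnit: &unit,
		Runtime: "systemd-user", PCTargets: []string{"aurgi-pc", "home-wsl"},
	}
	for _, c := range PlanChecks(spec, "aurgi-pc") {
		fmt.Printf("%s kind=%s remote=%v %s\n", c.PC, c.Kind, c.Remote, c.Detail)
	}
}
```

Keeping the decision in one pure function means the monitor needs no per-app special cases: everything it needs comes from the declared block.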
Tasks
- Audit the 10 existing services (actual port, unit name, description)
- Edit the 10 app.md files with a realistic `service:` block
- Migration: add columns to the `apps` table (`port INTEGER`, `health_endpoint TEXT`, `health_timeout_s INTEGER`, `systemd_unit TEXT`, `systemd_scope TEXT`, `restart_policy TEXT`, `runtime TEXT`, `is_local_only INTEGER`)
- Migration: new table `service_targets (app_id TEXT, pc_id TEXT, role TEXT DEFAULT 'primary', PRIMARY KEY(app_id, pc_id))`
- Indexer: parse the `service:` block from the frontmatter and fill the columns + `service_targets`
- `fn doctor services-spec` (Go func + subcommand): lists apps with `tag: service` and an incomplete block. tabwriter output + `--json`
- Test: `fn index` on a fixture with a service block produces the correct rows
- Retroactive fix: `~/.config/systemd/user/sqlite_api.service` with `Restart=always` (not `on-failure`: TERM is not a failure)
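
One way to keep the two migrations safe to re-run is to plan the statements from the columns that already exist. This is a sketch under assumptions: `PlanMigration` and its input map are hypothetical, and the caller is expected to obtain the existing column names itself (for example from `PRAGMA table_info(apps)`). SQLite has no `ADD COLUMN IF NOT EXISTS`, so skipping existing columns is what makes a second run a no-op for the `ALTER` statements.

```go
package main

import "fmt"

// serviceColumns lists the columns named in the migration task, in order.
var serviceColumns = [][2]string{
	{"port", "INTEGER"},
	{"health_endpoint", "TEXT"},
	{"health_timeout_s", "INTEGER"},
	{"systemd_unit", "TEXT"},
	{"systemd_scope", "TEXT"},
	{"restart_policy", "TEXT"},
	{"runtime", "TEXT"},
	{"is_local_only", "INTEGER"},
}

// createServiceTargets uses IF NOT EXISTS, so it can always be emitted.
const createServiceTargets = `CREATE TABLE IF NOT EXISTS service_targets (
  app_id TEXT,
  pc_id TEXT,
  role TEXT DEFAULT 'primary',
  PRIMARY KEY (app_id, pc_id)
)`

// PlanMigration returns the statements still needed, given the column
// names already present on the apps table.
func PlanMigration(existing map[string]bool) []string {
	var stmts []string
	for _, c := range serviceColumns {
		if !existing[c[0]] {
			stmts = append(stmts,
				fmt.Sprintf("ALTER TABLE apps ADD COLUMN %s %s", c[0], c[1]))
		}
	}
	return append(stmts, createServiceTargets)
}

func main() {
	// Example: a registry.db where only "port" was added by an earlier run.
	for _, s := range PlanMigration(map[string]bool{"port": true}) {
		fmt.Println(s + ";")
	}
}
```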
Material: the 10 current apps
| App | dir | port | health | unit | scope | pc_targets | runtime |
|---|---|---|---|---|---|---|---|
| sqlite_api | projects/fn_monitoring/apps/sqlite_api | 8484 | /api/status | sqlite_api.service | user | aurgi-pc, home-wsl | systemd-user |
| dag_engine | apps/dag_engine | 8090 | /api/dags | dag_engine.service | user | aurgi-pc, home-wsl | systemd-user |
| call_monitor | projects/fn_monitoring/apps/call_monitor | null | null | call_monitor.service | user | aurgi-pc, home-wsl | systemd-user |
| kanban | apps/kanban | 8095 | /api/board | kanban.service | user | aurgi-pc | systemd-user |
| deploy_server | apps/deploy_server | 9090 | /api/health | deploy_server.service | user | aurgi-pc | systemd-user |
| registry_mcp | apps/registry_mcp | null | null | registry_mcp.service | user | aurgi-pc | stdio (manual) |
| registry_api | apps/registry_api | 8420 | /api/status | null | null | organic-machine.com | docker-compose |
| footprint_geo_stack | apps/footprint_geo_stack | 3000 | null | null | null | aurgi-pc | docker-compose |
| element_matrix_chat | projects/element_agents/apps/element_matrix_chat | null | null | null | null | organic-machine.com | docker-compose |
| agents_and_robots | projects/element_agents/apps/agents_and_robots | null | null | agents_and_robots.service | system | organic-machine.com | systemd-remote |
DoD
- 10 app.md files with a valid `service:` block (parseable, real values).
- `fn index` populates `apps.port`/... and `service_targets`.
- `fn doctor services-spec` reports `OK` for all 10.
- The migration applies idempotently to `registry.db` on aurgi-pc + home-wsl.
- `services_status_go_infra` extended to read data from the new schema (no hardcoded port discovery).
Blocks
- 0106: app `services_monitor` (UI + backend `services_api`). Needs `service_targets` + `apps.port`/`health_endpoint` populated.