fn_registry/apps/dag_engine/app.md

name: dag_engine
lang: go
domain: infra
version: 0.1.0
description: DAG execution engine for fn_registry: CLI + HTTP server + cron scheduler. Custom YAML schema with `function:` to invoke registry functions (`fn run <id>`) and `command:` for shell. History in SQLite. Official scheduler of the ecosystem.
tags: service, dag, workflow, scheduler, web, cron
uses_functions: dag_parse_go_core, dag_validate_go_core, dag_topo_sort_go_core, dag_resolve_env_go_core, parse_cron_expr_go_core, next_cron_time_go_core, cron_ticker_go_infra, find_go_core, process_spawn_go_infra, process_wait_go_infra
uses_types: dag_definition_go_core, dag_step_go_core, dag_validation_result_go_core, cron_schedule_go_core, process_handle_go_infra, process_result_go_infra, DagRun_go_infra, DagStepResult_go_infra
framework: net/http + vite + react
entry_point: main.go
dir_path: apps/dag_engine
service:
  port: 8090
  health_endpoint: /api/dags
  health_timeout_s: 3
  systemd_unit: dag_engine.service
  systemd_scope: user
  restart_policy: always
  runtime: systemd-user
  pc_targets: aurgi-pc, home-wsl
  is_local_only: false

Architecture

CLI + web server in a single binary:

dag-engine run <path.yaml>       # run a DAG from the terminal
dag-engine list [dir]            # list DAGs with schedule and status
dag-engine status [dag_name]     # execution history
dag-engine validate <path.yaml>  # validate without running
dag-engine server                # start HTTP + web frontend

Backend (Go)

  • net/http con ServeMux (Go 1.22+ pattern routing)
  • SQLite via go-sqlite3 for run history
  • Executor: parse -> validate -> topo_sort -> spawn/wait per level -> store
  • Scheduler: one cron_ticker per DAG that has a schedule
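
The "spawn/wait per level" stage of the executor pipeline implies that the topological sort groups steps into levels, where each level depends only on earlier ones. A minimal sketch of that grouping with Kahn's algorithm follows. The `Step` type and the `Levels` function are hypothetical stand-ins, not the actual `dag_topo_sort_go_core` signature.

```go
package main

import (
	"fmt"
	"sort"
)

// Step is a hypothetical, minimal stand-in for a DAG step: a name plus the
// names of the steps it depends on.
type Step struct {
	Name    string
	Depends []string
}

// Levels groups steps into execution levels using Kahn's algorithm: every
// step in level N depends only on steps in earlier levels, so the steps in
// one level could be spawned together and waited on before moving on.
func Levels(steps []Step) ([][]string, error) {
	indegree := make(map[string]int, len(steps))
	dependents := make(map[string][]string, len(steps))
	for _, s := range steps {
		indegree[s.Name] = 0
	}
	for _, s := range steps {
		for _, d := range s.Depends {
			if _, ok := indegree[d]; !ok {
				return nil, fmt.Errorf("step %q depends on unknown step %q", s.Name, d)
			}
			indegree[s.Name]++
			dependents[d] = append(dependents[d], s.Name)
		}
	}

	// Level 0: steps with no dependencies.
	var current []string
	for name, n := range indegree {
		if n == 0 {
			current = append(current, name)
		}
	}

	var levels [][]string
	placed := 0
	for len(current) > 0 {
		sort.Strings(current) // deterministic order within a level
		levels = append(levels, current)
		placed += len(current)
		var next []string
		for _, name := range current {
			for _, dep := range dependents[name] {
				indegree[dep]--
				if indegree[dep] == 0 {
					next = append(next, dep)
				}
			}
		}
		current = next
	}
	// Any step never reaching indegree 0 is part of a cycle.
	if placed != len(indegree) {
		return nil, fmt.Errorf("cycle detected: placed %d of %d steps", placed, len(indegree))
	}
	return levels, nil
}

func main() {
	levels, err := Levels([]Step{
		{Name: "vet"},
		{Name: "test", Depends: []string{"vet"}},
		{Name: "audit", Depends: []string{"vet"}},
		{Name: "report", Depends: []string{"test", "audit"}},
	})
	fmt.Println(levels, err)
}
```

Here `test` and `audit` land in the same level because both depend only on `vet`. A cycle is reported as an error instead of looping forever.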

Frontend (Vite + React + Mantine)

  • DagList: table of DAGs with schedule, tags, last status
  • DagDetail: metadata + "Run Now" + history
  • RunDetail: step timeline with expandable stdout/stderr

Storage

SQLite dag_engine.db:

  • dag_runs: id, dag_name, status, trigger, started_at, finished_at, error
  • dag_step_results: id, run_id, step_name, status, exit_code, stdout, stderr, duration_ms

Build

cd frontend && pnpm install && pnpm build
cd .. && CGO_ENABLED=1 go build -tags fts5 -o dag-engine .

Usage

# CLI
./dag-engine run apps/dag_engine/dags_migrated/fn_backup.yaml
./dag-engine list apps/dag_engine/dags_migrated/

# Web server (production: managed by the dag_engine.service systemd user unit)
./dag-engine server --port 8090 --dags-dir apps/dag_engine/dags_migrated/ --scheduler
# Browser: http://localhost:8090

Notes

Custom YAML schema (see README.md section 3 + examples in dags_migrated/). function: steps invoke fn run <id> and propagate function_id to dag_step_results for the reactive loop. Default port: 8090.

2026-05-16 - Fix function-not-found in function: steps + Logs panel in RunDetail [done]

Symptom: fn_backup and daily-registry-audit failed 3 nights in a row with error: function "<id>" not found (tried as ID and name), even though the functions exist in the root registry.db.

Root cause: the dag_engine.service systemd unit has WorkingDirectory=/home/lucas/fn_registry/apps/dag_engine. The fn binary resolves registry.db via (1) FN_REGISTRY_ROOT, (2) a root() walk-up looking for go.mod, (3) the exe dir (cmd/fn/ops.go:1597-1628). With FN_REGISTRY_ROOT unset, (2) finds the go.mod in apps/dag_engine/ and returns that dir, which held a stale copy apps/dag_engine/registry.db (262 KB, May 15) without the newly created functions. This violates the rule in .claude/rules/db_locations.md (registry.db ONLY at the root).
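
The lookup order described above can be sketched as follows, to show why a nested go.mod shadows the real root when the env var is unset. This is an illustrative reconstruction, not the code in cmd/fn/ops.go: `resolveRoot`, its parameters, and the `makeLayout` helper are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveRoot follows the three-step order from the notes. Only the order is
// taken from the source; the signature is an assumption for this sketch.
func resolveRoot(envRoot, startDir, exeDir string) string {
	// (1) An explicit FN_REGISTRY_ROOT value always wins.
	if envRoot != "" {
		return envRoot
	}
	// (2) Walk up from startDir and stop at the FIRST directory with a go.mod.
	dir := startDir
	for {
		if _, err := os.Stat(filepath.Join(dir, "go.mod")); err == nil {
			return dir
		}
		parent := filepath.Dir(dir)
		if parent == dir { // reached the filesystem root
			break
		}
		dir = parent
	}
	// (3) Fall back to the directory of the executable.
	return exeDir
}

// makeLayout creates a temp tree with a go.mod at the root and another one in
// apps/dag_engine, mimicking the situation that triggered the bug.
func makeLayout() (root, nested string, err error) {
	root, err = os.MkdirTemp("", "fn_registry")
	if err != nil {
		return "", "", err
	}
	nested = filepath.Join(root, "apps", "dag_engine")
	if err = os.MkdirAll(nested, 0o755); err != nil {
		return "", "", err
	}
	for _, dir := range []string{root, nested} {
		if err = os.WriteFile(filepath.Join(dir, "go.mod"), []byte("module example\n"), 0o644); err != nil {
			return "", "", err
		}
	}
	return root, nested, nil
}

func main() {
	root, nested, err := makeLayout()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	// Env var unset: the walk-up stops at the nested go.mod, not the real root.
	fmt.Println(resolveRoot("", nested, "/opt/fn") == nested)
	// Env var set: the canonical root is used regardless of the working dir.
	fmt.Println(resolveRoot(root, nested, "/opt/fn") == root)
}
```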

Fix:

  • Deleted the stale apps/dag_engine/registry.db.
  • ~/.config/systemd/user/dag_engine.service: added Environment=FN_REGISTRY_ROOT=/home/lucas/fn_registry, FN_BIN=/home/lucas/fn_registry/fn, PATH=/usr/local/go/bin:/home/lucas/go/bin:..., HOME=/home/lucas. Without PATH, the go vet step failed with exec: "go": executable file not found in $PATH.
  • apps/dag_engine/executor.go: for function: steps, the spawn exports FN_REGISTRY_ROOT=<root> in the env and, if step.dir/working_dir are empty, sets dir = fnRegistryRoot. Belt-and-suspenders: even if someone launches the binary outside systemd, function: steps use the canonical root.
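
The executor-side part of the fix could look roughly like the sketch below, built on the standard os/exec package. `functionStepCmd` and its parameters are hypothetical names. The real executor.go goes through the registry's process_spawn function, which is not shown here.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// functionStepCmd prepares the command for a `function:` step.
// Hypothetical helper: only the behavior (export FN_REGISTRY_ROOT, default
// the working dir to the registry root) comes from the notes.
func functionStepCmd(fnBin, functionID, stepDir, registryRoot string) *exec.Cmd {
	cmd := exec.Command(fnBin, "run", functionID)

	// Inherit the parent environment and pin the registry root. If the key is
	// already present, os/exec uses the last value in the slice, so this
	// appended entry takes precedence.
	cmd.Env = append(os.Environ(), "FN_REGISTRY_ROOT="+registryRoot)

	// An explicit step dir is respected; an empty one falls back to the root
	// instead of inheriting the service's WorkingDirectory.
	cmd.Dir = stepDir
	if cmd.Dir == "" {
		cmd.Dir = registryRoot
	}
	return cmd
}

func main() {
	cmd := functionStepCmd("/home/lucas/fn_registry/fn", "example_function_id", "", "/home/lucas/fn_registry")
	fmt.Println(cmd.Args, cmd.Dir)
}
```

The command is only constructed here, never started, so the sketch runs on machines without the fn binary.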

Verification: POST /api/dags/daily-registry-audit/run -> the audit_capabilities step passes (387 ms) instead of failing with not-found. The remaining failures (audit_artefacts exit 1, fn_backup exit 4 ignoring continue_on.exit_code) are real, independent bugs and out of scope.

2026-05-16 - Logs panel in RunDetail (frontend) [done]

  • apps/dag_engine/frontend/src/pages/RunDetail.tsx: new <Paper> "Logs" at the bottom with a scrollable <Code block> (max-h 480) + Mantine CopyButton (icon toggles copy/check, teal).
  • Helper buildLogText(run, steps) composes plain text: run metadata (dag, path, status, trigger, started/finished ISO, duration ms, error) + per step ([status] name exit=N Nms, started, finished, error, stdout, stderr indented 4 spaces).
  • Lets you paste the whole log into the LLM for debugging without opening N collapses in the StepTimeline.
  • Frontend build pending: pnpm build breaks on pre-existing errors (StepTimeline.tsx:49 uses the legacy <Collapse in={opened}> API; main.tsx:1 imports @mantine/core/styles.css without types). The RunDetail edit type-checks cleanly.

2026-05-16 - Canonical databases (quick reference)

  • dag_engine.db: apps/dag_engine/dag_engine.db (+ WAL sidecars). Migrations in apps/dag_engine/store/migrations/ (001_init.sql, 002_step_function_id.sql). Tables dag_runs, dag_step_results.
  • No copy of registry.db may coexist in this dir (violates db_locations.md). If it reappears: delete it.

Next up

  • audit_artefacts fails with exit 1 in daily-registry-audit: investigate the real stderr (probably an orphaned artefact or git drift). Independent step; it does not block the rest of the DAG.
  • fn_backup step run_backup_all exits with 4 and the DAG does not honor continue_on.exit_code: [4]. Executor bug: parse step.ContinueOn.ExitCode []int and compare against result.ExitCode. Today only step.ContinueOn.Failure (bool) is checked.
  • Frontend pnpm build is broken by Mantine API drift in StepTimeline.tsx (<Collapse in={opened}>) and the CSS type import in main.tsx. Fix together with a general types refresh.
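
For the continue_on.exit_code item, the missing check could be sketched as below. The `ContinueOn` struct only mirrors the two fields named in the list; its layout and the `shouldContinue` helper are assumptions, not the executor's current code.

```go
package main

import "fmt"

// ContinueOn mirrors the fields mentioned above (Failure, ExitCode).
// The exact struct in the executor may differ.
type ContinueOn struct {
	Failure  bool  // continue after any failure
	ExitCode []int // continue only after these specific exit codes
}

// shouldContinue reports whether the DAG may proceed after a step exits with
// exitCode. Exit 0 always continues; otherwise either the blanket Failure
// flag or an explicit match in ExitCode is required.
func shouldContinue(c ContinueOn, exitCode int) bool {
	if exitCode == 0 || c.Failure {
		return true
	}
	for _, allowed := range c.ExitCode {
		if allowed == exitCode {
			return true
		}
	}
	return false
}

func main() {
	c := ContinueOn{ExitCode: []int{4}}
	fmt.Println(shouldContinue(c, 4), shouldContinue(c, 1)) // true false
}
```

With `continue_on.exit_code: [4]`, the fn_backup case (exit 4) would continue, while an unrelated exit 1 would still stop the DAG.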

User documentation

Full guide (YAML format, adding DAGs, troubleshooting, HTTP endpoints): apps/dag_engine/README.md.

Capability growth log

One line per SemVer bump. Bump type per .claude/commands/version.md:

  • major: observable breaking change (CLI args, own DB schema, wire format).

  • minor: additive feature (new panel, endpoint, option).

  • patch: bugfix with no observable change.

  • v0.1.0 (2026-05-18) — baseline.