chore: auto-commit (286 files)

- .claude/agents/fn-orquestador/SKILL.md
- .claude/commands/fn_claude.md
- .claude/rules/INDEX.md
- .claude/rules/cpp_apps.md
- .claude/rules/ids_naming.md
- CHANGELOG.md
- apps/dag_engine/README.md
- apps/dag_engine/api.go
- apps/dag_engine/dags_migrated/example.yaml
- apps/dag_engine/dags_migrated/example_lineage_tracking.yaml
- ...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 16:33:22 +02:00
parent d6175964e4
commit 212875ed0d
290 changed files with 12703 additions and 19778 deletions
+25 -2
@@ -155,9 +155,10 @@ handler_on:
| `name` | string | (required) | Identifier of the step within the DAG. |
| `id` | string | "" | Overrides the auto-generated id. |
| `description` | string | "" | Free text. |
| `command` | string | "" | Shell command (mutually exclusive with `script`). |
| `command` | string | "" | Shell command (mutually exclusive with `script`/`function`). |
| `script` | string | "" | Heredoc block. Useful for inline Python/Lua. |
| `args` | list[string] | [] | Extra args for `command`. |
| `function` | string | "" | ID of a registry function (e.g. `audit_capability_groups_go_infra`). If set, the executor invokes `${FN_REGISTRY_ROOT}/fn run <id> <args...>` and records `function_id` in `dag_step_results`. Mutually exclusive with `command`/`script`; if they coexist, `function` wins. |
| `args` | list[string] | [] | Extra args for `command` or for the `function`. |
| `shell` | string | inherited | Shell override. |
| `dir` / `working_dir` | string | inherited | Working dir for this step. |
| `depends` | list[string] | [] | Steps that must finish OK first. If empty and `type:graph`, the step starts in parallel. |
@@ -170,6 +171,28 @@ handler_on:
| `output` | string | "" | Name of the variable in which to store stdout (consumable by dependent steps). |
| `tags` | list[string] | [] | Per-step tags (UI). |
### Function steps (consistency with the registry)
An idiomatic DAG calls registry functions, not ad-hoc scripts. Each `function:` step is traced in `call_monitor.calls` by the agent's PostToolUse hook and in `dag_step_results.function_id` by dag_engine itself, so the reactive loop (issue 0085) has end-to-end visibility.
```yaml
steps:
  - name: audit_capabilities
    function: audit_capability_groups_go_infra
    args: ["--json"]
    description: "Audits drift between capability group tags and parent pages"
```
Advantages over `command: ./fn run ...`:
- `function_id` is persisted as a dedicated column in `dag_step_results` (filterable, groupable).
- The `dag_engine_ui` frontend shows a badge plus a side panel with `uses_functions` (subfunctions the step will use transitively).
- API: `GET /api/functions/{id}` returns `{id, name, description, signature, purity, domain, lang, uses_functions[], uses_types[]}`, reading `registry.db` read-only. The UI calls this endpoint when a step is expanded.
- Validator regex in `dag_validate`: `^[a-z0-9_]+_[a-z]+_[a-z]+$`. An invalid ID is an error.
- Environment variables: `FN_REGISTRY_ROOT` (default `/home/lucas/fn_registry`) locates the `fn` binary. Override with `FN_BIN=/path/to/fn`.
Full example: `~/.dagu/dags/example-fn-call.yaml`.
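The validator regex quoted above can be tried out in isolation. The following is a minimal sketch, not the actual `dag_validate` code (which this diff does not show); the helper name `validFunctionID` is hypothetical. It only checks the shape of an ID, not whether the function exists in the registry.

```go
package main

import (
	"fmt"
	"regexp"
)

// functionIDPattern is the regex quoted in the README for `function:` IDs.
var functionIDPattern = regexp.MustCompile(`^[a-z0-9_]+_[a-z]+_[a-z]+$`)

// validFunctionID reports whether id has the shape the README describes.
// Hypothetical helper: the real validator's code is not part of this diff.
func validFunctionID(id string) bool {
	return functionIDPattern.MatchString(id)
}

func main() {
	for _, id := range []string{
		"audit_capability_groups_go_infra", // from the README example
		"backup_all_bash_pipelines",        // from the fn_backup DAG
		"Backup-All",                       // uppercase and hyphen: rejected
		"backup",                           // fewer than two underscores: rejected
	} {
		fmt.Printf("%-36s valid=%v\n", id, validFunctionID(id))
	}
}
```

The pattern requires at least two trailing `_word` segments made only of lowercase letters, which matches both function IDs that appear in this commit.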
### Cron schedule
Five classic fields: `min hour dom mon dow`. Examples:
+3
@@ -15,6 +15,9 @@ func RegisterAPI(mux *http.ServeMux, executor *Executor, scheduler *Scheduler, h
mux.HandleFunc("GET /api/runs", handleListRuns(executor))
mux.HandleFunc("GET /api/runs/{id}", handleGetRun(executor))
// Function lookup proxy to registry.db (read-only).
mux.HandleFunc("GET /api/functions/{id}", handleGetFunction())
mux.HandleFunc("POST /api/scheduler/start", handleSchedulerStart(scheduler))
mux.HandleFunc("POST /api/scheduler/stop", handleSchedulerStop(scheduler))
mux.HandleFunc("GET /api/scheduler/status", handleSchedulerStatus(scheduler))
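For a client of the new `GET /api/functions/{id}` route, the README in this commit lists the response fields. The sketch below shows one way to decode such a response. The type name `FunctionInfo`, the helper `decodeFunctionInfo`, the Go field types, and the sample payload are all assumptions made for illustration; the handler's real output is not shown in this diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FunctionInfo is a hypothetical client-side type for the response of
// GET /api/functions/{id}. Field names come from the README; the Go types
// (plain strings and string slices) are assumptions.
type FunctionInfo struct {
	ID            string   `json:"id"`
	Name          string   `json:"name"`
	Description   string   `json:"description"`
	Signature     string   `json:"signature"`
	Purity        string   `json:"purity"`
	Domain        string   `json:"domain"`
	Lang          string   `json:"lang"`
	UsesFunctions []string `json:"uses_functions"`
	UsesTypes     []string `json:"uses_types"`
}

// decodeFunctionInfo parses a JSON response body into FunctionInfo.
func decodeFunctionInfo(body []byte) (FunctionInfo, error) {
	var fi FunctionInfo
	err := json.Unmarshal(body, &fi)
	return fi, err
}

func main() {
	// Made-up payload, for illustration only.
	sample := []byte(`{"id":"backup_all_bash_pipelines","lang":"bash","uses_functions":[]}`)
	fi, err := decodeFunctionInfo(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(fi.ID, fi.Lang, len(fi.UsesFunctions))
}
```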
@@ -1,23 +0,0 @@
# Example Dagu DAG
# This is a simple example workflow
name: example
description: Example workflow to demonstrate Dagu capabilities
schedule:
# Run every day at 9:00 AM
- "0 9 * * *"
steps:
- name: hello
command: echo "Hello from Dagu!"
- name: list_files
command: ls -la /home/lucas/dagu/scripts
depends:
- hello
- name: date
command: date
depends:
- hello
@@ -1,178 +0,0 @@
name: example_lineage_tracking
description: |
Complete example of a pipeline with lineage tracking using marquez-cli.
This DAG demonstrates:
- Generation of a unique Run ID
- START, RUNNING, COMPLETE events
- Tracking of inputs/outputs at each step
- Error handling with the FAIL event
tags:
- example
- lineage
- marquez
schedule:
- "0 */6 * * *" # Every 6 hours
env:
- MARQUEZ_URL: http://localhost:5000
- MARQUEZ_NAMESPACE: automatic-process
- JOB_NAME: example_lineage_tracking
- RUN_ID: ""
steps:
# STEP 0: Generate a unique Run ID for the whole pipeline
- name: init_run_id
description: Generate unique Run ID for this execution
command: |
RUN_ID=$(uuidgen)
echo "RUN_ID=$RUN_ID" >> $DAGU_ENV
echo "Generated Run ID: $RUN_ID"
# STEP 1: START event
- name: start_run
description: Send START event to Marquez
command: |
marquez-cli run start \
-job $JOB_NAME \
-run-id $RUN_ID \
-namespace $MARQUEZ_NAMESPACE \
-inputs "api://jsonplaceholder.typicode.com/users"
echo "✓ Run started with ID: $RUN_ID"
depends:
- init_run_id
# STEP 2: Extract - Fetch data from API
- name: extract_data
description: Fetch data from external API
command: |
echo "Fetching data from API..."
curl -s https://jsonplaceholder.typicode.com/users > /tmp/lineage_users.json
marquez-cli run running \
-job $JOB_NAME \
-run-id $RUN_ID \
-namespace $MARQUEZ_NAMESPACE \
-inputs "api://jsonplaceholder.typicode.com/users" \
-outputs "file:///tmp/lineage_users.json"
echo "✓ Data extracted: $(cat /tmp/lineage_users.json | jq '. | length') records"
depends:
- start_run
# STEP 3: Transform - Clean and transform data
- name: transform_data
description: Transform and clean the data
command: |
echo "Transforming data..."
jq '[.[] | {email: .email, name: .name, company: .company.name}]' \
/tmp/lineage_users.json > /tmp/lineage_users_clean.json
marquez-cli run running \
-job $JOB_NAME \
-run-id $RUN_ID \
-namespace $MARQUEZ_NAMESPACE \
-inputs "file:///tmp/lineage_users.json" \
-outputs "file:///tmp/lineage_users_clean.json"
echo "✓ Data transformed: $(cat /tmp/lineage_users_clean.json | jq '. | length') records"
depends:
- extract_data
# STEP 4: Load - Save to PostgreSQL
- name: load_data
description: Load data to PostgreSQL
command: |
echo "Loading data to PostgreSQL..."
# Create the table if it does not exist
psql -h localhost -p 5434 -U postgres -d postgres -c "
CREATE TABLE IF NOT EXISTS lineage_example (
email TEXT,
name TEXT,
company TEXT,
loaded_at TIMESTAMP DEFAULT NOW()
);
"
# Truncate the table
psql -h localhost -p 5434 -U postgres -d postgres -c "TRUNCATE TABLE lineage_example;"
# Load the data
jq -r '.[] | [.email, .name, .company] | @csv' /tmp/lineage_users_clean.json | \
psql -h localhost -p 5434 -U postgres -d postgres -c "
COPY lineage_example (email, name, company) FROM STDIN WITH CSV;
"
RECORD_COUNT=$(psql -h localhost -p 5434 -U postgres -d postgres -t -c "SELECT COUNT(*) FROM lineage_example;")
marquez-cli run running \
-job $JOB_NAME \
-run-id $RUN_ID \
-namespace $MARQUEZ_NAMESPACE \
-inputs "file:///tmp/lineage_users_clean.json" \
-outputs "postgres://localhost:5434/postgres/public/lineage_example"
echo "✓ Data loaded: $(echo $RECORD_COUNT | xargs) records"
depends:
- transform_data
# STEP 5: COMPLETE event
- name: complete_run
description: Mark run as completed in Marquez
command: |
marquez-cli run complete \
-job $JOB_NAME \
-run-id $RUN_ID \
-namespace $MARQUEZ_NAMESPACE \
-inputs "api://jsonplaceholder.typicode.com/users" \
-outputs "postgres://localhost:5434/postgres/public/lineage_example"
echo "✓ Run completed successfully: $RUN_ID"
echo ""
echo "Verify lineage at: http://localhost:3001"
echo "Or run: marquez-cli lineage -name 'postgres://localhost:5434/postgres/public/lineage_example'"
depends:
- load_data
# STEP 6: Cleanup temporary files
- name: cleanup
description: Remove temporary files
command: |
rm -f /tmp/lineage_users.json /tmp/lineage_users_clean.json
echo "✓ Temporary files cleaned"
depends:
- complete_run
# Error handler
handlers:
failure:
- name: mark_as_failed
command: |
echo "❌ Pipeline failed, marking run as FAILED in Marquez"
if [ -n "$RUN_ID" ]; then
marquez-cli run fail \
-job $JOB_NAME \
-run-id $RUN_ID \
-namespace $MARQUEZ_NAMESPACE
echo "✓ Run marked as FAILED: $RUN_ID"
else
echo "⚠ No RUN_ID found, skipping FAIL event"
fi
success:
- name: notify_success
command: |
echo "🎉 Pipeline completed successfully!"
echo "Run ID: $RUN_ID"
echo "View lineage: http://localhost:3001"
# Log configuration
logCleanup:
enabled: true
retentionDays: 7
+8 -3
@@ -1,11 +1,12 @@
name: fn_backup
description: Daily backup of fn_registry (registry.db + operations.db + vaults)
description: Daily backup of fn_registry (registry.db + operations.db + vaults) via a registry function
schedule:
- "0 3 * * *"
tags: [backup, registry, daily]
env:
- FN_REGISTRY_ROOT: /home/lucas/fn_registry
- BACKUP_ROOT: /home/lucas/backups/fn_registry
steps:
@@ -13,9 +14,13 @@ steps:
command: mkdir -p ${BACKUP_ROOT}
- name: run_backup_all
command: bash /home/lucas/fn_registry/bash/functions/pipelines/backup_all.sh ${BACKUP_ROOT}
description: "Atomic snapshot of registry.db + operations.db + vaults with 7/4/12 retention"
function: backup_all_bash_pipelines
args: ["${BACKUP_ROOT}"]
continue_on:
exit_code: [4]
depends: [ensure_dirs]
- name: report_status
command: bash -c 'ls -lh ${BACKUP_ROOT}/registry/daily.0 ${BACKUP_ROOT}/operations/*/daily.0 2>/dev/null | tail -20'
depends: [run_backup_all]
@@ -1,23 +0,0 @@
name: test_claude_access
description: Test workflow created by Claude to verify access
tags:
- test
- claude
steps:
- name: verify_access
command: echo "✓ Claude has full access to manage your Dagu pipelines!"
- name: show_info
command: |
echo "User: $(whoami)"
echo "Date: $(date)"
echo "Directory: $(pwd)"
depends:
- verify_access
- name: cleanup
command: echo "Test pipeline completed successfully"
depends:
- show_info
+29 -10
@@ -156,22 +156,41 @@ func (e *Executor) ExecuteDAG(ctx context.Context, dagPath string, trigger strin
func (e *Executor) executeStep(ctx context.Context, runID string, dag core.DagDefinition, step core.DagStep, daguEnvPath string, outputs map[string]string, mu *sync.Mutex) error {
stepID := generateID()
now := time.Now()
// Resolve command source: function (registry) takes precedence over command/script.
var command string
var stepFunctionID string
if step.Function != "" {
stepFunctionID = step.Function
fnBin := os.Getenv("FN_BIN")
if fnBin == "" {
root := os.Getenv("FN_REGISTRY_ROOT")
if root == "" {
root = "/home/lucas/fn_registry"
}
fnBin = root + "/fn"
}
parts := []string{fnBin, "run", step.Function}
parts = append(parts, step.Args...)
command = strings.Join(parts, " ")
} else if step.Command != "" {
command = step.Command
} else if step.Script != "" {
command = step.Script
}
e.store.InsertStepResult(&store.DagStepResult{
ID: stepID,
RunID: runID,
StepName: stepName(step),
Status: "running",
StartedAt: &now,
ID: stepID,
RunID: runID,
StepName: stepName(step),
FunctionID: stepFunctionID,
Status: "running",
StartedAt: &now,
})
// Build environment.
env := buildStepEnv(dag, step, daguEnvPath, outputs)
// Determine command.
command := step.Command
if command == "" && step.Script != "" {
command = step.Script
}
if command == "" {
e.store.UpdateStepResult(stepID, "skipped", 0, "", "", nil, 0, "no command or script")
return nil
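The lookup order that `executeStep` uses for the `fn` binary can be pulled out into small pure functions to make it easier to test. This is a sketch under assumptions: the helper names `resolveFnBin` and `buildFunctionCommand` and the injected `getenv` parameter are hypothetical and do not appear in the commit; only the precedence (`FN_BIN`, then `FN_REGISTRY_ROOT`, then the hard-coded default) and the space-joined command are taken from the hunk above.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultRegistryRoot is the fallback hard-coded in the hunk above.
const defaultRegistryRoot = "/home/lucas/fn_registry"

// resolveFnBin mirrors the lookup order in executeStep: FN_BIN wins,
// then FN_REGISTRY_ROOT + "/fn", then the default root.
// getenv is injected (hypothetical signature) so the logic can be tested
// without touching the process environment.
func resolveFnBin(getenv func(string) string) string {
	if bin := getenv("FN_BIN"); bin != "" {
		return bin
	}
	root := getenv("FN_REGISTRY_ROOT")
	if root == "" {
		root = defaultRegistryRoot
	}
	return root + "/fn"
}

// buildFunctionCommand joins `<fn> run <id> <args...>` with single spaces,
// as the hunk does. Arguments are not quoted.
func buildFunctionCommand(fnBin, functionID string, args []string) string {
	parts := append([]string{fnBin, "run", functionID}, args...)
	return strings.Join(parts, " ")
}

func main() {
	env := map[string]string{"FN_REGISTRY_ROOT": "/opt/registry"}
	getenv := func(k string) string { return env[k] }
	cmd := buildFunctionCommand(resolveFnBin(getenv), "backup_all_bash_pipelines", []string{"/tmp/backups"})
	fmt.Println(cmd)
}
```

Because the parts are joined with single spaces, an argument that contains whitespace would be split if the resulting string is later handed to a shell. Whether that happens depends on executor code outside this hunk.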
+4 -4
@@ -30,10 +30,10 @@ func handleGetDag(executor *Executor) http.HandlerFunc {
runs, _, _ := executor.store.ListRuns(dag.Name, 10, 0)
resp := map[string]interface{}{
"info": info,
"dag": dag,
"validation": validation,
"runs": runs,
"info": info,
"dag": dag,
"validation": validation,
"recent_runs": runs,
}
writeJSON(w, http.StatusOK, resp)
}
+38 -9
@@ -2,15 +2,43 @@ package store
import (
"database/sql"
_ "embed"
"embed"
"fmt"
"io/fs"
"sort"
"strings"
"time"
_ "github.com/mattn/go-sqlite3"
)
//go:embed migrations/001_init.sql
var migrationSQL string
//go:embed migrations/*.sql
var migrationsFS embed.FS
// applyMigrations executes every embedded migrations/*.sql in order.
// Each statement is idempotent (IF NOT EXISTS / ADD COLUMN). Duplicate-column
// errors from re-running ALTER TABLE ADD COLUMN are tolerated.
func applyMigrations(conn *sql.DB) error {
files, err := fs.Glob(migrationsFS, "migrations/*.sql")
if err != nil {
return err
}
sort.Strings(files)
for _, f := range files {
b, err := migrationsFS.ReadFile(f)
if err != nil {
return fmt.Errorf("%s: read: %w", f, err)
}
if _, err := conn.Exec(string(b)); err != nil {
if strings.Contains(err.Error(), "duplicate column") ||
strings.Contains(err.Error(), "already exists") {
continue
}
return fmt.Errorf("%s: %w", f, err)
}
}
return nil
}
// DB wraps a SQLite connection for DAG run persistence.
type DB struct {
@@ -24,7 +52,7 @@ func Open(path string) (*DB, error) {
if err != nil {
return nil, fmt.Errorf("store: open %s: %w", path, err)
}
if _, err := conn.Exec(migrationSQL); err != nil {
if err := applyMigrations(conn); err != nil {
conn.Close()
return nil, fmt.Errorf("store: migrate: %w", err)
}
@@ -132,6 +160,7 @@ type DagStepResult struct {
ID string `json:"id"`
RunID string `json:"run_id"`
StepName string `json:"step_name"`
FunctionID string `json:"function_id,omitempty"`
Status string `json:"status"`
ExitCode int `json:"exit_code"`
Stdout string `json:"stdout,omitempty"`
@@ -154,9 +183,9 @@ func (db *DB) InsertStepResult(r *DagStepResult) error {
finishedAt = &s
}
_, err := db.conn.Exec(
`INSERT INTO dag_step_results (id, run_id, step_name, status, exit_code, stdout, stderr, started_at, finished_at, duration_ms, error)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
r.ID, r.RunID, r.StepName, r.Status, r.ExitCode, r.Stdout, r.Stderr,
`INSERT INTO dag_step_results (id, run_id, step_name, function_id, status, exit_code, stdout, stderr, started_at, finished_at, duration_ms, error)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
r.ID, r.RunID, r.StepName, r.FunctionID, r.Status, r.ExitCode, r.Stdout, r.Stderr,
startedAt, finishedAt, r.DurationMs, r.Error,
)
return err
@@ -179,7 +208,7 @@ func (db *DB) UpdateStepResult(id, status string, exitCode int, stdout, stderr s
// ListStepResults returns all step results for a given run.
func (db *DB) ListStepResults(runID string) ([]DagStepResult, error) {
rows, err := db.conn.Query(
`SELECT id, run_id, step_name, status, exit_code, stdout, stderr, started_at, finished_at, duration_ms, error
`SELECT id, run_id, step_name, function_id, status, exit_code, stdout, stderr, started_at, finished_at, duration_ms, error
FROM dag_step_results WHERE run_id=? ORDER BY started_at ASC`, runID,
)
if err != nil {
@@ -191,7 +220,7 @@ func (db *DB) ListStepResults(runID string) ([]DagStepResult, error) {
for rows.Next() {
var r DagStepResult
var startedAt, finishedAt sql.NullString
if err := rows.Scan(&r.ID, &r.RunID, &r.StepName, &r.Status, &r.ExitCode,
if err := rows.Scan(&r.ID, &r.RunID, &r.StepName, &r.FunctionID, &r.Status, &r.ExitCode,
&r.Stdout, &r.Stderr, &startedAt, &finishedAt, &r.DurationMs, &r.Error); err != nil {
return nil, err
}
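Two details of `applyMigrations` in the store hunk above are worth seeing in isolation: `sort.Strings` orders file names lexicographically, and errors are tolerated by substring match. The sketch below is illustrative only. The helper names `orderMigrations` and `isToleratedMigrationError` are hypothetical, and every file name except `001_init.sql` is made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// orderMigrations returns a sorted copy of files, using the same
// lexicographic ordering that sort.Strings gives applyMigrations.
func orderMigrations(files []string) []string {
	out := append([]string(nil), files...)
	sort.Strings(out)
	return out
}

// isToleratedMigrationError mirrors the substring checks in applyMigrations.
func isToleratedMigrationError(msg string) bool {
	return strings.Contains(msg, "duplicate column") ||
		strings.Contains(msg, "already exists")
}

func main() {
	// Hypothetical file names; only 001_init.sql appears in the diff.
	files := []string{
		"migrations/010_c.sql",
		"migrations/2_b.sql",
		"migrations/001_init.sql",
	}
	for _, f := range orderMigrations(files) {
		fmt.Println(f)
	}
	fmt.Println(isToleratedMigrationError("duplicate column name: function_id"))
	fmt.Println(isToleratedMigrationError("no such table: dag_runs"))
}
```

A name without zero padding such as `2_b.sql` sorts after `010_c.sql`, so keeping a fixed-width numeric prefix preserves the intended order. Also note that a tolerated error moves on to the next file: if the driver stops at the first failing statement, any statements that follow it in the same file would be skipped on re-runs. That driver behavior is an assumption, not something this diff shows.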