feat(dag_engine): WS hub /api/ws/dagruns + migrate DAGs from dagu

- events.go: DagRunHub broadcasts a snapshot plus live deltas (500ms tick, 5s window for recently finished runs) over dag_runs + dag_step_results.
- api.go: GET /api/ws/dagruns handler upgrades to WS; opt-in in RegisterAPI.
- store.go: exposes Conn() for read-only access from the hub.
- main.go: builds the DagRunHub at server startup.
- dags_migrated/: 5 YAMLs migrated from ~/dagu/dags after uninstalling dagu (issue 0095 step 1).

Smoke: initial snapshot OK; triggering /api/dags/test_claude_access/run -> the WS delta shows 3 step_results + run success in <1s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 16:36:34 +02:00
parent fe39de8b22
commit 7ecbee1175
11 changed files with 775 additions and 18 deletions
@@ -0,0 +1,23 @@
# Example Dagu DAG
# This is a simple example workflow
name: example
description: Example workflow to demonstrate Dagu capabilities
schedule:
  # Run every day at 9:00 AM
  - "0 9 * * *"
steps:
  - name: hello
    command: echo "Hello from Dagu!"
  - name: list_files
    command: ls -la /home/lucas/dagu/scripts
    depends:
      - hello
  - name: date
    command: date
    depends:
      - hello
@@ -0,0 +1,178 @@
name: example_lineage_tracking
description: |
  Complete example of a pipeline with lineage tracking using marquez-cli.
  This DAG demonstrates:
  - Generating a unique Run ID
  - START, RUNNING, and COMPLETE events
  - Tracking inputs/outputs at each step
  - Error handling with a FAIL event
tags:
  - example
  - lineage
  - marquez
schedule:
  - "0 */6 * * *"  # Every 6 hours
env:
  - MARQUEZ_URL: http://localhost:5000
  - MARQUEZ_NAMESPACE: automatic-process
  - JOB_NAME: example_lineage_tracking
  - RUN_ID: ""
steps:
  # STEP 0: Generate a unique Run ID for the whole pipeline
  - name: init_run_id
    description: Generate unique Run ID for this execution
    command: |
      RUN_ID=$(uuidgen)
      echo "RUN_ID=$RUN_ID" >> $DAGU_ENV
      echo "Generated Run ID: $RUN_ID"
  # STEP 1: START event
  - name: start_run
    description: Send START event to Marquez
    command: |
      marquez-cli run start \
        -job $JOB_NAME \
        -run-id $RUN_ID \
        -namespace $MARQUEZ_NAMESPACE \
        -inputs "api://jsonplaceholder.typicode.com/users"
      echo "✓ Run started with ID: $RUN_ID"
    depends:
      - init_run_id
  # STEP 2: Extract - Fetch data from API
  - name: extract_data
    description: Fetch data from external API
    command: |
      echo "Fetching data from API..."
      curl -s https://jsonplaceholder.typicode.com/users > /tmp/lineage_users.json
      marquez-cli run running \
        -job $JOB_NAME \
        -run-id $RUN_ID \
        -namespace $MARQUEZ_NAMESPACE \
        -inputs "api://jsonplaceholder.typicode.com/users" \
        -outputs "file:///tmp/lineage_users.json"
      echo "✓ Data extracted: $(cat /tmp/lineage_users.json | jq '. | length') records"
    depends:
      - start_run
  # STEP 3: Transform - Clean and transform data
  - name: transform_data
    description: Transform and clean the data
    command: |
      echo "Transforming data..."
      jq '[.[] | {email: .email, name: .name, company: .company.name}]' \
        /tmp/lineage_users.json > /tmp/lineage_users_clean.json
      marquez-cli run running \
        -job $JOB_NAME \
        -run-id $RUN_ID \
        -namespace $MARQUEZ_NAMESPACE \
        -inputs "file:///tmp/lineage_users.json" \
        -outputs "file:///tmp/lineage_users_clean.json"
      echo "✓ Data transformed: $(cat /tmp/lineage_users_clean.json | jq '. | length') records"
    depends:
      - extract_data
  # STEP 4: Load - Save to PostgreSQL
  - name: load_data
    description: Load data to PostgreSQL
    command: |
      echo "Loading data to PostgreSQL..."
      # Create the table if it does not exist
      psql -h localhost -p 5434 -U postgres -d postgres -c "
        CREATE TABLE IF NOT EXISTS lineage_example (
          email TEXT,
          name TEXT,
          company TEXT,
          loaded_at TIMESTAMP DEFAULT NOW()
        );
      "
      # Truncate the table
      psql -h localhost -p 5434 -U postgres -d postgres -c "TRUNCATE TABLE lineage_example;"
      # Load the data
      jq -r '.[] | [.email, .name, .company] | @csv' /tmp/lineage_users_clean.json | \
        psql -h localhost -p 5434 -U postgres -d postgres -c "
          COPY lineage_example (email, name, company) FROM STDIN WITH CSV;
        "
      RECORD_COUNT=$(psql -h localhost -p 5434 -U postgres -d postgres -t -c "SELECT COUNT(*) FROM lineage_example;")
      marquez-cli run running \
        -job $JOB_NAME \
        -run-id $RUN_ID \
        -namespace $MARQUEZ_NAMESPACE \
        -inputs "file:///tmp/lineage_users_clean.json" \
        -outputs "postgres://localhost:5434/postgres/public/lineage_example"
      echo "✓ Data loaded: $(echo $RECORD_COUNT | xargs) records"
    depends:
      - transform_data
  # STEP 5: COMPLETE event
  - name: complete_run
    description: Mark run as completed in Marquez
    command: |
      marquez-cli run complete \
        -job $JOB_NAME \
        -run-id $RUN_ID \
        -namespace $MARQUEZ_NAMESPACE \
        -inputs "api://jsonplaceholder.typicode.com/users" \
        -outputs "postgres://localhost:5434/postgres/public/lineage_example"
      echo "✓ Run completed successfully: $RUN_ID"
      echo ""
      echo "Verify lineage at: http://localhost:3001"
      echo "Or run: marquez-cli lineage -name 'postgres://localhost:5434/postgres/public/lineage_example'"
    depends:
      - load_data
  # STEP 6: Cleanup temporary files
  - name: cleanup
    description: Remove temporary files
    command: |
      rm -f /tmp/lineage_users.json /tmp/lineage_users_clean.json
      echo "✓ Temporary files cleaned"
    depends:
      - complete_run
# Error and success handlers
handlers:
  failure:
    - name: mark_as_failed
      command: |
        echo "❌ Pipeline failed, marking run as FAILED in Marquez"
        if [ -n "$RUN_ID" ]; then
          marquez-cli run fail \
            -job $JOB_NAME \
            -run-id $RUN_ID \
            -namespace $MARQUEZ_NAMESPACE
          echo "✓ Run marked as FAILED: $RUN_ID"
        else
          echo "⚠ No RUN_ID found, skipping FAIL event"
        fi
  success:
    - name: notify_success
      command: |
        echo "🎉 Pipeline completed successfully!"
        echo "Run ID: $RUN_ID"
        echo "View lineage: http://localhost:3001"
# Log configuration
logCleanup:
  enabled: true
  retentionDays: 7
@@ -0,0 +1,21 @@
name: fn_backup
description: Daily backup of fn_registry (registry.db + operations.db + vaults)
schedule:
  - "0 3 * * *"
env:
  - FN_REGISTRY_ROOT: /home/lucas/fn_registry
  - BACKUP_ROOT: /home/lucas/backups/fn_registry
steps:
  - name: ensure_dirs
    command: mkdir -p ${BACKUP_ROOT}
  - name: run_backup_all
    command: bash /home/lucas/fn_registry/bash/functions/pipelines/backup_all.sh ${BACKUP_ROOT}
    continue_on:
      exit_code: [4]
  - name: report_status
    command: bash -c 'ls -lh ${BACKUP_ROOT}/registry/daily.0 ${BACKUP_ROOT}/operations/*/daily.0 2>/dev/null | tail -20'
@@ -0,0 +1,51 @@
name: revision-viernes-finanzas
description: Weekly personal finance review - ingestion, report, and push to Gitea
tags: [finanzas, semanal]
type: graph
schedule: "0 9 * * 5"
env:
  - PROJECT_DIR: /home/lucas/analysis/finanzas_personales
  - PYTHON: /home/lucas/analysis/finanzas_personales/.venv/bin/python
handler_on:
  failure:
    command: echo "[$(date)] FALLÓ revision-viernes-finanzas" >> /home/lucas/dagu/logs/failures.log
steps:
  - id: ingest
    description: Process new files from the inbox (BBVA xlsx + Revolut csv)
    working_dir: ${PROJECT_DIR}
    command: ./bin/ingest -skip-notebooks
    continue_on:
      failure: true
  - id: informe
    description: Generate the weekly budget compliance report
    command: ${PYTHON} /home/lucas/dagu/scripts/informe_finanzas.py
    depends: [ingest]
  - id: git_push
    description: Commit and push the report to Gitea
    working_dir: ${PROJECT_DIR}
    script: |
      #!/bin/bash
      set -euo pipefail
      # Skip the push when the report is neither modified nor newly created
      if git diff --quiet data/04_output/informe_semanal.md 2>/dev/null && \
         ! git ls-files --others --exclude-standard | grep -q informe_semanal.md; then
        echo "Sin cambios en el informe, skip push"
        exit 0
      fi
      git add data/04_output/informe_semanal.md
      git add data/03_processed/ 2>/dev/null || true
      git commit -m "Informe semanal $(date +%Y-%m-%d)
      Co-Authored-By: Dagu Automation <noreply@dagu.dev>"
      git push origin master:main
      echo "Push completado"
    depends: [informe]
@@ -0,0 +1,23 @@
name: test_claude_access
description: Test workflow created by Claude to verify access
tags:
  - test
  - claude
steps:
  - name: verify_access
    command: echo "✓ Claude tiene acceso completo para gestionar tus pipelines de Dagu!"
  - name: show_info
    command: |
      echo "Usuario: $(whoami)"
      echo "Fecha: $(date)"
      echo "Directorio: $(pwd)"
    depends:
      - verify_access
  - name: cleanup
    command: echo "Pipeline de prueba completado exitosamente"
    depends:
      - show_info