| name | id | status | created | updated | priority | risk | related_issues | apps | trigger | schedule | expected_runtime_s | tags |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hn-top-stories | 0001 | pending | 2026-05-16 | 2026-05-16 | high | low | | | cron | `*/30 * * * *` | 30 | |
Goal
Test the stack end-to-end: navegator AutoExtract -> recipe -> dag_engine schedule -> data_factory.runs -> Matrix bot. The target page needs no auth and costs nothing to scrape. If this works, all the plumbing is solid.
Prerequisites
- Chrome launched with `--remote-debugging-port=9222` (via navegator_dashboard "Open visible browser").
- `claude` CLI on PATH (auto-extract requires an LLM).
- sqlite_api running on `:8484`.
- dag_engine running on `:8090`.
- (optional) Matrix bot in room `#fn-registry-news` for the final sink.
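
The prerequisites above can be checked before starting the flow. The following is a minimal sketch, not part of the stack: it assumes all three services listen on the local machine (the host `127.0.0.1` is an assumption, only the ports come from the list), and the helper names `port_open` and `check_prerequisites` are hypothetical.

```python
import shutil
import socket

# Ports come from the prerequisites list; the host is an assumption.
REQUIRED_PORTS = {"chrome_cdp": 9222, "sqlite_api": 8484, "dag_engine": 8090}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_prerequisites(host: str = "127.0.0.1") -> dict[str, bool]:
    """Map each prerequisite name to whether it looks available."""
    status = {name: port_open(host, port) for name, port in REQUIRED_PORTS.items()}
    # The claude CLI only needs to be resolvable on PATH.
    status["claude_cli"] = shutil.which("claude") is not None
    return status


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

An open port only shows that something is listening; it does not confirm that the right service is behind it.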
Flow
- Launch Chrome via navegator (port 9222).
- AutoExtract panel: URL `https://news.ycombinator.com`. Click "Open & Analyze".
- Wait ~10-20s. Verify the proposed schema: `rank`, `title`, `url`, `points`, `comments`, `age`.
- Refine the selectors if the AI proposes broken ones. Test extraction -> preview rows >= 20.
- Save as recipe `hn_top.yaml` (in `projects/navegator/profiles/default/recipes/`).
- Create DAG `~/.dagu/dags/hn-top.yaml` (manually, or copy from `apps/dag_engine/dags_migrated/`):
  ```yaml
  name: hn-top-stories
  description: Scrape HN top stories every 30 min
  schedule: "*/30 * * * *"
  steps:
    - name: extract
      function: cdp_extract_recipe_py_pipelines
      args: ["projects/navegator/profiles/default/recipes/hn_top.yaml"]
  ```
- Reload dag_engine + enable the scheduler. Trigger Run Now once to test.
- dag_engine_ui: verify the run has status=success + the correct function_id on the step.
- data_factory: the Extractors tab shows node `hn_top_stories` (created when the recipe was saved). The "All Runs" tab shows the new runs.
- (optional) Add a transformer step that filters `points > 100` -> sink Matrix bot.
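
The optional transformer in the last step could look roughly like the sketch below. It is an illustration only: the row shape (one dict per story with the six schema fields) is an assumption about what the recipe emits, and `filter_top_stories` and `format_message` are hypothetical names, not functions from the registry.

```python
from typing import Any

Row = dict[str, Any]


def filter_top_stories(rows: list[Row], min_points: int = 100) -> list[Row]:
    """Keep rows whose points are strictly greater than min_points."""
    kept = []
    for row in rows:
        try:
            points = int(row.get("points", 0))
        except (TypeError, ValueError):
            continue  # skip rows whose points field is missing or not numeric
        if points > min_points:
            kept.append(row)
    return kept


def format_message(row: Row) -> str:
    """Render one story as a single-line message for the sink."""
    return f"#{row['rank']} {row['title']} ({row['points']} points): {row['url']}"


if __name__ == "__main__":
    # Made-up sample rows, only to exercise the filter.
    sample = [
        {"rank": 1, "title": "A", "url": "https://example.com/a", "points": "250"},
        {"rank": 2, "title": "B", "url": "https://example.com/b", "points": "40"},
    ]
    for story in filter_top_stories(sample):
        print(format_message(story))
```

Parsing `points` defensively matters here because scraped values often arrive as strings, and job postings on HN carry no score at all.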
Acceptance
- Recipe created and validated (`validate_recipe_yaml_py_core` OK).
- DAG runs OK 2 consecutive times via the scheduler.
- `data_factory.runs` has >=2 entries with `node_id='hn_top_stories'`.
- `cdp_extract_recipe_py_pipelines` appears in `call_monitor.calls`.
- Extracted schema covers 6/6 fields (rank, title, url, points, comments, age).
- (optional) Matrix bot receives >=1 message with the filtered top story.
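
Two of the acceptance checks above (run count and schema coverage) lend themselves to a small script. The sketch below is hedged heavily: the real `data_factory.runs` schema is not shown in this document, so the table layout (`node_id`, `trigger`, `status` columns) is an assumption, and the demo uses an in-memory SQLite database rather than the actual store. All helper names are hypothetical.

```python
import sqlite3
from typing import Iterable, Optional

EXPECTED_FIELDS = {"rank", "title", "url", "points", "comments", "age"}


def schema_covers(fields: Iterable[str]) -> bool:
    """True if every expected field is present in the extracted schema."""
    return EXPECTED_FIELDS.issubset(set(fields))


def count_runs(conn: sqlite3.Connection, node_id: str,
               trigger: Optional[str] = None) -> int:
    """Count rows in the assumed `runs` table for one node, optionally by trigger."""
    sql = "SELECT COUNT(*) FROM runs WHERE node_id = ?"
    params = [node_id]
    if trigger is not None:
        sql += ' AND "trigger" = ?'  # quoted because TRIGGER is an SQL keyword
        params.append(trigger)
    return conn.execute(sql, params).fetchone()[0]


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for data_factory.runs (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE runs (node_id TEXT, "trigger" TEXT, status TEXT)')
    conn.executemany(
        "INSERT INTO runs VALUES (?, ?, ?)",
        [
            ("hn_top_stories", "manual", "success"),
            ("hn_top_stories", "cron", "success"),
            ("hn_top_stories", "cron", "success"),
        ],
    )
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    print("cron runs:", count_runs(db, "hn_top_stories", trigger="cron"))
    print("schema ok:", schema_covers(["rank", "title", "url", "points", "comments", "age"]))
```

Filtering by trigger separates the manual "Run Now" test from the two scheduler-driven runs that the acceptance criteria ask for.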
Expected telemetry
- `function_stats.cdp_extract_recipe_py_pipelines`: calls_24h += 2.
- `data_factory.runs`: 2 new rows with `trigger='cron'`.
- `dag_engine.dag_step_results`: step `extract` with `function_id='cdp_extract_recipe_py_pipelines'`.
- `call_monitor.calls`: chain function call.
Notes
(to be filled in after running)