| id | title | status | type | domain | scope | priority | depends | blocks | related | created | updated | tags |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 53 | Extract the jobs system from graph_explorer into the registry (jobs_pool + cache + subprocess worker) | pending | feature | | registry-only | high | | | | 2026-05-09 | 2026-05-17 | |
Context
`projects/osint_graph/apps/graph_explorer/jobs.{cpp,h}` (1366 + 97 lines) implements:
- A pool of N `std::thread` workers reading a job queue stored in SQLite (table `jobs`).
- One subprocess spawned per job, with a wire protocol over stdin (JSON ctx) / stderr (`PROGRESS:<float> <stage>`) / stdout (JSON result) / exit code.
- An addressable cache at `<app_dir>/cache/<sha256[0:2]>/<sha256>.{html,md,...}`.
- Recovery: jobs left in `running` by a previous session are marked `error` in `jobs_init`.
- A `dirty_counter` that the UI reads to refresh after changes.
- `JobRow` persistence with created/started/finished/duration/progress/stage/error/result_json.
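
As an illustration of the stderr side of the wire protocol, the following is a minimal sketch of parsing one `PROGRESS:<float> <stage>` line. The names `ProgressUpdate` and `parse_progress_line` are hypothetical and not taken from `jobs.cpp`; the sketch also assumes that the stage is everything after the number and that it may be empty.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical type: one parsed "PROGRESS:<float> <stage>" line.
struct ProgressUpdate {
    float fraction;     // value exactly as reported by the worker
    std::string stage;  // free-text stage label, may be empty
};

// Returns the parsed update, or std::nullopt when the line does not start
// with "PROGRESS:" or the number after the prefix cannot be read.
std::optional<ProgressUpdate> parse_progress_line(const std::string& line) {
    static const std::string prefix = "PROGRESS:";
    if (line.compare(0, prefix.size(), prefix) != 0) return std::nullopt;

    std::istringstream in(line.substr(prefix.size()));
    ProgressUpdate update{};
    if (!(in >> update.fraction)) return std::nullopt;

    // Everything after the number, minus leading whitespace, is the stage.
    // A missing stage leaves the string empty.
    std::getline(in >> std::ws, update.stage);
    return update;
}
```

In this sketch, stderr lines that do not parse would presumably be kept as ordinary log output rather than treated as failures; that choice is an assumption, not something the existing code is known to do.
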
The online_data_recopilation project (issue 0066) needs the same system. Instead of copy-pasting, extract it into the registry so that both apps can import it.
Extraction plan
- Identify the boundary between generic logic (to extract) and graph_explorer-specific logic (stays local):
  - Generic: thread pool, SQLite queue, subprocess spawn, wire protocol parser, sha256 cache, recovery.
  - Specific: applying `entities`/`relations`/`node_updates` to the graph's `operations.db`.
- New registry functions:

  | ID | Domain | What it does |
  |---|---|---|
  | `jobs_pool_cpp_core` | core | Generic, parameterizable thread pool (workers, callback `on_job(JobRow)`). The name of the `jobs` table is configurable. |
  | `subprocess_worker_cpp_infra` | infra | Spawns a subprocess and handles stdin/stderr/stdout per the wire protocol (`PROGRESS:`, final JSON). Returns `WorkerResult{stdout_json, error, exit_code}`. |
  | `job_cache_sha256_cpp_infra` | infra | `cache_path(root, key) -> path`, `cache_put(root, key, bytes)`, `cache_get(root, key)`. Layout `<root>/<sha[0:2]>/<sha>`. |
  | `worker_manifest_loader_cpp_core` | core | Enumerates `<dir>/<id>/manifest.yaml`, validates the schema, returns `vector<WorkerManifest>`. |

- New types:
  - `JobRow_cpp_core`: struct with the common fields (id, worker_id, target_id, status, progress, stage, error, result_json, timestamps).
  - `WorkerManifest_cpp_core`: struct (id, name, description, applies_to, emits, params, uses_functions).
  - `WorkerResult_cpp_infra`: struct (stdout_json, stderr_log, exit_code, error).
- graph_explorer migration:
  - Replace `jobs.cpp/h` with imports from the registry.
  - The `on_job` callback stays in `entity_ops.cpp`, applying entities/relations.
  - Test: launch the `fetch_webpage` enricher and verify that it still works.
- Validation: `cd projects/osint_graph/apps/graph_explorer && cmake --build build` + existing tests.
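
The `<root>/<sha[0:2]>/<sha>` layout planned for `job_cache_sha256_cpp_infra` can be sketched as below. This is only an illustration: `cache_path_for_digest` is a hypothetical helper that takes an already computed hex digest, whereas the planned `cache_path(root, key)` would presumably hash `key` first. The SHA-256 computation itself is left out because the C++ standard library does not provide one.

```cpp
#include <filesystem>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: maps a precomputed SHA-256 hex digest to its cache
// location, using the first two hex characters as a fan-out directory.
fs::path cache_path_for_digest(const fs::path& root, const std::string& sha_hex) {
    if (sha_hex.size() != 64) {
        throw std::invalid_argument("expected a 64-character SHA-256 hex digest");
    }
    return root / sha_hex.substr(0, 2) / sha_hex;
}
```

The two-character prefix directory keeps any single directory from accumulating every cached entry; with hex digests it yields at most 256 subdirectories under the root.
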
Schema of the jobs table (generic)
CREATE TABLE IF NOT EXISTS jobs (
id TEXT PRIMARY KEY,
worker_id TEXT NOT NULL, -- formerly "enricher_id"
target_id TEXT NOT NULL, -- formerly "node_id" (for odr this is dataset_key, etc.)
target_label TEXT, -- formerly "node_name"
status TEXT NOT NULL, -- queued|running|done|error|cancelled
progress REAL DEFAULT 0,
stage TEXT,
error TEXT,
result_json TEXT,
params_json TEXT, -- params from the manifest, serialized
created_at INTEGER NOT NULL,
started_at INTEGER,
finished_at INTEGER
);
CREATE INDEX IF NOT EXISTS jobs_status_idx ON jobs(status);
CREATE INDEX IF NOT EXISTS jobs_worker_idx ON jobs(worker_id);
graph_explorer and odr_console share the schema. They differ only in how they interpret target_id/result_json (per-app callback).
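
To make the shared-schema, per-app-callback split concrete, here is a minimal sketch of a row type mirroring the table above and a callback signature that each app would implement. The field layout follows the schema, but `OnJob` and `dispatch_finished` are hypothetical names, and the assumption that the callback fires only for `done` jobs with a result is an illustration, not the behavior of the existing `jobs.cpp`.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <string>

// Mirrors the generic `jobs` table; nullable columns map to std::optional.
struct JobRow {
    std::string id;
    std::string worker_id;
    std::string target_id;
    std::optional<std::string> target_label;
    std::string status;  // queued|running|done|error|cancelled
    double progress = 0.0;
    std::optional<std::string> stage;
    std::optional<std::string> error;
    std::optional<std::string> result_json;
    std::optional<std::string> params_json;
    std::int64_t created_at = 0;
    std::optional<std::int64_t> started_at;
    std::optional<std::int64_t> finished_at;
};

// The generic side only knows this signature; each app supplies the body
// and decides what target_id and result_json mean.
using OnJob = std::function<void(const JobRow&)>;

// Hypothetical stand-in for the pool's hand-off step: invokes the app
// callback only for jobs that finished successfully and carry a result.
// Returns true when the callback was invoked.
bool dispatch_finished(const JobRow& row, const OnJob& on_job) {
    if (row.status != "done" || !row.result_json) return false;
    on_job(row);
    return true;
}
```

Under this sketch, graph_explorer would pass a callback that reads `target_id` as a node id and applies the result to the graph, while odr_console would pass one that reads it as a dataset key; the generic code never inspects either field.
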
Risks
- graph_explorer is an active C++ app with passing tests. Breaking its imports means breaking production.
- Safe path: branch (TBD) `issue/0065-extract-jobs-to-registry` in the graph_explorer sub-repo and in the fn_registry sub-repo. Merge both once the build is green.
- A feature flag does NOT apply (this is a code change with no possible runtime toggle).
Acceptance criteria
- Registry functions created with tests + .md.
- graph_explorer builds and passes the existing tests (32 WSL + 21 Win).
- The `fetch_webpage` enricher works end-to-end in graph_explorer after the refactor.
- odr_console (issue 0066) can import `jobs_pool_cpp_core` and launch 1 dummy collector.
- Documentation updated in `cpp/PATTERNS.md`, mentioning jobs_pool as a standard building block.
Out of scope
- Migrating the system to Go (a future issue, if it proves worthwhile).
- Changing the wire protocol (it is already stable; do not break existing enrichers).