feat(infra): claude-fleet group: FleetView TUI + Claude orchestration

FleetView system that centralizes the fleet of live Claude Code processes in a single kitty + tmux window (isolated socket -L fleet) with a TUI panel:

- list_claude_fleet (+ claude_fleet type): scans ~/.claude/sessions + goals + runtime, validates live processes (guards against recycled PIDs), joins by sessionId.
- list_resumable_claudes (+ resumable_claude type): closed sessions that can be resumed.
- tmux wrappers: tmux_new_claude_window (with --resume), tmux_swap_window_into_console (preserves the sidebar width), tmux_map_claude_panes.
- launch_kittyclaude: entrypoint command; installs the alt+arrows/enter/n/0/k/r shortcuts, mouse on, remain-on-exit off; pins the sidebar width with hooks.
- docs/capabilities/claude-fleet.md + an entry in the INDEX.

Also includes in-progress data-science functions (excel/duckdb/postgres) and assorted docs and infra tweaks from another session, grouped here so they are not lost.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 00:04:41 +02:00
parent 7d395f39e5
commit 927437a8d8
58 changed files with 5961 additions and 2 deletions
@@ -0,0 +1,121 @@
+---
+name: duckdb_to_postgres
+kind: pipeline
+lang: py
+domain: pipelines
+version: "1.0.0"
+purity: impure
+signature: "def duckdb_to_postgres(duckdb_path: str, table: str, pg_dsn: str, pg_table: str = None, mode: str = 'replace', key_cols: list = None, batch_size: int = 5000) -> dict"
+description: "Pipeline that syncs a DuckDB table to PostgreSQL. It lets BI tools (Metabase, Grafana, Superset) read data that lives in DuckDB: they do NOT speak DuckDB natively, but they all speak PostgreSQL. Steps: (a) read the schema with duckdb_table_schema; (b) map DuckDB types to PostgreSQL (BIGINT/INTEGER->BIGINT, DOUBLE/FLOAT->DOUBLE PRECISION, VARCHAR/TEXT->TEXT, BOOLEAN->BOOLEAN, DATE->DATE, TIMESTAMP->TIMESTAMP, everything else->TEXT) and generate CREATE TABLE IF NOT EXISTS with a PRIMARY KEY when key_cols is given (preceded by DROP TABLE IF EXISTS when mode='replace'), applied with pg_apply_sql; (c) read the rows with duckdb_query_readonly, paging with LIMIT/OFFSET, and insert them into PostgreSQL with pg_insert_rows (add_snapshot_date=False) in batches of batch_size, or with pg_upsert when key_cols is set and mode!='replace'. pg_upsert is imported behind an import check: without it the upsert path is unavailable, but replace/append still work. Composes registry functions without reimplementing their logic. Returns a dict without raising: {status:'ok', pg_table, rows_synced, created} on success and {status:'error', error} on failure. Depends on duckdb (1.5.2) and psycopg2."
+tags: [duckdb, postgres, etl, sync, pipeline]
+uses_functions:
+  - duckdb_table_schema_py_infra
+  - duckdb_query_readonly_py_infra
+  - pg_apply_sql_py_infra
+  - pg_insert_rows_py_infra
+  - pg_upsert_py_infra
+uses_types: []
+returns: []
+returns_optional: false
+error_type: "error_py_core"
+imports: [os, re, sys, tempfile, duckdb, psycopg2]
+params:
+  - name: duckdb_path
+    desc: "path to the source DuckDB file (opened read_only; it must exist)."
+  - name: table
+    desc: "name of the DuckDB table to sync. Validated as an identifier against ^[A-Za-z_][A-Za-z0-9_]*$."
+  - name: pg_dsn
+    desc: "PostgreSQL connection string, e.g. 'postgresql://user:pass@host:5432/db'."
+  - name: pg_table
+    desc: "name of the destination table in PostgreSQL. None (default) reuses the name in `table`. Validated as an identifier."
+  - name: mode
+    desc: "'replace' (default) runs DROP TABLE IF EXISTS + CREATE + INSERT of every row (full snapshot). 'append'/'upsert' create the table if it does not exist and then: with key_cols they use pg_upsert (idempotent), without key_cols they do an append-only INSERT. Any other value returns {status:'error'}."
+  - name: key_cols
+    desc: "list of PRIMARY KEY columns. They are added to the CREATE as the PRIMARY KEY and, when mode != 'replace', enable the idempotent upsert. None/[] (default) = no PK, INSERT only. They must exist in the DuckDB schema."
+  - name: batch_size
+    desc: "number of rows per insert/upsert batch (default 5000). Must be a positive integer."
+output: "dict. On success: {status:'ok', pg_table:str, rows_synced:int, created:bool}, where rows_synced is the total number of rows written and created reports that the schema DDL (CREATE, plus DROP in replace mode) was applied. On error (without raising): {status:'error', error:str}."
+tested: true
+tests:
+  - "test_map_tipos_duckdb_a_postgres"
+  - "test_build_ddl_con_pk_y_drop"
+  - "test_build_ddl_sin_pk_ni_drop"
+  - "test_identificador_tabla_invalido"
+  - "test_mode_invalido"
+  - "test_replace_sincroniza_filas"
+  - "test_upsert_idempotente_con_key_cols"
+test_file_path: "python/functions/pipelines/duckdb_to_postgres_test.py"
+file_path: "python/functions/pipelines/duckdb_to_postgres.py"
+---
+
+## Example
+
+```python
+import sys
+sys.path.insert(0, "python/functions")
+from pipelines.duckdb_to_postgres import duckdb_to_postgres
+
+# Full snapshot: replaces the destination table in PostgreSQL with every row
+# of the DuckDB table. Metabase/Grafana can then read it.
+res = duckdb_to_postgres(
+    "/tmp/almacen.duckdb",
+    "ventas",
+    "postgresql://captacion:****@127.0.0.1:5433/trends",
+    pg_table="ventas_diario",
+    mode="replace",
+)
+print(res)
+# {'status': 'ok', 'pg_table': 'ventas_diario', 'rows_synced': 1280, 'created': True}
+
+# Idempotent sync by key: re-running it does not duplicate rows.
+res2 = duckdb_to_postgres(
+    "/tmp/almacen.duckdb",
+    "clientes",
+    "postgresql://captacion:****@127.0.0.1:5433/trends",
+    mode="upsert",
+    key_cols=["id"],
+)
+print(res2)  # {'status': 'ok', 'pg_table': 'clientes', 'rows_synced': 540, 'created': True}
+```
+
+## When to use it
+
+When your data lives in a DuckDB file and a BI tool needs to read it:
+Metabase, Grafana and Superset do NOT speak DuckDB natively, but they all speak
+PostgreSQL. It is the last link in the `Excel -> DuckDB -> PostgreSQL` flow
+(preceded by `excel_to_duckdb_py_infra`). Use `mode='replace'` for scheduled full
+refreshes (a daily snapshot that recreates the table) and
+`mode='upsert' + key_cols` for idempotent incremental syncs that do not
+duplicate rows when re-run.
+
+## Gotchas
+
+- **DuckDB is single-writer**: the pipeline opens the database read_only to read, but
+  if another process holds it locked for writing with a different engine version, the
+  open can fail; the error is returned in the dict, not raised.
+- **read_only mode requires the DuckDB file to exist**: the pipeline does not create it.
+  A missing `duckdb_path` returns `{status:'error', ...}` as early as step (a).
+- **Type mapping can lose information**: the DuckDB->PostgreSQL mapping is conservative.
+  Types it does not cover (DECIMAL with scale, 128-bit HUGEINT, unsigned 64-bit UBIGINT,
+  LIST/STRUCT/MAP) fall back to TEXT. If strong typing matters downstream (numeric
+  aggregations in Metabase), inspect the schema with `duckdb_table_schema_py_infra` and
+  adjust the types in DuckDB before syncing.
+- **`mode='replace'` is destructive**: it runs `DROP TABLE IF EXISTS` on the destination
+  PostgreSQL table before recreating it. Any manually added data or index on that table
+  is lost. For syncs that must preserve the existing table use
+  `mode='append'`/`'upsert'` (CREATE TABLE IF NOT EXISTS, no DROP).
+- **`pg_upsert` is optional**: it is imported behind an import check. If `pg_upsert_py_infra`
+  is not in the environment, `mode != 'replace'` with `key_cols` returns
+  `{status:'error', ...}` explaining what is missing; the replace/append path (no upsert)
+  keeps working.
+- **Upsert requires a PRIMARY KEY or UNIQUE constraint** on the `key_cols` in PostgreSQL so
+  that `ON CONFLICT` works. The pipeline creates that PRIMARY KEY in the CREATE when you pass
+  `key_cols`; if the table already existed without the constraint (`mode!='replace'` and a
+  pre-existing table), the upsert will fail. Recreate it once with `mode='replace' + key_cols`.
+- **The snapshot is not transactional across read and write**: the paged read from
+  DuckDB and the write to PostgreSQL do not share a transaction. If the DuckDB table
+  changes mid-sync (another writer), the result in PostgreSQL can mix
+  states. Sync from a stable DuckDB database (not while it is being ingested into).
+- **`pg_insert_rows` and `pg_apply_sql` raise** RuntimeError internally; the pipeline
+  wraps them in try/except and converts the failure to `{status:'error', ...}`. It never
+  propagates the exception to the caller.
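The type-mapping gotcha in this capability doc can be checked before a sync. The following is a minimal sketch, assuming the schema is available as a list of `{name, type}` dicts (the shape that `_build_ddl` consumes in this commit). The names `TYPED_BASES`, `STRING_BASES` and `columns_falling_back_to_text` are hypothetical and not part of the registry; the type sets are copied from the mapping in `_map_duckdb_type_to_pg`.

```python
# Hypothetical pre-flight check: list the columns that the pipeline's mapping
# would send to TEXT even though they are not string types.

# Base types that map to a typed PostgreSQL column in this commit's mapping.
TYPED_BASES = {
    "BIGINT", "INT8", "LONG", "INTEGER", "INT", "INT4", "SMALLINT", "INT2",
    "TINYINT", "INT1", "DOUBLE", "FLOAT8", "FLOAT", "FLOAT4", "REAL",
    "BOOLEAN", "BOOL", "LOGICAL", "DATE", "TIMESTAMP", "DATETIME",
    "TIMESTAMP_S", "TIMESTAMP_MS", "TIMESTAMP_NS",
}
# String types for which TEXT is the natural target (no typing is lost).
STRING_BASES = {"VARCHAR", "TEXT", "STRING", "CHAR", "BPCHAR"}


def columns_falling_back_to_text(columns):
    """Return the names of columns whose type is neither typed nor a string."""
    flagged = []
    for col in columns:
        # Same normalization as the pipeline: upper-case, drop "(...)" parameters.
        base = (col["type"] or "").strip().upper().split("(")[0].strip()
        if base not in TYPED_BASES and base not in STRING_BASES:
            flagged.append(col["name"])
    return flagged


schema = [
    {"name": "id", "type": "BIGINT"},
    {"name": "price", "type": "DECIMAL(10,2)"},
    {"name": "name", "type": "VARCHAR(50)"},
    {"name": "tags", "type": "VARCHAR[]"},
    {"name": "big", "type": "HUGEINT"},
]
print(columns_falling_back_to_text(schema))  # ['price', 'tags', 'big']
```

A non-empty result is a hint to cast those columns in DuckDB (for example to DOUBLE or BIGINT) before syncing, if downstream tools need numeric types.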
@@ -0,0 +1,311 @@
+"""Pipeline: syncs a DuckDB table to a PostgreSQL table.
+
+This is what lets BI tools (Metabase, Grafana, Superset) read data that lives
+in a DuckDB file: those tools do NOT speak DuckDB natively, but they all speak
+PostgreSQL. The pipeline reads the schema and rows of the DuckDB table, creates
+(or recreates) the equivalent table in PostgreSQL using a DuckDB -> PostgreSQL
+type mapping, and loads the rows in batches.
+
+Impure function of kind pipeline: it composes registry functions and does NOT
+reimplement their logic.
+  - duckdb_table_schema  -> reads the columns and types of the DuckDB table.
+  - duckdb_query_readonly -> reads the rows (paged with LIMIT/OFFSET).
+  - pg_apply_sql          -> applies the DDL (CREATE/DROP) written to a temporary .sql file.
+  - pg_insert_rows        -> inserts batches (replace path / append without a key).
+  - pg_upsert (optional)  -> idempotent upsert when key_cols is set and mode!='replace'.
+    pg_upsert is imported behind a check: if it is not in the registry yet, the
+    pipeline still works for the replace/insert path.
+
+Returns a dict without raising, in the group's style: {status:'ok', ...} on success
+and {status:'error', error:str} on failure.
+"""
+
+import os
+import re
+import sys
+import tempfile
+
+# Registry functions are imported, not reimplemented. sys.path points at the
+# registry functions directory (the same pattern the Python apps use).
+_FUNCTIONS_DIR = os.path.join(
+    os.path.dirname(__file__), "..", ".."
+)  # python/
+_FUNCTIONS_DIR = os.path.abspath(os.path.join(_FUNCTIONS_DIR, "functions"))
+if _FUNCTIONS_DIR not in sys.path:
+    sys.path.insert(0, _FUNCTIONS_DIR)
+
+from infra.duckdb_query_readonly import duckdb_query_readonly  # noqa: E402
+from infra.duckdb_table_schema import duckdb_table_schema  # noqa: E402
+from infra.pg_apply_sql import pg_apply_sql  # noqa: E402
+from infra.pg_insert_rows import pg_insert_rows  # noqa: E402
+
+# pg_upsert may not exist yet (another agent is building it in parallel). It is
+# loaded behind a check; without it the upsert path is unavailable, but the
+# rest of the pipeline works.
+try:
+    from infra.pg_upsert import pg_upsert  # noqa: E402
+
+    _HAS_UPSERT = True
+except Exception:  # noqa: BLE001 - any import failure leaves the upsert path off
+    pg_upsert = None
+    _HAS_UPSERT = False
+
+_VALID_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
+
+
+def _map_duckdb_type_to_pg(duck_type: str) -> str:
+    """Map a DuckDB type to its PostgreSQL equivalent.
+
+    The mapping is conservative: known numeric/temporal/boolean types map to
+    their natural PG equivalent; any other type (including composites such as
+    LIST/STRUCT/MAP, or DECIMAL with scale) falls back to TEXT, which always
+    accepts the serialized value. Strong typing may be lost in those cases.
+    """
+    t = (duck_type or "").strip().upper()
+    # Normalize parameterized types: DECIMAL(10,2) -> DECIMAL, VARCHAR(50) -> VARCHAR.
+    base = t.split("(")[0].strip()
+
+    mapping = {
+        "BIGINT": "BIGINT",
+        "INT8": "BIGINT",
+        "LONG": "BIGINT",
+        "INTEGER": "BIGINT",
+        "INT": "BIGINT",
+        "INT4": "BIGINT",
+        "SMALLINT": "BIGINT",
+        "INT2": "BIGINT",
+        "TINYINT": "BIGINT",
+        "INT1": "BIGINT",
+        "HUGEINT": "TEXT",  # 128-bit: does not fit in BIGINT, serialize as text.
+        "UBIGINT": "TEXT",
+        "DOUBLE": "DOUBLE PRECISION",
+        "FLOAT8": "DOUBLE PRECISION",
+        "FLOAT": "DOUBLE PRECISION",
+        "FLOAT4": "DOUBLE PRECISION",
+        "REAL": "DOUBLE PRECISION",
+        "VARCHAR": "TEXT",
+        "TEXT": "TEXT",
+        "STRING": "TEXT",
+        "CHAR": "TEXT",
+        "BPCHAR": "TEXT",
+        "BOOLEAN": "BOOLEAN",
+        "BOOL": "BOOLEAN",
+        "LOGICAL": "BOOLEAN",
+        "DATE": "DATE",
+        "TIMESTAMP": "TIMESTAMP",
+        "DATETIME": "TIMESTAMP",
+        "TIMESTAMP_S": "TIMESTAMP",
+        "TIMESTAMP_MS": "TIMESTAMP",
+        "TIMESTAMP_NS": "TIMESTAMP",
+    }
+    return mapping.get(base, "TEXT")
+
+
+def _build_ddl(
+    pg_table: str,
+    columns: list,
+    key_cols: list,
+    drop_first: bool,
+) -> str:
+    """Build the CREATE DDL (and optional DROP) for the destination table in PG.
+
+    columns: list of {name, type} (DuckDB type). key_cols: PK columns
+    (may be None/[]). drop_first: if True, prepend DROP TABLE IF EXISTS.
+    """
+    col_defs = []
+    for col in columns:
+        name = col["name"].replace('"', '""')  # escape embedded double quotes
+        col_defs.append(f'    "{name}" {_map_duckdb_type_to_pg(col["type"])}')
+
+    pk_clause = ""
+    if key_cols:
+        pk_cols = ", ".join(f'"{c}"' for c in key_cols)
+        pk_clause = f",\n    PRIMARY KEY ({pk_cols})"
+
+    parts = []
+    if drop_first:
+        parts.append(f'DROP TABLE IF EXISTS "{pg_table}";')
+    parts.append(
+        f'CREATE TABLE IF NOT EXISTS "{pg_table}" (\n'
+        + ",\n".join(col_defs)
+        + pk_clause
+        + "\n);"
+    )
+    return "\n".join(parts)
+
+
+def duckdb_to_postgres(
+    duckdb_path: str,
+    table: str,
+    pg_dsn: str,
+    pg_table: str = None,
+    mode: str = "replace",
+    key_cols: list = None,
+    batch_size: int = 5000,
+) -> dict:
+    """Sync a DuckDB table to PostgreSQL (bridge for BI: Metabase/Grafana).
+
+    Args:
+        duckdb_path: path to the source DuckDB file (opened read_only).
+        table: name of the DuckDB table to sync. Validated as an identifier.
+        pg_dsn: PostgreSQL connection string, e.g.
+            "postgresql://user:pass@host:5432/db".
+        pg_table: name of the destination table in PostgreSQL. None (default)
+            reuses the name in `table`. Validated as an identifier.
+        mode: 'replace' (default) runs DROP TABLE IF EXISTS + CREATE + INSERT of
+            every row (full snapshot). 'append'/'upsert' create the table if it
+            does not exist (CREATE TABLE IF NOT EXISTS) and then: if key_cols is set
+            they use pg_upsert (idempotent); otherwise they do an append-only INSERT
+            with pg_insert_rows. Any other value returns {status:'error', ...}.
+        key_cols: list of PRIMARY KEY columns. They are added to the CREATE as the
+            PRIMARY KEY and, when mode != 'replace', enable the idempotent upsert.
+            None/[] (default) = no PK, INSERT only.
+        batch_size: number of rows per insert/upsert batch (default 5000).
+
+    Returns:
+        dict. On success: {status:'ok', pg_table:str, rows_synced:int, created:bool}
+        where rows_synced is the total number of rows written and created reports
+        that the schema DDL (CREATE/DROP) was applied. On error (without raising):
+        {status:'error', error:str}.
+    """
+    # --- Input validation ---
+    if not isinstance(table, str) or not _VALID_IDENT.match(table):
+        return {"status": "error", "error": f"invalid table identifier: {table!r}"}
+
+    target = pg_table if pg_table is not None else table
+    if not isinstance(target, str) or not _VALID_IDENT.match(target):
+        return {"status": "error", "error": f"invalid pg_table identifier: {target!r}"}
+
+    if mode not in ("replace", "append", "upsert"):
+        return {
+            "status": "error",
+            "error": f"invalid mode: {mode!r} (expected 'replace'|'append'|'upsert')",
+        }
+
+    keys = list(key_cols) if key_cols else []
+    for k in keys:
+        if not isinstance(k, str) or not _VALID_IDENT.match(k):
+            return {"status": "error", "error": f"invalid key_col identifier: {k!r}"}
+
+    if not isinstance(batch_size, int) or batch_size <= 0:
+        return {"status": "error", "error": f"invalid batch_size: {batch_size!r}"}
+
+    use_upsert = bool(keys) and mode != "replace"
+    if use_upsert and not _HAS_UPSERT:
+        return {
+            "status": "error",
+            "error": (
+                "key_cols + mode!='replace' requires pg_upsert_py_infra, which is "
+                "not available in this environment"
+            ),
+        }
+
+    # --- (a) Schema of the DuckDB table ---
+    schema = duckdb_table_schema(duckdb_path, table)
+    if schema.get("status") != "ok":
+        return {
+            "status": "error",
+            "error": f"could not read the schema of {table!r}: {schema.get('error')}",
+        }
+    columns = schema["columns"]
+    if not columns:
+        return {"status": "error", "error": f"table {table!r} has no columns"}
+
+    col_names = [c["name"] for c in columns]
+    # Check that every key_col exists in the schema.
+    for k in keys:
+        if k not in col_names:
+            return {
+                "status": "error",
+                "error": f"key_col {k!r} is not among the columns of {table!r}",
+            }
+
+    # --- (b) DDL: create/recreate the table in PostgreSQL via pg_apply_sql ---
+    drop_first = mode == "replace"
+    ddl = _build_ddl(target, columns, keys, drop_first)
+    tmp_sql_path = None
+    try:
+        fd, tmp_sql_path = tempfile.mkstemp(suffix=".sql", prefix="duckdb_to_pg_")
+        with os.fdopen(fd, "w", encoding="utf-8") as fh:
+            fh.write(ddl)
+        pg_apply_sql(pg_dsn, tmp_sql_path)  # raises RuntimeError on failure
+        created = True
+    except Exception as e:  # noqa: BLE001 - turn pg_apply_sql's raise into a dict
+        return {"status": "error", "error": f"DDL failed: {e}"}
+    finally:
+        if tmp_sql_path is not None and os.path.exists(tmp_sql_path):
+            try:
+                os.remove(tmp_sql_path)
+            except OSError:
+                pass
+
+    # --- (c) Read rows from DuckDB and load them into PostgreSQL in batches ---
+    quoted = '"' + table.replace('"', '""') + '"'
+    offset = 0
+    rows_synced = 0
+    try:
+        while True:
+            page = duckdb_query_readonly(
+                duckdb_path,
+                f"SELECT * FROM {quoted} LIMIT ? OFFSET ?",
+                params=[batch_size, offset],
+                max_rows=batch_size,
+            )
+            if page.get("status") != "ok":
+                return {
+                    "status": "error",
+                    "error": f"reading rows failed at offset {offset}: "
+                    f"{page.get('error')}",
+                }
+            batch = page["rows"]
+            if not batch:
+                break
+
+            if use_upsert:
+                res = pg_upsert(pg_dsn, target, batch, keys)
+                if res.get("status") != "ok":
+                    return {
+                        "status": "error",
+                        "error": f"pg_upsert failed at offset {offset}: "
+                        f"{res.get('error')}",
+                    }
+                rows_synced += res.get("inserted", 0) + res.get("updated", 0)
+            else:
+                # pg_insert_rows raises RuntimeError on failure; add_snapshot_date=False
+                # avoids injecting columns that the DuckDB schema does not have.
+                inserted = pg_insert_rows(
+                    pg_dsn, target, batch, add_snapshot_date=False
+                )
+                rows_synced += inserted
+
+            offset += len(batch)
+            if len(batch) < batch_size:
+                break
+    except Exception as e:  # noqa: BLE001 - turn pg_insert_rows raises into a dict
+        return {"status": "error", "error": f"insert failed: {e}"}
+
+    return {
+        "status": "ok",
+        "pg_table": target,
+        "rows_synced": rows_synced,
+        "created": created,
+    }
+
+
+if __name__ == "__main__":
+    # Direct execution with `fn run`: minimal demo against a temporary DuckDB database
+    # and a PostgreSQL instance pointed to by PG_TEST_DSN (if available).
+    import json
+
+    dsn = os.environ.get("PG_TEST_DSN")
+    if not dsn:
+        print(json.dumps({"status": "skipped", "reason": "PG_TEST_DSN is not set"}))
+        sys.exit(0)
+    demo_db = os.environ.get("DUCKDB_DEMO_PATH", "/tmp/duckdb_to_pg_demo.duckdb")
+    import duckdb  # noqa: E402
+
+    con = duckdb.connect(demo_db)
+    con.execute("CREATE OR REPLACE TABLE demo (id BIGINT, nombre VARCHAR, total DOUBLE)")
+    con.execute("INSERT INTO demo VALUES (1, 'ana', 10.5), (2, 'luis', 20.0)")
+    con.close()
+    print(json.dumps(duckdb_to_postgres(demo_db, "demo", dsn, mode="replace")))
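The batching loop in step (c) of the module above stops on either an empty page or a page shorter than `batch_size`. A minimal sketch of that termination logic, isolated from DuckDB and PostgreSQL: `iter_batches` and `fetch_page` are hypothetical names, and the in-memory list stands in for the `LIMIT ? OFFSET ?` query. Note that the pipeline's query has no ORDER BY, so paging relies on the engine returning rows in a stable order across calls.

```python
# Hypothetical stand-alone version of the pipeline's paging loop.

def iter_batches(fetch_page, batch_size):
    """Yield pages from fetch_page(limit, offset) until the source is exhausted."""
    offset = 0
    while True:
        page = fetch_page(batch_size, offset)
        if not page:
            # Empty page: the previous page ended exactly on a batch boundary.
            break
        yield page
        offset += len(page)
        if len(page) < batch_size:
            # Short page: nothing is left, so the extra round trip is skipped.
            break


# In-memory stand-in for "SELECT * FROM t LIMIT ? OFFSET ?".
rows = [{"id": i} for i in range(12)]


def fetch_page(limit, offset):
    return rows[offset:offset + limit]


sizes = [len(batch) for batch in iter_batches(fetch_page, 5)]
print(sizes)  # [5, 5, 2]
```

When the row count is an exact multiple of `batch_size`, the loop issues one extra read that returns an empty page; that is the case the first `break` handles.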
@@ -0,0 +1,145 @@
+"""Tests for the duckdb_to_postgres pipeline.
+
+Tests that touch PostgreSQL are skipped gracefully when PG_TEST_DSN is not set. The
+type mapping and DDL construction are tested without Postgres (pure internal logic).
+"""
+
+import os
+import sys
+
+import pytest
+
+sys.path.insert(0, os.path.dirname(__file__))
+
+import duckdb  # noqa: E402
+
+from duckdb_to_postgres import (  # noqa: E402
+    _build_ddl,
+    _map_duckdb_type_to_pg,
+    duckdb_to_postgres,
+)
+
+PG_DSN = os.environ.get("PG_TEST_DSN")
+
+
+# --- Tests without Postgres: type mapping and DDL ---
+
+
+def test_map_tipos_duckdb_a_postgres():
+    assert _map_duckdb_type_to_pg("BIGINT") == "BIGINT"
+    assert _map_duckdb_type_to_pg("INTEGER") == "BIGINT"
+    assert _map_duckdb_type_to_pg("DOUBLE") == "DOUBLE PRECISION"
+    assert _map_duckdb_type_to_pg("FLOAT") == "DOUBLE PRECISION"
+    assert _map_duckdb_type_to_pg("VARCHAR") == "TEXT"
+    assert _map_duckdb_type_to_pg("TEXT") == "TEXT"
+    assert _map_duckdb_type_to_pg("BOOLEAN") == "BOOLEAN"
+    assert _map_duckdb_type_to_pg("DATE") == "DATE"
+    assert _map_duckdb_type_to_pg("TIMESTAMP") == "TIMESTAMP"
+    # Parameterized types normalize to the base type.
+    assert _map_duckdb_type_to_pg("DECIMAL(10,2)") == "TEXT"
+    assert _map_duckdb_type_to_pg("VARCHAR(50)") == "TEXT"
+    # Unknown -> TEXT (typing may be lost).
+    assert _map_duckdb_type_to_pg("STRUCT(a INT)") == "TEXT"
+
+
+def test_build_ddl_con_pk_y_drop():
+    cols = [
+        {"name": "id", "type": "BIGINT"},
+        {"name": "nombre", "type": "VARCHAR"},
+    ]
+    ddl = _build_ddl("destino", cols, ["id"], drop_first=True)
+    assert "DROP TABLE IF EXISTS \"destino\";" in ddl
+    assert 'CREATE TABLE IF NOT EXISTS "destino"' in ddl
+    assert '"id" BIGINT' in ddl
+    assert '"nombre" TEXT' in ddl
+    assert 'PRIMARY KEY ("id")' in ddl
+
+
+def test_build_ddl_sin_pk_ni_drop():
+    cols = [{"name": "x", "type": "DOUBLE"}]
+    ddl = _build_ddl("t", cols, [], drop_first=False)
+    assert "DROP TABLE" not in ddl
+    assert '"x" DOUBLE PRECISION' in ddl
+    assert "PRIMARY KEY" not in ddl
+
+
+# --- Input validation (without Postgres) ---
+
+
+def test_identificador_tabla_invalido(tmp_path):
+    res = duckdb_to_postgres(str(tmp_path / "x.duckdb"), "t; DROP", "dsn")
+    assert res["status"] == "error"
+    assert "invalid table identifier" in res["error"]
+
+
+def test_mode_invalido(tmp_path):
+    db = tmp_path / "x.duckdb"
+    con = duckdb.connect(str(db))
+    con.execute("CREATE TABLE t (id BIGINT)")
+    con.close()
+    res = duckdb_to_postgres(str(db), "t", "dsn", mode="merge")
+    assert res["status"] == "error"
+    assert "invalid mode" in res["error"]
+
+
+# --- End-to-end tests with Postgres ---
+
+
+@pytest.mark.skipif(not PG_DSN, reason="PG_TEST_DSN is not set")
+def test_replace_sincroniza_filas(tmp_path):
+    db = tmp_path / "src.duckdb"
+    con = duckdb.connect(str(db))
+    con.execute("CREATE TABLE ventas (id BIGINT, region VARCHAR, total DOUBLE)")
+    con.execute(
+        "INSERT INTO ventas VALUES (1,'norte',10.5),(2,'sur',20.0),(3,'norte',5.25)"
+    )
+    con.close()
+    pgt = "test_duckdb_to_pg_ventas"
+    res = duckdb_to_postgres(str(db), "ventas", PG_DSN, pg_table=pgt, mode="replace")
+    assert res["status"] == "ok", res
+    assert res["pg_table"] == pgt
+    assert res["rows_synced"] == 3
+    assert res["created"] is True
+
+    import psycopg2
+
+    conn = psycopg2.connect(PG_DSN)
+    try:
+        with conn.cursor() as cur:
+            cur.execute(f'SELECT COUNT(*) FROM "{pgt}"')
+            assert cur.fetchone()[0] == 3
+            cur.execute(f'DROP TABLE IF EXISTS "{pgt}"')
+        conn.commit()
+    finally:
+        conn.close()
+
+
+@pytest.mark.skipif(not PG_DSN, reason="PG_TEST_DSN is not set")
+def test_upsert_idempotente_con_key_cols(tmp_path):
+    db = tmp_path / "src.duckdb"
+    con = duckdb.connect(str(db))
+    con.execute("CREATE TABLE u (id BIGINT, v VARCHAR)")
+    con.execute("INSERT INTO u VALUES (1,'a'),(2,'b')")
+    con.close()
+    pgt = "test_duckdb_to_pg_upsert"
+    r1 = duckdb_to_postgres(
+        str(db), "u", PG_DSN, pg_table=pgt, mode="replace", key_cols=["id"]
+    )
+    assert r1["status"] == "ok", r1
+    # Re-sync in upsert mode: must not duplicate rows (idempotent).
+    r2 = duckdb_to_postgres(
+        str(db), "u", PG_DSN, pg_table=pgt, mode="upsert", key_cols=["id"]
+    )
+    assert r2["status"] == "ok", r2
+
+    import psycopg2
+
+    conn = psycopg2.connect(PG_DSN)
+    try:
+        with conn.cursor() as cur:
+            cur.execute(f'SELECT COUNT(*) FROM "{pgt}"')
+            assert cur.fetchone()[0] == 2
+            cur.execute(f'DROP TABLE IF EXISTS "{pgt}"')
+        conn.commit()
+    finally:
+        conn.close()
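The upsert test above only runs when PG_TEST_DSN is set. The property it checks (re-running a keyed sync does not duplicate rows) can be illustrated without PostgreSQL. This is a minimal in-memory sketch, not the registry's `pg_upsert`: `upsert_rows` is a hypothetical stand-in, and its result keys `inserted` and `updated` are assumed only because the pipeline reads those two keys from the `pg_upsert` result.

```python
# Hypothetical in-memory model of a keyed upsert, to show why it is idempotent.

def upsert_rows(store, rows, key_cols):
    """Insert or overwrite rows in `store`, keyed by the values of key_cols."""
    inserted = 0
    updated = 0
    for row in rows:
        key = tuple(row[k] for k in key_cols)
        if key in store:
            updated += 1  # same key already present: overwrite, do not append
        else:
            inserted += 1
        store[key] = dict(row)
    return {"status": "ok", "inserted": inserted, "updated": updated}


store = {}
rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]

first = upsert_rows(store, rows, ["id"])
second = upsert_rows(store, rows, ["id"])  # re-run with the same input

print(first)       # {'status': 'ok', 'inserted': 2, 'updated': 0}
print(second)      # {'status': 'ok', 'inserted': 0, 'updated': 2}
print(len(store))  # 2
```

The second run touches both rows but leaves the row count at 2, which is the same invariant `test_upsert_idempotente_con_key_cols` asserts with `SELECT COUNT(*)`.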