chore: auto-commit (95 files)

- cmd/fn/doctor.go
- cmd/fn/main.go
- cpp/apps/primitives_gallery/playground/tables/CMakeLists.txt
- cpp/apps/primitives_gallery/playground/tables/data_table.cpp
- cpp/apps/primitives_gallery/playground/tables/data_table_logic.cpp
- cpp/apps/primitives_gallery/playground/tables/data_table_logic.h
- cpp/apps/primitives_gallery/playground/tables/self_test.cpp
- cpp/apps/primitives_gallery/playground/tables/tql.cpp
- cpp/apps/primitives_gallery/playground/tables/viz.cpp
- cpp/apps/primitives_gallery/playground/tables/viz.h
- ...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 00:50:34 +02:00
parent ef60449e64
commit a802f59f55
189 changed files with 18964 additions and 330 deletions
@@ -0,0 +1,61 @@
---
name: vault_csv_profile
kind: function
lang: py
domain: datascience
version: "1.0.0"
purity: impure
signature: "def vault_csv_profile(vault_path: str, rel_path: str, db_path: str | None = None) -> dict"
description: "Profiles a CSV in the vault: detects the encoding, reads the schema with polars, extracts n_rows and the date columns; persists to csv_profiles and updates files_fts for content search."
tags: [vault, csv, profiling, polars, encoding, datascience, fts]
uses_functions: []
uses_types: []
returns: []
returns_optional: false
error_type: "error_go_core"
imports: [sqlite3, time, pathlib, json, polars, chardet]
params:
- name: vault_path
  desc: "Absolute path to the vault root, where the CSV and vault_index.db live."
- name: rel_path
  desc: "Path to the CSV relative to the vault root (e.g. 'data/raw/ventas.csv')."
- name: db_path
  desc: "Optional override for the path to vault_index.db. Defaults to <vault_path>/vault_index.db."
output: "Dict with: rel_path (str), cols (list of {name, dtype}), n_rows (int), encoding (str), date_min/date_max (ISO yyyy-mm-dd or None), persisted (bool)."
tested: true
tests:
- "test_csv_basic"
- "test_csv_date_detection"
- "test_csv_encoding_latin1"
- "test_csv_empty"
- "test_csv_persists_fts"
test_file_path: "python/functions/datascience/tests/test_vault_csv_profile.py"
file_path: "python/functions/datascience/vault_csv_profile.py"
---
## Example
```python
from vault_csv_profile import vault_csv_profile
result = vault_csv_profile("/vaults/mi_vault", "data/raw/ventas.csv")
# {
# "rel_path": "data/raw/ventas.csv",
# "cols": [{"name": "fecha", "dtype": "String"}, {"name": "importe", "dtype": "Float64"}],
# "n_rows": 1500,
# "encoding": "utf-8",
# "date_min": "2023-01-01",
# "date_max": "2023-12-31",
# "persisted": True
# }
```
## Notes
- Uses polars (lazy scan) as the primary engine, with pandas as a fallback.
- Encoding detection: chardet with confidence >= 0.6, then fallback attempts in the order utf-8-sig → utf-8 → latin-1 → cp1252.
- Date detection: native polars Date/Datetime columns, or String columns in which ≥80% of the values parse as dates.
- The FTS text consists of the column names plus the first 5 rows, concatenated.
- Upsert into csv_profiles keyed by rel_path; the files_fts rowid is anchored to the rowid of the files table so that vault_search works correctly.
- If vault_index.db does not exist, the function returns the dict without attempting to persist (persisted=False).
- Dependencies: polars and chardet (both installed in python/.venv via uv add).