feat: Python infra functions and Python types (core, datascience, infra)

Infra: cache_to_file, cache_to_sqlite, http_download_file, http_get_json,
http_post_json, read_file_with_encoding, safe_extract_zip, scan_directory,
setup_logger, normalize_zip_filenames.
Types: 30+ core types (agent_action, context, task, message, parse_result...),
6 datascience types (entity_candidate, extraction_result...), 2 infra types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 17:11:43 +02:00
parent 63a9cb5273
commit 9fd0ca9cac
110 changed files with 5714 additions and 0 deletions
@@ -0,0 +1,40 @@
---
name: http_download_file
kind: function
lang: py
domain: infra
version: "1.0.0"
purity: impure
signature: "http_download_file(url: str, dest_path: str, headers: dict[str, str] | None = None, timeout: float = 120.0, chunk_size: int = 8192) -> dict"
description: "Downloads a file over HTTP in streaming mode (without loading the whole file into memory). Creates intermediate directories if they do not exist. Returns a dict with path, size_bytes and content_type."
tags: [http, download, file, streaming, network, stdlib, infra]
uses_functions: []
uses_types: []
returns: []
returns_optional: false
error_type: "error_go_core"
imports: ["os", "urllib.error", "urllib.request"]
tested: true
tests:
- "mock download with binary content"
- "destination directory created automatically"
- "return value with correct size"
- "timeout set on the request"
test_file_path: "python/functions/infra/http_download_file_test.py"
file_path: "python/functions/infra/http_download_file.py"
---
## Example
```python
result = http_download_file(
"https://example.com/report.pdf",
dest_path="/tmp/reports/report.pdf",
timeout=60.0,
)
print(f"Downloaded {result['size_bytes']} bytes to {result['path']}")
```
## Notes
Uses only the stdlib (urllib, os). The response is read and written in chunks of `chunk_size` bytes, so large files are never held in memory in full. The default timeout of 120 s is higher than that of http_get_json because downloaded files can be large. Intermediate directories are created with os.makedirs(exist_ok=True).
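
The actual `http_download_file.py` is not shown in this diff. The following is only a minimal sketch of how the behaviour described in the notes could be written with the stdlib. The name `http_download_file_sketch` is hypothetical, and details such as error handling (the spec's `error_type`) are left out on purpose.

```python
import os
import urllib.request


def http_download_file_sketch(url, dest_path, headers=None, timeout=120.0, chunk_size=8192):
    """Hypothetical sketch; not the repository's implementation."""
    # Create intermediate directories only when dest_path has a parent component.
    parent = os.path.dirname(dest_path)
    if parent:
        os.makedirs(parent, exist_ok=True)

    request = urllib.request.Request(url, headers=headers or {})
    size_bytes = 0
    with urllib.request.urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type")
        with open(dest_path, "wb") as out:
            # Read at most chunk_size bytes at a time so the whole body
            # is never held in memory.
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
                size_bytes += len(chunk)

    return {"path": dest_path, "size_bytes": size_bytes, "content_type": content_type}
```

Under these assumptions, peak memory stays near `chunk_size` regardless of file size, and `size_bytes` is counted from the bytes actually written rather than taken from a `Content-Length` header, which a server may omit.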