feat: Python infra functions and Python types (core, datascience, infra)

Infra: cache_to_file, cache_to_sqlite, http_download_file, http_get_json, http_post_json, read_file_with_encoding, safe_extract_zip, scan_directory, setup_logger, normalize_zip_filenames.
Types: 30+ core types (agent_action, context, task, message, parse_result...), 6 datascience types (entity_candidate, extraction_result...), 2 infra types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,40 @@
---
name: http_download_file
kind: function
lang: py
domain: infra
version: "1.0.0"
purity: impure
signature: "http_download_file(url: str, dest_path: str, headers: dict[str, str] | None = None, timeout: float = 120.0, chunk_size: int = 8192) -> dict"
description: "Downloads a file over HTTP in streaming mode (without loading the whole file into memory). Creates intermediate directories if they do not exist. Returns a dict with path, size_bytes, and content_type."
tags: [http, download, file, streaming, network, stdlib, infra]
uses_functions: []
uses_types: []
returns: []
returns_optional: false
error_type: "error_go_core"
imports: ["os", "urllib.error", "urllib.request"]
tested: true
tests:
  - "mocked download with binary content"
  - "destination directory created automatically"
  - "return value has the correct size"
  - "timeout set on the request"
test_file_path: "python/functions/infra/http_download_file_test.py"
file_path: "python/functions/infra/http_download_file.py"
---

## Example

```python
result = http_download_file(
    "https://example.com/report.pdf",
    dest_path="/tmp/reports/report.pdf",
    timeout=60.0,
)
print(f"Downloaded {result['size_bytes']} bytes to {result['path']}")
```

## Notes

Uses only the standard library (urllib, os). The download is performed in chunks of `chunk_size` bytes to avoid high memory usage with large files. The default timeout of 120 s is higher than that of http_get_json because files can be large. Intermediate directories are created with os.makedirs(exist_ok=True).
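
The behavior described in these notes could look roughly like the sketch below. It is not the code in `python/functions/infra/http_download_file.py`, which this diff does not show. The name `http_download_file_sketch` is hypothetical, and the loop structure and the empty-string fallback for a missing `Content-Type` header are assumptions. The annotation uses `Optional[...]` instead of the `| None` form in the documented signature so that the sketch also runs on Python 3.9.

```python
import os
import urllib.request
from typing import Optional


def http_download_file_sketch(
    url: str,
    dest_path: str,
    headers: Optional[dict[str, str]] = None,
    timeout: float = 120.0,
    chunk_size: int = 8192,
) -> dict:
    """Hypothetical stand-in for http_download_file (stdlib only)."""
    # Create intermediate directories; dirname is "" for a bare filename,
    # and os.makedirs("") would raise, so guard against that case.
    parent = os.path.dirname(dest_path)
    if parent:
        os.makedirs(parent, exist_ok=True)

    request = urllib.request.Request(url, headers=headers or {})
    size_bytes = 0
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: fall back to an empty string when the header is absent.
        content_type = response.headers.get("Content-Type", "")
        with open(dest_path, "wb") as out:
            # Read at most chunk_size bytes at a time so that memory use
            # stays bounded regardless of the size of the file.
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
                size_bytes += len(chunk)

    return {
        "path": dest_path,
        "size_bytes": size_bytes,
        "content_type": content_type,
    }
```

Counting bytes as they are written, rather than calling `os.path.getsize` afterwards, keeps `size_bytes` tied to what was actually received. Whether the real function does the same is not stated in the source.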