| field | value |
|---|---|
| name | safetensors_inspect |
| kind | function |
| lang | py |
| domain | ml |
| version | 1.0.0 |
| purity | impure |
| signature | def safetensors_inspect(path: str) -> dict |
| description | Reads ONLY the header of a .safetensors file without loading the tensors into RAM. Returns the model metadata, the list of tensors with dtype/shape/offsets, the total size, and the tensor count. Useful for inspecting multi-GB checkpoints without exhausting memory. |
| tags | |
| uses_functions | |
| uses_types | |
| returns | |
| returns_optional | false |
| error_type | error_go_core |
| imports | |
| params | |
| output | dict with keys: path (str, absolute path), metadata (dict with the header's __metadata__), tensors (list[dict] with name/dtype/shape/data_offsets per tensor), total_size_bytes (int), n_tensors (int) |
| tested | true |
| tests | |
| test_file_path | python/functions/ml/tests/test_safetensors_inspect.py |
| file_path | python/functions/ml/safetensors_inspect.py |
Example
from ml.safetensors_inspect import safetensors_inspect
info = safetensors_inspect("/models/sd-v1-5/model.safetensors")
print(info["n_tensors"]) # 1344
print(info["total_size_bytes"]) # 3_975_733_952 (~3.7 GiB)
print(info["metadata"]) # {"format": "pt", "model_type": "stable_diffusion"}
# Show the first 5 tensors
for t in info["tensors"][:5]:
print(t["name"], t["dtype"], t["shape"])
# model.diffusion_model.input_blocks.0.0.weight F16 [320, 4, 3, 3]
# model.diffusion_model.input_blocks.0.0.bias F16 [320]
# ...
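
The returned dict also lends itself to quick summaries. The following is a minimal sketch, assuming the output shape documented above; `summarize_by_dtype` and the hand-written `sample_info` dict are hypothetical names used only for illustration, not part of the function.

```python
from collections import defaultdict
from math import prod


def summarize_by_dtype(info: dict) -> dict:
    """Group tensors by dtype: tensor count, element count and byte size."""
    summary = defaultdict(lambda: {"tensors": 0, "elements": 0, "bytes": 0})
    for t in info["tensors"]:
        start, end = t["data_offsets"]
        entry = summary[t["dtype"]]
        entry["tensors"] += 1
        entry["elements"] += prod(t["shape"])  # prod([]) == 1 for scalars
        entry["bytes"] += end - start
    return dict(summary)


# Hand-written sample that mimics the documented output (not real model data).
sample_info = {
    "tensors": [
        {"name": "w", "dtype": "F16", "shape": [320, 4, 3, 3], "data_offsets": [0, 23040]},
        {"name": "b", "dtype": "F16", "shape": [320], "data_offsets": [23040, 23680]},
    ],
}
print(summarize_by_dtype(sample_info))
```

Because only the header descriptors are used, a summary like this costs the same whether the checkpoint is a few KB or several GB.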
Notes
safetensors format
[8 bytes: uint64 LE = N (length of the JSON header)]
[N bytes: JSON with metadata and tensor descriptors]
[binary tensor data (not read)]
The JSON has this structure:
{
  "__metadata__": {"format": "pt", ...},
  "tensor_name": {
    "dtype": "F32",
    "shape": [1024, 768],
    "data_offsets": [0, 3145728]
  },
  ...
}
data_offsets are relative to the start of the data block (after the header),
not to the start of the file. For lazy access to a specific tensor:
file_offset = 8 + header_len + data_offsets[0].
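
The offset formula can be sketched as follows. This is an illustrative example that uses only the stdlib; `read_tensor_bytes` and the tiny hand-built demo file are assumptions made for the sketch, not part of the documented function.

```python
import json
import os
import struct
import tempfile


def read_tensor_bytes(path: str, name: str) -> bytes:
    """Read the raw bytes of one tensor without touching the others."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # uint64 little-endian
        header = json.loads(f.read(header_len))
        start, end = header[name]["data_offsets"]
        # Offsets are relative to the data block, so skip prefix + header.
        f.seek(8 + header_len + start)
        return f.read(end - start)


# Build a tiny demo file by hand: two F32 tensors with arbitrary values.
a = struct.pack("<2f", 1.0, 2.0)
b = struct.pack("<3f", 3.0, 4.0, 5.0)
header = json.dumps({
    "a": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]},
    "b": {"dtype": "F32", "shape": [3], "data_offsets": [8, 20]},
}).encode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.safetensors")
    with open(demo_path, "wb") as f:
        f.write(struct.pack("<Q", len(header)) + header + a + b)
    values_b = struct.unpack("<3f", read_tensor_bytes(demo_path, "b"))

print(values_b)  # (3.0, 4.0, 5.0)
```

Only the 12 bytes that belong to tensor `b` are read from the data block; tensor `a` is skipped by the seek.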
Why not use the safetensors library
This function uses only the stdlib (struct, json, os), so it needs no
additional installs and can run during fn index. The official HuggingFace
safetensors library would add a dependency, and its load_file helpers load
every tensor into RAM; safe_open is lazy but still requires the package and
a framework backend. This implementation only reads the header, so no
tensor data is ever loaded.
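
A stdlib-only header reader along these lines might look like the sketch below. It is not the actual implementation: `inspect_header` is a hypothetical stand-in, and treating total_size_bytes as the file size on disk is an assumption, since the source does not say how that value is computed.

```python
import json
import os
import struct
import tempfile


def inspect_header(path: str) -> dict:
    """Hypothetical stand-in for safetensors_inspect: parses only the header."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))  # tensor data is never read
    metadata = header.pop("__metadata__", {})
    tensors = [{"name": name, **desc} for name, desc in header.items()]
    return {
        "path": os.path.abspath(path),
        "metadata": metadata,
        "tensors": tensors,
        # Assumption: "total size" means the size of the file on disk.
        "total_size_bytes": os.path.getsize(path),
        "n_tensors": len(tensors),
    }


# Tiny hand-built demo file: one U8 tensor with four zero bytes.
header = json.dumps({
    "__metadata__": {"format": "pt"},
    "w": {"dtype": "U8", "shape": [4], "data_offsets": [0, 4]},
}).encode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.safetensors")
    with open(demo_path, "wb") as f:
        f.write(struct.pack("<Q", len(header)) + header + bytes(4))
    info = inspect_header(demo_path)

print(info["n_tensors"], info["metadata"], info["tensors"][0]["shape"])
```

The file handle is closed right after the header is parsed, so memory use depends on the header size rather than on the checkpoint size.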
Common dtypes
| dtype | description |
|---|---|
| F32 | float32 (full precision) |
| BF16 | bfloat16 (training, Ampere+) |
| F16 | float16 (inference) |
| I32 | int32 |
| I64 | int64 |
| U8 | uint8 |
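
The dtype widths allow a quick sanity check on a descriptor: the byte span given by data_offsets should equal the element count times the element size. A small sketch, where `DTYPE_SIZES` and `offsets_match_shape` are hypothetical helper names covering only the dtypes listed in the table:

```python
from math import prod

# Bytes per element for the dtypes listed in the table.
DTYPE_SIZES = {"F32": 4, "BF16": 2, "F16": 2, "I32": 4, "I64": 8, "U8": 1}


def offsets_match_shape(tensor: dict) -> bool:
    """Check that end - start equals number of elements * bytes per element."""
    start, end = tensor["data_offsets"]
    expected = prod(tensor["shape"]) * DTYPE_SIZES[tensor["dtype"]]
    return end - start == expected


# Descriptor taken from the JSON structure shown in the notes.
example = {"dtype": "F32", "shape": [1024, 768], "data_offsets": [0, 3145728]}
print(offsets_match_shape(example))  # True: 1024 * 768 * 4 == 3145728
```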