feat(go): html_to_markdown + extract_iocs

functions/core/html_to_markdown: converts HTML to clean Markdown (pure Go,
no external dependencies). Useful as a preprocessing step for LLMs and for
indexing web content.

functions/cybersecurity/extract_iocs + types/cybersecurity/ioc: extracts
indicators of compromise (IPs, domains, URLs, hashes, emails, CVEs,
crypto wallets) from free text. Returns a typed []IoC.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 11:51:51 +02:00
parent b643321778
commit 92297e02c5
7 changed files with 1471 additions and 0 deletions
@@ -0,0 +1,25 @@
---
name: ioc
lang: go
domain: cybersecurity
version: "1.0.0"
algebraic: product
definition: |
  type IoC struct {
      Type  string
      Value string
      Start int
      End   int
      Extra map[string]string
  }
description: "Indicator of Compromise extracted from text. Type is one of: email, ip_address, domain, file_hash, crypto_wallet, cve_id, mac_address, phone_number. Start and End are byte offsets into the original text. Extra holds additional type-dependent fields (algorithm for file_hash, asset for crypto_wallet)."
tags: [ioc, cybersecurity, indicator, threat-intel]
uses_types: []
file_path: "functions/cybersecurity/extract_iocs.go"
---
## Notes
`ExtractIocs` returns a slice of `IoC` values. The `Extra` field is nil for most types; it is populated only for:
- `file_hash`: `Extra["algorithm"]` = "md5" | "sha1" | "sha256" | "sha512"
- `crypto_wallet`: `Extra["asset"]` = "btc" | "eth"
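
A minimal sketch of how a caller might consume these fields. It redeclares the `IoC` struct from the definition above so the snippet is self-contained. `describe` is a hypothetical helper, not part of this commit, and the sketch assumes `End` is an exclusive offset, which the description does not state.

```go
package main

import (
	"fmt"
	"strings"
)

// IoC mirrors the struct given in the type definition.
type IoC struct {
	Type  string
	Value string
	Start int
	End   int
	Extra map[string]string
}

// describe is a hypothetical helper (not part of the commit) that renders
// one IoC together with the span of text it was found in.
func describe(text string, ioc IoC) string {
	// Start and End are byte offsets; End is assumed to be exclusive here.
	span := text[ioc.Start:ioc.End]

	// Reading a key from a nil map is safe in Go and yields "", so Extra
	// needs no nil check before lookup.
	switch ioc.Type {
	case "file_hash":
		return fmt.Sprintf("%s %q (algorithm=%s)", ioc.Type, span, ioc.Extra["algorithm"])
	case "crypto_wallet":
		return fmt.Sprintf("%s %q (asset=%s)", ioc.Type, span, ioc.Extra["asset"])
	default:
		return fmt.Sprintf("%s %q", ioc.Type, span)
	}
}

func main() {
	text := "beacon to 10.0.0.5, dropped d41d8cd98f00b204e9800998ecf8427e"

	// Hand-built values standing in for the output of ExtractIocs.
	ip := "10.0.0.5"
	hash := "d41d8cd98f00b204e9800998ecf8427e"
	ipStart := strings.Index(text, ip)
	hashStart := strings.Index(text, hash)

	iocs := []IoC{
		{Type: "ip_address", Value: ip, Start: ipStart, End: ipStart + len(ip)},
		{
			Type:  "file_hash",
			Value: hash,
			Start: hashStart,
			End:   hashStart + len(hash),
			Extra: map[string]string{"algorithm": "md5"},
		},
	}

	for _, ioc := range iocs {
		fmt.Println(describe(text, ioc))
	}
}
```

Because the offsets count bytes rather than runes, slicing the original string directly is the correct way to recover a match, even when the surrounding text contains multi-byte UTF-8 characters.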