fn_registry/python/functions/cybersecurity/extract_ip_addresses.md
egutierrez cce7764510 feat(cybersecurity): 8 IoC regex extractors + pure extract_iocs pipeline
New extractors in python/functions/cybersecurity/:
- extract_ip_addresses (IPv4 + IPv6 with ipaddress validation)
- extract_emails (simplified RFC 5322)
- extract_domains (FQDNs with a valid TLD, static list)
- extract_file_hashes (MD5/SHA1/SHA256/SHA512, algorithm inferred from length)
- extract_crypto_wallets (BTC legacy + bech32, ETH 0x+40hex)
- extract_cve_ids (CVE-YYYY-NNNN+)
- extract_mac_addresses (xx:xx:xx + xx-xx-xx, uniform separator)
- extract_phone_numbers (E.164 + ES local 9 digits)

Pipeline:
- extract_iocs runs all of them and deduplicates contained spans. To keep
  purity:pure it is declared as kind:function with non-empty
  uses_functions, because the registry rule requires pipelines to be impure.

All of them return list[dict] with value/start/end/type so that the
caller (issues 0038-0040) can reconcile offsets with NER spans
without reparsing.

Refs #0037

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 16:41:30 +02:00

1.7 KiB

name: extract_ip_addresses
kind: function
lang: py
domain: cybersecurity
version: 1.0.0
purity: pure
signature: def extract_ip_addresses(text: str) -> list[dict]
description: Extracts valid IPv4 and IPv6 addresses from a text, with start/end offsets. Filters out invalid candidates via ipaddress (rejects 999.999.999.999 and the like). Does not distinguish private from public addresses; relevance filtering is up to the caller.
tags: ioc, ip, ipv4, ipv6, regex, extract, cybersecurity, python
uses_functions:
uses_types:
returns:
returns_optional: false
error_type:
imports: re, ipaddress
params:
- text: text string from which to extract IPs
output: list of dicts with {value, start, end, type='ip_address'}, one per IP found
tested: true
tests:
- Valid IPv4 and boundary ranges
- Invalid IPv4 (octet > 255) discarded
- IPv6 full and compressed forms
- Invalid IPv6 discarded
- Text without IPs
test_file_path: python/functions/cybersecurity/tests/test_extract_iocs.py
file_path: python/functions/cybersecurity/extract_ip_addresses.py

Example

extract_ip_addresses("Server 192.168.1.1 talks to 8.8.8.8")
# [{"value": "192.168.1.1", "start": 7, "end": 18, "type": "ip_address"},
#  {"value": "8.8.8.8", "start": 28, "end": 35, "type": "ip_address"}]

extract_ip_addresses("not an IP: 999.999.999.999")
# []
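
The description leaves private/public filtering to the caller. A minimal sketch of what that caller-side step could look like, using the standard library's `ipaddress.ip_address` and its `is_private` attribute; the helper name `drop_private` is hypothetical and not part of the registry:

```python
import ipaddress


def drop_private(matches: list[dict]) -> list[dict]:
    """Keep only the matches whose address is not flagged as private.

    Hypothetical caller-side helper: it expects the dict shape shown in
    the example above (value/start/end/type) and leaves offsets untouched.
    """
    return [
        m for m in matches
        if not ipaddress.ip_address(m["value"]).is_private
    ]


matches = [
    {"value": "192.168.1.1", "start": 7, "end": 18, "type": "ip_address"},
    {"value": "8.8.8.8", "start": 28, "end": 35, "type": "ip_address"},
]
public_only = drop_private(matches)  # keeps only the 8.8.8.8 entry
```

Note that `is_private` covers more than the three RFC 1918 ranges (loopback, for instance), so a caller with a narrower definition of "private" would need its own network list.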

Notes

Uses ipaddress.IPv4Address / IPv6Address for structural validation, which discards 999.999.999.999 and other syntactically plausible but invalid combinations. Private IPs (10/8, 172.16/12, 192.168/16) are extracted all the same; relevance filtering is the caller's responsibility. Pure: only compiled regexes and ipaddress, no network or disk access.
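
The two-stage approach described here (a permissive regex proposes candidates, `ipaddress` decides) can be sketched as follows. This is an illustration under assumptions, not the registry's actual source: the two patterns and the name `extract_ip_addresses_sketch` are hypothetical, and the sketch ignores IPv4-mapped IPv6 forms and zone IDs.

```python
import ipaddress
import re

# Hypothetical candidate patterns (assumptions, not the module's real regexes).
# They only locate plausible spans; ipaddress makes the final decision.
_IPV4_CANDIDATE = re.compile(r"(?<![\d.])\d{1,3}(?:\.\d{1,3}){3}(?!\.?\d)")
_IPV6_CANDIDATE = re.compile(
    r"(?<![\w:])(?:[0-9A-Fa-f]{0,4}:){2,7}[0-9A-Fa-f]{0,4}(?![\w:])"
)


def extract_ip_addresses_sketch(text: str) -> list[dict]:
    """Return validated IPv4/IPv6 matches with their character offsets."""
    found = []
    for pattern, parser in (
        (_IPV4_CANDIDATE, ipaddress.IPv4Address),
        (_IPV6_CANDIDATE, ipaddress.IPv6Address),
    ):
        for match in pattern.finditer(text):
            try:
                # Raises AddressValueError (a ValueError) on e.g. 999.999.999.999.
                parser(match.group())
            except ValueError:
                continue
            found.append(
                {
                    "value": match.group(),
                    "start": match.start(),
                    "end": match.end(),
                    "type": "ip_address",
                }
            )
    # Report matches in document order regardless of which pattern found them.
    found.sort(key=lambda item: item["start"])
    return found
```

Keeping the regexes loose and delegating correctness to `ipaddress` avoids encoding octet ranges or IPv6 compression rules in the pattern itself, and the function stays pure because it touches neither network nor disk.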