feat: integrar sanitizacion de input en runtime y config

- agents/runtime.go: campo sanitizeOpts en Agent, sanitizeInput() que
  llama a sanitize.Sanitize() y loguea warnings. Integrado en
  executeActions() y handleTaskEvent() antes de enviar al LLM.
  En modo reject, responde al usuario y corta el flujo.
- internal/config/schema.go: nuevo tipo SanitizeCfg dentro de SecurityCfg
  con campos enabled, mode, min_severity, disabled_patterns.

Protegido por feature flag prompt-injection-hardening (OFF).
Se activa por agente via security.sanitize.enabled en config.yaml.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-03-07 19:34:33 +00:00
parent e8dd7c41ed
commit 64d29e5968
2 changed files with 85 additions and 3 deletions
+12 -3
View File
@@ -280,9 +280,18 @@ type SSHTargetCfg struct {
// ── Security ──────────────────────────────────────────────────────────────
type SecurityCfg struct {
Roles map[string]RoleCfg `yaml:"roles"`
Audit AuditCfg `yaml:"audit"`
Secrets SecretsCfg `yaml:"secrets"`
Roles map[string]RoleCfg `yaml:"roles"`
Audit AuditCfg `yaml:"audit"`
Secrets SecretsCfg `yaml:"secrets"`
Sanitize SanitizeCfg `yaml:"sanitize"`
}
// SanitizeCfg controls prompt injection detection on incoming messages.
type SanitizeCfg struct {
Enabled bool `yaml:"enabled"` // enable sanitization (default false)
Mode string `yaml:"mode"` // warn | strip | reject (default warn)
MinSeverity string `yaml:"min_severity"` // low | medium | high (default medium)
DisabledPatterns []string `yaml:"disabled_patterns"` // pattern names to skip
}
type RoleCfg struct {