merge: issue/0016-skills-system - skills system for agents

Complete system of reusable skills that let agents run complex multi-step flows combining tools, conditional logic, and domain knowledge.

Components implemented:
- pkg/skills/: pure types (SkillMeta, Skill, matching)
- shell/skills/: loader (filesystem) + executor (safe scripts)
- tools/skilltools/: 4 function-calling tools (search, load, read, run)
- internal/config/: SkillsCfg with category filters
- agents/runtime.go: optional integration with agents
- skills/: 4 example skills (deploy, log-analyzer, health-check, daily-report)
- .claude/rules/create_skill.md: complete guide for creating skills

Key difference: tools are atomic, skills are declarative flows.
Architecture: pure core (pkg/), impure shell (shell/), declarative content (skills/).
Tests: matching, loader with path traversal protection, executor with allowlist.

Commits in the merge:
- feat: pure types for the skills system
- feat: skills loader and executor in shell
- feat: skills configuration in schema
- feat: function-calling tools for skills
- feat: integrate skills into the agent runtime
- feat: example skills and system README
- docs: document the skills system
- chore: close issue 0016, skills system
2026-03-08 22:14:24 +00:00
parent 76d7619632 e6c1671177
commit 25c7ca7d85
20 changed files with 1888 additions and 286 deletions
@@ -48,16 +48,20 @@ pkg/decision/          pure rules engine
 pkg/llm/               pure LLM types
 pkg/message/           message parsing/formatting
 pkg/personality/       personality types
+pkg/skills/            pure skill types + matching
 shell/llm/             LLM clients (anthropic, openai)
 shell/matrix/          Matrix client (mautrix-go)
 shell/ssh/             SSH executor
 shell/mcp/             MCP (Model Context Protocol) client and server
+shell/skills/          loader (filesystem) + executor (scripts)
 shell/effects/         Runner: []Action → side effects
 shell/bus/             inter-agent communication
 agents/runtime.go      Agent{}: assembles core + shell
 agents/<id>/           agent.go (pure rules) + config.yaml + prompts/system.md
 tools/                 tool registry + tool implementations (subpackages)
 tools/mcptools/        bridge: converts MCP tools → tools.Tool
+tools/skilltools/      tools for interacting with skills (search, load, read, run)
+skills/                declarative content: SKILL.md + resources (scripts, references, templates)
 internal/config/       schema.go + loader.go
 security/              user/agent groups + permission policies (YAMLs)
 cmd/launcher/          main entrypoint (rulesRegistry)
@@ -0,0 +1,199 @@
+# Rule: Create a new skill
+
+Guide for creating a new skill in `skills/`.
+
+## Prerequisites
+
+- Understand the difference between **tools** (atomic functions) and **skills** (multi-step flows)
+- Skills are declarative content (markdown + resources), not Go code
+- A skill combines existing tools, conditional logic, and domain knowledge
+
+## Process
+
+### 1. Choose a category
+
+Pick the appropriate category:
+- `devops/`: operations and deploys
+- `analysis/`: data/log analysis
+- `communication/`: communication and notifications
+- `coding/`: development and code review
+- `system/`: system administration
+
+If none applies, create a new category.
+
+### 2. Create the directory structure
+
+```bash
+mkdir -p skills/<category>/<skill-name>/{scripts,references,templates,assets}
+```
+
+Create only the subfolders you will actually use.
+
+### 3. Write SKILL.md
+
+Template:
+
+````markdown
+---
+name: skill-name
+description: >
+  Clear description of what the skill does and when it should be activated.
+  This description is the main triggering mechanism.
+  Ideally < 100 words.
+---
+
+# <Descriptive Name>
+
+Brief introduction to the skill (1-2 paragraphs).
+
+## Use cases
+
+- Case 1
+- Case 2
+- Case 3
+
+## Execution process
+
+### 1. Initial step
+
+Description of the step, which tools to use, code examples.
+
+```bash
+# example command
+ssh_command host="prod-01" command="systemctl status myapp"
+```
+
+### 2. Next step
+
+Continue with the remaining steps...
+
+## Required parameters
+
+Parameters the user must provide:
+- `param1`: description
+- `param2`: description
+
+Optional parameters:
+- `opt1`: description (default: value)
+
+## Usage example
+
+User: "Do X"
+
+Agent:
+1. skill_search("X")
+2. skill_load("<skill-name>")
+3. Run the steps...
+4. Report the result
+
+## Security
+
+Security considerations specific to this skill.
+````
+
+### 4. Add resources (optional)
+
+#### Scripts (`scripts/`)
+
+Executable scripts the skill can invoke:
+
+```bash
+#!/bin/bash
+# scripts/deploy.sh
+# Script description
+
+set -euo pipefail
+
+# Validate arguments
+if [ $# -lt 1 ]; then
+  echo "Usage: $0 <service-name>" >&2
+  exit 1
+fi
+
+# Implementation...
+```
+
+**Important**:
+- Use the correct shebang (`#!/bin/bash`, `#!/usr/bin/env python3`, etc.)
+- Validate arguments
+- Use `set -euo pipefail` in bash
+- Clear exit codes (0 = success, != 0 = error)
+
+#### References (`references/`)
+
+Extensive documentation the agent can consult on demand:
+
+```markdown
+# API Reference
+
+Detailed documentation...
+
+If > 300 lines, add a TOC at the top.
+```
+
+#### Templates (`templates/`)
+
+Templates the skill uses as a starting point:
+
+```markdown
+# template-report.md
+# Report: {{title}}
+
+Generated: {{timestamp}}
+
+## Summary
+{{summary}}
+
+...
+```
+
+### 5. Test the skill
+
+1. Enable skills in the config of a test agent:
+
+```yaml
+skills:
+  enabled: true
+  path: "skills/"
+  categories: ["<category>"]
+
+tools:
+  skills:
+    allowed_interpreters: ["bash", "sh"]
+```
+
+2. Restart the agent
+3. Test by searching for the skill: `skill_search("<query>")`
+4. Load the skill: `skill_load("<skill-name>")`
+5. Run the full flow following the instructions
+
+### 6. Document
+
+Update `skills/README.md` if:
+- You create a new category
+- The skill introduces a new pattern
+- There are special security considerations
+
+## Critical rules
+
+- **Skills != Tools**: skills use tools; they are not tools
+- **SKILL.md < 500 lines**: if it is longer, split it into multiple skills or move content to `references/`
+- **Precise description**: the description in the frontmatter is critical for matching
+- **Idempotency**: skills should be safe to run multiple times where possible
+- **Error handling**: the instructions must say what to do when an error occurs
+- **Rollback**: if the skill makes destructive changes, include rollback instructions
+
+## Examples of valid skills
+
+See the existing skills in `skills/`:
+- `skills/devops/deploy-service/`: full deploy with rollback
+- `skills/analysis/log-analyzer/`: log analysis with metrics
+- `skills/system/health-check/`: multi-service health verification
+- `skills/communication/daily-report/`: report generation
+
+## Anti-patterns
+
+- Skill that only runs one SSH command → use the `ssh_command` tool directly
+- Skill with complex business logic → create a Go tool with tests
+- Skill that repeats instructions from the system prompt → unnecessary
+- Scripts that require human interaction → skills run unattended
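Note on the SKILL.md layout: the template above relies on a YAML frontmatter block delimited by `---` lines, followed by the markdown body. The split can be sketched with the standard library alone. `splitFrontmatter` below is a hypothetical helper written for illustration, not the loader's `parseSkillMD`, and it assumes the file starts with the opening `---`; decoding the YAML is left to the caller.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontmatter (hypothetical helper) separates a SKILL.md document into
// its frontmatter and its markdown body. ok is false when the document does
// not start with "---" or the closing "---" is missing.
func splitFrontmatter(doc string) (front, body string, ok bool) {
	lines := strings.Split(doc, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return "", doc, false
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimSpace(lines[i]) == "---" {
			front = strings.Join(lines[1:i], "\n")
			body = strings.Join(lines[i+1:], "\n")
			return front, body, true
		}
	}
	return "", doc, false
}

func main() {
	doc := "---\nname: deploy-service\ndescription: Deploy a service\n---\n# Deploy Service\n"
	front, body, ok := splitFrontmatter(doc)
	fmt.Println(ok)
	fmt.Println(front)
	fmt.Println(body)
}
```

A document without frontmatter comes back unchanged in `body` with `ok == false`, which lets a caller skip the skill instead of failing the whole scan.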
@@ -9,6 +9,7 @@ Guias operativas para LLMs que trabajan en este codebase. Cada regla describe co
 | **Create agent** | [create_agent.md](create_agent.md) | When creating a new complete Matrix bot/agent |
 | **Create tool** | [create_tool.md](create_tool.md) | When adding a new tool for LLM function calling |
 | **Create command** | [create_command.md](create_command.md) | When adding a direct command (!xxx) to an agent |
+| **Create skill** | [create_skill.md](create_skill.md) | When creating a new skill (declarative multi-step flow) |
 | **Create issue** | [create_issue.md](create_issue.md) | When creating a new issue/feature request in `dev/issues/` |
 | **Fix issue** | [fix_issue.md](fix_issue.md) | When implementing/fixing an existing issue from `dev/issues/` |

@@ -17,6 +18,7 @@ Guias operativas para LLMs que trabajan en este codebase. Cada regla describe co
 - **Create agent**: when the user asks for a new bot, agent, or assistant. Covers the file structure, pure rules, YAML config, system prompt, and registration in the launcher.
 - **Create tool**: when the user asks to add a new tool to the system. Covers the Def (pure) + Exec (impure) pattern, registration in runtime.go, and enabling it in config.
 - **Create command**: when the user asks to add a direct command (!xxx) to an agent. Commands are resolved without going through rules or the LLM.
+- **Create skill**: when the user asks to add a skill (declarative multi-step flow). Skills combine tools, conditional logic, and domain knowledge in a SKILL.md with optional resources.
 - **Create issue**: when the user asks to create a new issue, feature request, or task. Uses the template in `.claude/templates/issue.md`.
 - **Fix issue**: when the user asks to implement, fix, or work on an existing issue. Covers creating a branch (`/git-branch`), implementing the tasks with tests, closing the issue, and integrating into master (`/git-push`).

@@ -31,6 +31,7 @@ import (
 	"github.com/enmanuel/agents/shell/matrix"
 	shellmcp "github.com/enmanuel/agents/shell/mcp"
 	shellmem "github.com/enmanuel/agents/shell/memory"
+	shellskills "github.com/enmanuel/agents/shell/skills"
 	"github.com/enmanuel/agents/shell/ssh"
 	"github.com/enmanuel/agents/tools"
 	toolclock "github.com/enmanuel/agents/tools/clock"
@@ -40,6 +41,7 @@ import (
 	toolmatrix "github.com/enmanuel/agents/tools/matrix"
 	toolmcp "github.com/enmanuel/agents/tools/mcptools"
 	toolmemory "github.com/enmanuel/agents/tools/memorytools"
+	toolskills "github.com/enmanuel/agents/tools/skilltools"
 	toolssh "github.com/enmanuel/agents/tools/ssh"
 	toolweather "github.com/enmanuel/agents/tools/weather"
 )
@@ -95,6 +97,9 @@ type Agent struct {
 	// Shared knowledge store — non-nil when shared_knowledge is enabled
 	sharedKnowledgeStore *shellknowledge.FileStore

+	// Skills loader — non-nil when skills are enabled
+	skillLoader *shellskills.Loader
+
 	// Sanitization options — nil when sanitization is disabled
 	sanitizeOpts *sanitize.Options

@@ -277,8 +282,28 @@ func New(cfg *config.AgentConfig, rules []decision.Rule, agentACL acl.ACL, logge
 		}
 	}

+	// Skills loader
+	var skillLoader *shellskills.Loader
+	var skillExecutor *shellskills.Executor
+	if cfg.Skills.Enabled {
+		skillsPath := cfg.Skills.SkillsPath
+		if skillsPath == "" {
+			skillsPath = "skills/"
+		}
+		skillLoader = shellskills.NewLoader(skillsPath)
+
+		// Skills executor for scripts
+		allowedInterpreters := cfg.Tools.Skills.AllowedInterpreters
+		timeout := cfg.Skills.Timeout
+		if timeout == 0 {
+			timeout = 60 * time.Second
+		}
+		skillExecutor = shellskills.NewExecutor(allowedInterpreters, timeout)
+		logger.Info("skills enabled", "path", skillsPath, "categories", cfg.Skills.Categories)
+	}
+
 	// Tool registry — register tools enabled in config
-	toolReg := buildToolRegistry(cfg, sshExec, matrixClient, memStore, kStore, sharedKStore, mcpManager, roomCtx, logger)
+	toolReg := buildToolRegistry(cfg, sshExec, matrixClient, memStore, kStore, sharedKStore, mcpManager, skillLoader, skillExecutor, roomCtx, logger)

 	// Rate limiting for tools
 	if cfg.Security.ToolRateLimit.Enabled {
@@ -322,6 +347,7 @@ func New(cfg *config.AgentConfig, rules []decision.Rule, agentACL acl.ACL, logge
 		memStore:             memStore,
 		knowledgeStore:       kStore,
 		sharedKnowledgeStore: sharedKStore,
+		skillLoader:          skillLoader,
 		windowSize:           windowSize,
 		roomCtx:              roomCtx,
 	}
@@ -1031,6 +1057,8 @@ func buildToolRegistry(
 	kStore *shellknowledge.FileStore,
 	sharedKStore *shellknowledge.FileStore,
 	mcpManager *shellmcp.Manager,
+	skillLoader *shellskills.Loader,
+	skillExecutor *shellskills.Executor,
 	roomCtx *toolmemory.RoomContext,
 	logger *slog.Logger,
 ) *tools.Registry {
@@ -1116,5 +1144,16 @@ func buildToolRegistry(
 		}
 	}

+	// Skills tools — register skill search, load, read, and run tools
+	if skillLoader != nil {
+		reg.Register(toolskills.NewSkillSearch(skillLoader, cfg.Skills.Categories))
+		reg.Register(toolskills.NewSkillLoad(skillLoader))
+		reg.Register(toolskills.NewSkillReadResource(skillLoader))
+		if skillExecutor != nil {
+			reg.Register(toolskills.NewSkillRunScript(skillLoader, skillExecutor))
+		}
+		logger.Debug("registered skills tools")
+	}
+
 	return reg
 }
@@ -20,7 +20,7 @@ afectados y notas de implementacion.
 | 13 | Hot reload                   | [0013-hot-reload.md](completed/0013-hot-reload.md)                   | completed  |
 | 14 | Template agent standardize   | [0014-template-agent-standardize.md](0014-template-agent-standardize.md) | pending   |
 | 15 | Multi-platform Telegram      | [0015-multi-platform-telegram.md](0015-multi-platform-telegram.md)   | pending    |
-| 16 | Skills system                | [0016-skills-system.md](0016-skills-system.md)                       | pending    |
+| 16 | Skills system                | [0016-skills-system.md](completed/0016-skills-system.md)             | completed  |
 | 17 | MCP client tools             | [0017-mcp-client-tools.md](completed/0017-mcp-client-tools.md)      | completed  |
 | 18 | Shared knowledge             | [0018-shared-knowledge.md](completed/0018-shared-knowledge.md)       | completed  |
 | 19 | Prompt injection hardening   | [0019-prompt-injection-hardening.md](completed/0019-prompt-injection-hardening.md) | completed  |
@@ -18,6 +18,7 @@ type AgentConfig struct {
 	Resilience  ResilienceCfg  `yaml:"resilience"`
 	Storage     StorageCfg     `yaml:"storage"`
 	Memory      MemoryCfg      `yaml:"memory"`
+	Skills      SkillsCfg      `yaml:"skills"`
 }

 // ── Identity ──────────────────────────────────────────────────────────────
@@ -130,6 +131,7 @@ type ToolsCfg struct {
 	Memory          MemoryToolCfg          `yaml:"memory"`
 	Knowledge       KnowledgeToolCfg       `yaml:"knowledge"`
 	SharedKnowledge SharedKnowledgeToolCfg `yaml:"shared_knowledge"`
+	Skills          SkillsToolCfg          `yaml:"skills"`
 }

 type MatrixToolCfg struct {
@@ -478,6 +480,19 @@ type MemoryToolCfg struct {
 	Enabled bool `yaml:"enabled"`
 }

+// ── Skills ────────────────────────────────────────────────────────────────
+
+type SkillsCfg struct {
+	Enabled    bool          `yaml:"enabled"`    // enable skills system (default false)
+	SkillsPath string        `yaml:"path"`       // path to skills directory (default: "skills/")
+	Categories []string      `yaml:"categories"` // filter: only load skills from these categories (empty = all)
+	Timeout    time.Duration `yaml:"timeout"`    // timeout for script execution (default: 60s)
+}
+
+type SkillsToolCfg struct {
+	AllowedInterpreters []string `yaml:"allowed_interpreters"` // allowlist for skill script execution (default: ["bash", "sh"])
+}
+
 // ── Special Agents ────────────────────────────────────────────────────────

 // SpecialConfig is the root configuration for a special agent (no Matrix identity).
@@ -0,0 +1,103 @@
+package skills
+
+import (
+	"sort"
+	"strings"
+)
+
+// Match returns the skills most relevant to the given query.
+// Initial implementation: simple keyword matching against name + description.
+// The query and the skill fields are lowercased, so matching is case-insensitive.
+//
+// Scoring is basic:
+// - Exact match on name: 1.0
+// - Partial match on name (every query word appears in the name): 0.8
+// - Match on description: 0.6 * (matching query words / total query words)
+// - No match: 0.0
+//
+// Returns the matches sorted by descending confidence.
+func Match(query string, skills []SkillMeta) []SkillMatch {
+	query = strings.ToLower(strings.TrimSpace(query))
+	if query == "" {
+		return nil
+	}
+
+	queryWords := strings.Fields(query)
+	var matches []SkillMatch
+
+	for _, skill := range skills {
+		confidence := scoreSkill(queryWords, skill)
+		if confidence > 0 {
+			matches = append(matches, SkillMatch{
+				Skill:      skill,
+				Confidence: confidence,
+			})
+		}
+	}
+
+	sort.Sort(ByConfidence(matches))
+	return matches
+}
+
+// scoreSkill computes the relevance score of a skill for the query words.
+func scoreSkill(queryWords []string, skill SkillMeta) float64 {
+	nameLower := strings.ToLower(skill.Name)
+	descLower := strings.ToLower(skill.Description)
+
+	// Exact match on name
+	queryStr := strings.Join(queryWords, " ")
+	if nameLower == queryStr {
+		return 1.0
+	}
+
+	// Partial match on name (every query word appears in the name)
+	nameMatches := 0
+	for _, word := range queryWords {
+		if strings.Contains(nameLower, word) {
+			nameMatches++
+		}
+	}
+	if nameMatches == len(queryWords) {
+		return 0.8
+	}
+
+	// Description match: count query words that match a description word (substring test in either direction)
+	descWords := strings.Fields(descLower)
+	descMatches := 0
+	for _, qword := range queryWords {
+		for _, dword := range descWords {
+			if strings.Contains(dword, qword) || strings.Contains(qword, dword) {
+				descMatches++
+				break
+			}
+		}
+	}
+
+	if descMatches > 0 {
+		ratio := float64(descMatches) / float64(len(queryWords))
+		return 0.6 * ratio
+	}
+
+	return 0.0
+}
+
+// FilterByCategory returns only the skills that belong to the given categories.
+// If categories is empty, it returns all skills unfiltered.
+func FilterByCategory(skills []SkillMeta, categories []string) []SkillMeta {
+	if len(categories) == 0 {
+		return skills
+	}
+
+	catSet := make(map[string]bool)
+	for _, cat := range categories {
+		catSet[strings.ToLower(cat)] = true
+	}
+
+	var filtered []SkillMeta
+	for _, skill := range skills {
+		if catSet[strings.ToLower(skill.Category)] {
+			filtered = append(filtered, skill)
+		}
+	}
+	return filtered
+}
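Note on the scoring tiers: the three levels in `scoreSkill` (exact name, all query words in the name, share of query words found in the description) are easier to see with concrete inputs. The sketch below is a condensed, self-contained restatement written for illustration; `score` is a hypothetical name and takes plain strings rather than `SkillMeta`. It keeps the bidirectional substring test from the diff, which means a one-letter description word such as "a" matches any query word that contains it.

```go
package main

import (
	"fmt"
	"strings"
)

// score (hypothetical) restates the tiers of scoreSkill for plain strings.
func score(query, name, desc string) float64 {
	words := strings.Fields(strings.ToLower(query))
	if len(words) == 0 {
		return 0
	}
	name = strings.ToLower(name)

	// Tier 1: the whole query equals the name.
	if name == strings.Join(words, " ") {
		return 1.0
	}

	// Tier 2: every query word is a substring of the name.
	inName := 0
	for _, w := range words {
		if strings.Contains(name, w) {
			inName++
		}
	}
	if inName == len(words) {
		return 0.8
	}

	// Tier 3: share of query words that match some description word,
	// testing substrings in both directions as the diff does.
	hits := 0
	for _, w := range words {
		for _, d := range strings.Fields(strings.ToLower(desc)) {
			if strings.Contains(d, w) || strings.Contains(w, d) {
				hits++
				break
			}
		}
	}
	ratio := float64(hits) / float64(len(words))
	return 0.6 * ratio
}

func main() {
	deployDesc := "Deploy a service via SSH to a remote server"
	fmt.Println(score("deploy-service", "deploy-service", deployDesc))
	fmt.Println(score("deploy", "deploy-service", deployDesc))
	fmt.Println(score("analyze logs", "log-analyzer", "Analyze logs for errors and patterns"))
	// Non-zero only because the description word "a" is a substring of "analyze".
	fmt.Println(score("analyze logs", "deploy-service", deployDesc))
}
```

The last call shows why `deploy-service` appears in the results for "analyze logs". If that is not wanted, dropping the reverse test (`strings.Contains(w, d)`) or skipping very short description words would be possible adjustments, at the cost of updating the expected match counts in the tests.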
@@ -0,0 +1,136 @@
+package skills
+
+import (
+	"testing"
+)
+
+func TestMatch(t *testing.T) {
+	skills := []SkillMeta{
+		{Name: "deploy-service", Description: "Deploy a service via SSH to a remote server", Category: "devops"},
+		{Name: "log-analyzer", Description: "Analyze logs for errors and patterns", Category: "analysis"},
+		{Name: "health-check", Description: "Check the health of services and systems", Category: "system"},
+		{Name: "daily-report", Description: "Generate daily report with metrics", Category: "communication"},
+	}
+
+	tests := []struct {
+		name          string
+		query         string
+		expectMatches int
+		firstMatch    string // expected first match name
+	}{
+		{
+			name:          "exact match in name",
+			query:         "deploy-service",
+			expectMatches: 1,
+			firstMatch:    "deploy-service",
+		},
+		{
+			name:          "partial match in name",
+			query:         "deploy",
+			expectMatches: 1,
+			firstMatch:    "deploy-service",
+		},
+		{
+			name:          "match in description",
+			query:         "analyze logs",
+			expectMatches: 2, // log-analyzer (description) and deploy-service (its description word "a" is a substring of "analyze")
+			firstMatch:    "log-analyzer",
+		},
+		{
+			name:          "multiple matches",
+			query:         "service",
+			expectMatches: 2, // deploy-service and health-check (services)
+		},
+		{
+			name:          "no match",
+			query:         "nonexistent",
+			expectMatches: 0,
+		},
+		{
+			name:          "empty query",
+			query:         "",
+			expectMatches: 0,
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			matches := Match(tt.query, skills)
+
+			if len(matches) != tt.expectMatches {
+				t.Errorf("expected %d matches, got %d", tt.expectMatches, len(matches))
+			}
+
+			if tt.firstMatch != "" && len(matches) > 0 {
+				if matches[0].Skill.Name != tt.firstMatch {
+					t.Errorf("expected first match %q, got %q", tt.firstMatch, matches[0].Skill.Name)
+				}
+			}
+
+			// Verify confidence is in valid range
+			for _, match := range matches {
+				if match.Confidence < 0 || match.Confidence > 1 {
+					t.Errorf("invalid confidence: %f (must be 0-1)", match.Confidence)
+				}
+			}
+
+			// Verify matches are sorted by confidence descending
+			for i := 1; i < len(matches); i++ {
+				if matches[i].Confidence > matches[i-1].Confidence {
+					t.Errorf("matches not sorted: %f > %f", matches[i].Confidence, matches[i-1].Confidence)
+				}
+			}
+		})
+	}
+}
+
+func TestFilterByCategory(t *testing.T) {
+	skills := []SkillMeta{
+		{Name: "deploy-service", Category: "devops"},
+		{Name: "log-analyzer", Category: "analysis"},
+		{Name: "health-check", Category: "system"},
+		{Name: "daily-report", Category: "communication"},
+	}
+
+	tests := []struct {
+		name       string
+		categories []string
+		expectLen  int
+	}{
+		{
+			name:       "no filter (all skills)",
+			categories: nil,
+			expectLen:  4,
+		},
+		{
+			name:       "single category",
+			categories: []string{"devops"},
+			expectLen:  1,
+		},
+		{
+			name:       "multiple categories",
+			categories: []string{"devops", "system"},
+			expectLen:  2,
+		},
+		{
+			name:       "nonexistent category",
+			categories: []string{"nonexistent"},
+			expectLen:  0,
+		},
+		{
+			name:       "case insensitive",
+			categories: []string{"DEVOPS"},
+			expectLen:  1,
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			filtered := FilterByCategory(skills, tt.categories)
+
+			if len(filtered) != tt.expectLen {
+				t.Errorf("expected %d skills, got %d", tt.expectLen, len(filtered))
+			}
+		})
+	}
+}
@@ -0,0 +1,35 @@
+package skills
+
+// SkillMeta is the metadata extracted from the YAML frontmatter of SKILL.md.
+// It is the minimal representation of a skill, the part that is always kept in context.
+type SkillMeta struct {
+	Name        string `yaml:"name"`
+	Description string `yaml:"description"`
+	Category    string // derived from the directory path (devops, analysis, etc.)
+}
+
+// Skill is the complete representation of a loaded skill.
+// It includes metadata, the full instructions, and paths to resources.
+type Skill struct {
+	Meta         SkillMeta
+	Instructions string   // markdown body of SKILL.md (without frontmatter)
+	BasePath     string   // path to the skill directory, as joined from the loader's base path
+	Scripts      []string // paths relative to scripts/ (e.g. ["deploy.sh", "rollback.sh"])
+	References   []string // paths relative to references/
+	Templates    []string // paths relative to templates/
+	Assets       []string // paths relative to assets/
+}
+
+// SkillMatch pairs a skill with its relevance score for a given query.
+// It is the result type of the Match function.
+type SkillMatch struct {
+	Skill      SkillMeta
+	Confidence float64 // 0.0 - 1.0
+}
+
+// ByConfidence implements sort.Interface to order SkillMatch values by descending confidence.
+type ByConfidence []SkillMatch
+
+func (a ByConfidence) Len() int           { return len(a) }
+func (a ByConfidence) Swap(i, j int)      { a[i], a[j] = a[j], a[i] }
+func (a ByConfidence) Less(i, j int) bool { return a[i].Confidence > a[j].Confidence }
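Note on ordering ties: `Match` sorts with `sort.Sort(ByConfidence(...))`, and `sort.Sort` is not guaranteed to be stable, so skills with equal confidence may come back in any relative order. If a deterministic order for ties matters, a stable sort is one option. The sketch below assumes a simplified `match` type and a hypothetical `rank` helper; neither is part of the merge.

```go
package main

import (
	"fmt"
	"sort"
)

// match is a simplified stand-in for SkillMatch.
type match struct {
	Name       string
	Confidence float64
}

// rank (hypothetical) returns a copy of ms ordered by descending confidence.
// sort.SliceStable keeps the input order among equal confidences.
func rank(ms []match) []match {
	out := append([]match(nil), ms...)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Confidence > out[j].Confidence
	})
	return out
}

func main() {
	ranked := rank([]match{
		{Name: "health-check", Confidence: 0.6},
		{Name: "deploy-service", Confidence: 0.8},
		{Name: "log-analyzer", Confidence: 0.6},
	})
	for _, m := range ranked {
		fmt.Println(m.Name, m.Confidence)
	}
}
```

With a stable sort, the two 0.6 entries keep the order in which the loader discovered them.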
@@ -0,0 +1,110 @@
+package skills
+
+import (
+	"bytes"
+	"context"
+	"fmt"
+	"os/exec"
+	"path/filepath"
+	"strings"
+	"time"
+)
+
+// Executor runs skill scripts safely, restricted to an allowlist of interpreters.
+type Executor struct {
+	allowedInterpreters []string
+	timeout             time.Duration
+}
+
+// NewExecutor creates a new Executor with the given configuration.
+// If allowedInterpreters is empty, it defaults to ["bash", "sh"]; a zero timeout defaults to 60s.
+func NewExecutor(allowedInterpreters []string, timeout time.Duration) *Executor {
+	if len(allowedInterpreters) == 0 {
+		allowedInterpreters = []string{"bash", "sh"}
+	}
+	if timeout == 0 {
+		timeout = 60 * time.Second
+	}
+	return &Executor{
+		allowedInterpreters: allowedInterpreters,
+		timeout:             timeout,
+	}
+}
+
+// Run executes a skill script with the given arguments.
+// scriptPath is the absolute path to the script.
+// args are the arguments passed to the script.
+//
+// The interpreter is inferred from the file extension only (.sh, .bash, .py, .rb, .js);
+// the shebang line is not consulted.
+//
+// Returns stdout followed by stderr, plus an error if the run fails.
+func (e *Executor) Run(ctx context.Context, scriptPath string, args []string) (string, error) {
+	// Infer the interpreter from the extension
+	interpreter, err := e.inferInterpreter(scriptPath)
+	if err != nil {
+		return "", err
+	}
+
+	// Check that the interpreter is on the allowlist
+	if !e.isAllowed(interpreter) {
+		return "", fmt.Errorf("interpreter not allowed: %s (allowed: %v)", interpreter, e.allowedInterpreters)
+	}
+
+	// Apply the timeout first so the command is built only once,
+	// bound to the context that carries the deadline.
+	timeoutCtx, cancel := context.WithTimeout(ctx, e.timeout)
+	defer cancel()
+
+	// Build the command: the interpreter receives the script path
+	// followed by the caller's arguments.
+	cmdArgs := append([]string{scriptPath}, args...)
+	cmd := exec.CommandContext(timeoutCtx, interpreter, cmdArgs...)
+
+	// Capture stdout and stderr separately; they are concatenated
+	// (stdout first) once the command finishes.
+	var stdout, stderr bytes.Buffer
+	cmd.Stdout = &stdout
+	cmd.Stderr = &stderr
+
+	err = cmd.Run()
+	output := stdout.String() + stderr.String()
+
+	if timeoutCtx.Err() == context.DeadlineExceeded {
+		return output, fmt.Errorf("script timeout exceeded (%s)", e.timeout)
+	}
+
+	if err != nil {
+		return output, fmt.Errorf("script failed: %w", err)
+	}
+
+	return output, nil
+}
+
+// inferInterpreter picks the interpreter from the file extension; .sh and .bash both map to bash, so "sh" is never returned.
+func (e *Executor) inferInterpreter(path string) (string, error) {
+	ext := strings.ToLower(filepath.Ext(path))
+
+	switch ext {
+	case ".sh", ".bash":
+		return "bash", nil
+	case ".py":
+		return "python3", nil
+	case ".rb":
+		return "ruby", nil
+	case ".js":
+		return "node", nil
+	default:
+		return "", fmt.Errorf("unsupported script extension: %s", ext)
+	}
+}
+
+// isAllowed reports whether an interpreter is on the allowlist.
+func (e *Executor) isAllowed(interpreter string) bool {
+	for _, allowed := range e.allowedInterpreters {
+		if allowed == interpreter {
+			return true
+		}
+	}
+	return false
+}
@@ -0,0 +1,127 @@
+package skills
+
+import (
+	"context"
+	"os"
+	"path/filepath"
+	"strings"
+	"testing"
+	"time"
+)
+
+func TestExecutor(t *testing.T) {
+	tmpDir := t.TempDir()
+
+	// Create a simple bash script
+	scriptPath := filepath.Join(tmpDir, "test.sh")
+	scriptContent := `#!/bin/bash
+echo "Hello from script"
+echo "Args: $@"
+`
+	if err := os.WriteFile(scriptPath, []byte(scriptContent), 0755); err != nil {
+		t.Fatal(err)
+	}
+
+	// Create a script that times out
+	timeoutScriptPath := filepath.Join(tmpDir, "timeout.sh")
+	timeoutContent := `#!/bin/bash
+sleep 10
+`
+	if err := os.WriteFile(timeoutScriptPath, []byte(timeoutContent), 0755); err != nil {
+		t.Fatal(err)
+	}
+
+	// Create a failing script
+	failScriptPath := filepath.Join(tmpDir, "fail.sh")
+	failContent := `#!/bin/bash
+exit 1
+`
+	if err := os.WriteFile(failScriptPath, []byte(failContent), 0755); err != nil {
+		t.Fatal(err)
+	}
+
+	executor := NewExecutor([]string{"bash", "sh"}, 2*time.Second)
+
+	// Test successful execution
+	t.Run("successful_execution", func(t *testing.T) {
+		ctx := context.Background()
+		output, err := executor.Run(ctx, scriptPath, []string{"arg1", "arg2"})
+		if err != nil {
+			t.Fatalf("Run failed: %v", err)
+		}
+
+		if !strings.Contains(output, "Hello from script") {
+			t.Errorf("expected 'Hello from script' in output, got: %q", output)
+		}
+
+		if !strings.Contains(output, "Args: arg1 arg2") {
+			t.Errorf("expected 'Args: arg1 arg2' in output, got: %q", output)
+		}
+	})
+
+	// Test timeout
+	t.Run("timeout", func(t *testing.T) {
+		ctx := context.Background()
+		_, err := executor.Run(ctx, timeoutScriptPath, nil)
+		if err == nil {
+			t.Fatal("expected timeout error") // Fatal: the next check dereferences err
+		}
+
+		if !strings.Contains(err.Error(), "timeout") {
+			t.Errorf("expected timeout error, got: %v", err)
+		}
+	})
+
+	// Test script failure
+	t.Run("script_failure", func(t *testing.T) {
+		ctx := context.Background()
+		_, err := executor.Run(ctx, failScriptPath, nil)
+		if err == nil {
+			t.Error("expected script failure error")
+		}
+	})
+
+	// Test disallowed interpreter
+	t.Run("disallowed_interpreter", func(t *testing.T) {
+		pyScriptPath := filepath.Join(tmpDir, "test.py")
+		pyContent := `#!/usr/bin/env python3
+print("hello")
+`
+		if err := os.WriteFile(pyScriptPath, []byte(pyContent), 0755); err != nil {
+			t.Fatal(err)
+		}
+
+		ctx := context.Background()
+		_, err := executor.Run(ctx, pyScriptPath, nil)
+		if err == nil {
+			t.Fatal("expected error for disallowed interpreter") // Fatal: the next check dereferences err
+		}
+
+		if !strings.Contains(err.Error(), "not allowed") {
+			t.Errorf("expected 'not allowed' error, got: %v", err)
+		}
+	})
+
+	// Test allowed python interpreter
+	t.Run("allowed_python", func(t *testing.T) {
+		pyExecutor := NewExecutor([]string{"python3"}, 2*time.Second)
+
+		pyScriptPath := filepath.Join(tmpDir, "hello.py")
+		pyContent := `#!/usr/bin/env python3
+print("Hello from Python")
+`
+		if err := os.WriteFile(pyScriptPath, []byte(pyContent), 0755); err != nil {
+			t.Fatal(err)
+		}
+
+		ctx := context.Background()
+		output, err := pyExecutor.Run(ctx, pyScriptPath, nil)
+		if err != nil {
+			t.Fatalf("Run failed: %v", err)
+		}
+
+		if !strings.Contains(output, "Hello from Python") {
+			t.Errorf("expected 'Hello from Python' in output, got: %q", output)
+		}
+	})
+}
@@ -0,0 +1,223 @@
+package skills
+
+import (
+	"bufio"
+	"fmt"
+	"os"
+	"path/filepath"
+	"strings"
+
+	"github.com/enmanuel/agents/pkg/skills"
+	"gopkg.in/yaml.v3"
+)
+
+// Loader discovers and loads skills from a base directory.
+type Loader struct {
+	basePath string
+}
+
+// NewLoader creates a new Loader pointing at the skills directory.
+func NewLoader(basePath string) *Loader {
+	return &Loader{basePath: basePath}
+}
+
+// LoadMeta loads only the metadata (level 1) of every skill.
+// It walks the base directory looking for SKILL.md files and extracts the YAML frontmatter.
+func (l *Loader) LoadMeta() ([]skills.SkillMeta, error) {
+	var metas []skills.SkillMeta
+
+	// Walk the categories (devops/, analysis/, etc.)
+	categories, err := os.ReadDir(l.basePath)
+	if err != nil {
+		return nil, fmt.Errorf("read skills dir: %w", err)
+	}
+
+	for _, catEntry := range categories {
+		if !catEntry.IsDir() {
+			continue
+		}
+
+		category := catEntry.Name()
+		catPath := filepath.Join(l.basePath, category)
+
+		// Walk the skills inside the category
+		skillDirs, err := os.ReadDir(catPath)
+		if err != nil {
+			continue
+		}
+
+		for _, skillEntry := range skillDirs {
+			if !skillEntry.IsDir() {
+				continue
+			}
+
+			skillName := skillEntry.Name()
+			skillPath := filepath.Join(catPath, skillName)
+			skillMdPath := filepath.Join(skillPath, "SKILL.md")
+
+			// Check that SKILL.md exists
+			if _, err := os.Stat(skillMdPath); os.IsNotExist(err) {
+				continue
+			}
+
+			// Parse the metadata
+			meta, _, err := parseSkillMD(skillMdPath)
+			if err != nil {
+				continue // skip invalid skills
+			}
+
+			meta.Category = category
+			metas = append(metas, meta)
+		}
+	}
+
+	return metas, nil
+}
+
+// LoadSkill loads a complete skill (level 2) by name.
+// It returns the Skill struct with metadata, instructions, and the resource listing.
+func (l *Loader) LoadSkill(name string) (*skills.Skill, error) {
+	// Search every category
+	categories, err := os.ReadDir(l.basePath)
+	if err != nil {
+		return nil, fmt.Errorf("read skills dir: %w", err)
+	}
+
+	for _, catEntry := range categories {
+		if !catEntry.IsDir() {
+			continue
+		}
+
+		category := catEntry.Name()
+		skillPath := filepath.Join(l.basePath, category, name)
+		skillMdPath := filepath.Join(skillPath, "SKILL.md")
+
+		if _, err := os.Stat(skillMdPath); os.IsNotExist(err) {
+			continue
+		}
+
+		// Parse the complete skill
+		meta, instructions, err := parseSkillMD(skillMdPath)
+		if err != nil {
+			return nil, fmt.Errorf("parse %s: %w", skillMdPath, err)
+		}
+
+		meta.Category = category
+
+		skill := &skills.Skill{
+			Meta:         meta,
+			Instructions: instructions,
+			BasePath:     skillPath,
+			Scripts:      listFiles(filepath.Join(skillPath, "scripts")),
+			References:   listFiles(filepath.Join(skillPath, "references")),
+			Templates:    listFiles(filepath.Join(skillPath, "templates")),
+			Assets:       listFiles(filepath.Join(skillPath, "assets")),
+		}
+
+		return skill, nil
+	}
+
+	return nil, fmt.Errorf("skill not found: %s", name)
+}
+
+// ReadResource reads a specific resource (level 3) from a skill.
+// resourcePath is relative to the skill directory (e.g. "scripts/deploy.sh", "references/api.md").
+func (l *Loader) ReadResource(skillName, resourcePath string) (string, error) {
+	skill, err := l.LoadSkill(skillName)
+	if err != nil {
+		return "", err
+	}
+
+	fullPath := filepath.Join(skill.BasePath, resourcePath)
+
+	// Validate that the path stays inside the skill directory (path traversal guard).
+	absBasePath, err := filepath.Abs(skill.BasePath)
+	if err != nil {
+		return "", fmt.Errorf("abs base path: %w", err)
+	}
+
+	absFullPath, err := filepath.Abs(fullPath)
+	if err != nil {
+		return "", fmt.Errorf("abs resource path: %w", err)
+	}
+
+	// A plain strings.HasPrefix check would also accept sibling directories such as
+	// "<skill>-other", so compare via the relative path instead.
+	rel, err := filepath.Rel(absBasePath, absFullPath)
+	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
+		return "", fmt.Errorf("path traversal detected: %s", resourcePath)
+	}
+
+	content, err := os.ReadFile(absFullPath)
+	if err != nil {
+		return "", fmt.Errorf("read resource: %w", err)
+	}
+
+	return string(content), nil
+}
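Review note: the containment check that `ReadResource` depends on can be exercised in isolation. The sketch below is a minimal, standalone version under stated assumptions: `isWithin` is a hypothetical helper name (not part of the commit), the sample paths are made up, and symlinks are not resolved.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isWithin reports whether target is base itself or a path beneath it.
// Both paths are made absolute and cleaned first; symlinks are NOT resolved.
// Hypothetical helper, shown only to illustrate the check.
func isWithin(base, target string) bool {
	absBase, err := filepath.Abs(base)
	if err != nil {
		return false
	}
	absTarget, err := filepath.Abs(target)
	if err != nil {
		return false
	}
	rel, err := filepath.Rel(absBase, absTarget)
	if err != nil {
		return false
	}
	// A relative path that is ".." or starts with "../" escapes base.
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	base := "/srv/skills/devops/deploy"
	for _, p := range []string{
		"/srv/skills/devops/deploy/scripts/run.sh", // inside the skill
		"/srv/skills/devops/deploy-evil/run.sh",    // sibling that shares the string prefix
		"/srv/skills/devops/deploy/../../secret",   // climbs out via ".."
	} {
		fmt.Printf("%-45s within=%v\n", p, isWithin(base, p))
	}
}
```

The sibling case is the one a raw string-prefix comparison tends to miss, which is why the relative path is inspected instead. If skills may contain symlinks, resolving them first (for example with `filepath.EvalSymlinks`) would be a further hardening step.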
+
+// parseSkillMD extracts the YAML frontmatter and the markdown body from a SKILL.md file.
+func parseSkillMD(path string) (skills.SkillMeta, string, error) {
+	f, err := os.Open(path)
+	if err != nil {
+		return skills.SkillMeta{}, "", err
+	}
+	defer f.Close()
+
+	scanner := bufio.NewScanner(f)
+	var yamlLines []string
+	var bodyLines []string
+	inYAML := false
+	yamlClosed := false
+
+	for scanner.Scan() {
+		line := scanner.Text()
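+
+		// Only the first two "---" lines delimit the frontmatter. Later ones
+		// (e.g. a markdown horizontal rule in the body) belong to the body.
+		if !yamlClosed && strings.TrimSpace(line) == "---" {
+			if inYAML {
+				inYAML = false
+				yamlClosed = true
+			} else {
+				inYAML = true
+			}
+			continue
+		}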
+
+		if inYAML {
+			yamlLines = append(yamlLines, line)
+		} else if yamlClosed {
+			bodyLines = append(bodyLines, line)
+		}
+	}
+
+	if err := scanner.Err(); err != nil {
+		return skills.SkillMeta{}, "", err
+	}
+
+	// Parse YAML frontmatter
+	var meta skills.SkillMeta
+	yamlStr := strings.Join(yamlLines, "\n")
+	if err := yaml.Unmarshal([]byte(yamlStr), &meta); err != nil {
+		return skills.SkillMeta{}, "", fmt.Errorf("parse yaml: %w", err)
+	}
+
+	// Markdown body
+	body := strings.Join(bodyLines, "\n")
+
+	return meta, body, nil
+}
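Review note: frontmatter splitting is easy to get subtly wrong when the markdown body itself contains a `---` line, as the daily-report template in this commit does. The sketch below shows one way to split a document held in memory. `splitFrontmatter` is a hypothetical name, and unlike `parseSkillMD` it assumes the opening delimiter is the very first line.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontmatter separates a leading "---" delimited block from the rest.
// Only the first closing "---" ends the frontmatter; later "---" lines stay
// in the body. ok is false when no complete frontmatter block is found.
func splitFrontmatter(doc string) (front, body string, ok bool) {
	lines := strings.Split(doc, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return "", doc, false
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimSpace(lines[i]) == "---" {
			front = strings.Join(lines[1:i], "\n")
			body = strings.Join(lines[i+1:], "\n")
			return front, body, true
		}
	}
	// Opening delimiter without a closing one: treat everything as body.
	return "", doc, false
}

func main() {
	doc := "---\nname: demo\n---\n# Title\n\n---\nfooter"
	front, body, ok := splitFrontmatter(doc)
	fmt.Printf("ok=%v\nfront=%q\nbody=%q\n", ok, front, body)
}
```

The `front` string would then be handed to a YAML decoder, as the loader does with `yaml.Unmarshal`.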
+
+// listFiles returns the names of the files directly inside a directory (subdirectories are skipped).
+// If the directory does not exist or cannot be read, it returns nil.
+func listFiles(dir string) []string {
+	entries, err := os.ReadDir(dir)
+	if err != nil {
+		return nil
+	}
+
+	var files []string
+	for _, entry := range entries {
+		if !entry.IsDir() {
+			files = append(files, entry.Name())
+		}
+	}
+	return files
+}
@@ -0,0 +1,131 @@
+package skills
+
+import (
+	"os"
+	"path/filepath"
+	"testing"
+)
+
+func TestLoader(t *testing.T) {
+	// Create temporary skills directory structure
+	tmpDir := t.TempDir()
+
+	// Create a test skill
+	skillDir := filepath.Join(tmpDir, "devops", "test-skill")
+	if err := os.MkdirAll(skillDir, 0755); err != nil {
+		t.Fatal(err)
+	}
+
+	// Write SKILL.md
+	skillMD := `---
+name: test-skill
+description: A test skill for unit testing
+---
+
+# Test Skill
+
+This is the instructions body.
+It has multiple lines.
+`
+	skillMDPath := filepath.Join(skillDir, "SKILL.md")
+	if err := os.WriteFile(skillMDPath, []byte(skillMD), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	// Create scripts/ directory with a test script
+	scriptsDir := filepath.Join(skillDir, "scripts")
+	if err := os.MkdirAll(scriptsDir, 0755); err != nil {
+		t.Fatal(err)
+	}
+	scriptPath := filepath.Join(scriptsDir, "test.sh")
+	if err := os.WriteFile(scriptPath, []byte("#!/bin/bash\necho test"), 0755); err != nil {
+		t.Fatal(err)
+	}
+
+	// Create references/ directory with a test reference
+	refsDir := filepath.Join(skillDir, "references")
+	if err := os.MkdirAll(refsDir, 0755); err != nil {
+		t.Fatal(err)
+	}
+	refPath := filepath.Join(refsDir, "api.md")
+	if err := os.WriteFile(refPath, []byte("# API Reference"), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	loader := NewLoader(tmpDir)
+
+	// Test LoadMeta
+	t.Run("LoadMeta", func(t *testing.T) {
+		metas, err := loader.LoadMeta()
+		if err != nil {
+			t.Fatalf("LoadMeta failed: %v", err)
+		}
+
+		if len(metas) != 1 {
+			t.Fatalf("expected 1 skill, got %d", len(metas))
+		}
+
+		meta := metas[0]
+		if meta.Name != "test-skill" {
+			t.Errorf("expected name 'test-skill', got %q", meta.Name)
+		}
+		if meta.Category != "devops" {
+			t.Errorf("expected category 'devops', got %q", meta.Category)
+		}
+		if meta.Description != "A test skill for unit testing" {
+			t.Errorf("expected description 'A test skill for unit testing', got %q", meta.Description)
+		}
+	})
+
+	// Test LoadSkill
+	t.Run("LoadSkill", func(t *testing.T) {
+		skill, err := loader.LoadSkill("test-skill")
+		if err != nil {
+			t.Fatalf("LoadSkill failed: %v", err)
+		}
+
+		if skill.Meta.Name != "test-skill" {
+			t.Errorf("expected name 'test-skill', got %q", skill.Meta.Name)
+		}
+
+		if skill.Instructions == "" {
+			t.Error("instructions should not be empty")
+		}
+
+		if len(skill.Scripts) != 1 || skill.Scripts[0] != "test.sh" {
+			t.Errorf("expected Scripts=['test.sh'], got %v", skill.Scripts)
+		}
+
+		if len(skill.References) != 1 || skill.References[0] != "api.md" {
+			t.Errorf("expected References=['api.md'], got %v", skill.References)
+		}
+	})
+
+	// Test LoadSkill nonexistent
+	t.Run("LoadSkill_nonexistent", func(t *testing.T) {
+		_, err := loader.LoadSkill("nonexistent")
+		if err == nil {
+			t.Error("expected error for nonexistent skill")
+		}
+	})
+
+	// Test ReadResource
+	t.Run("ReadResource", func(t *testing.T) {
+		content, err := loader.ReadResource("test-skill", "scripts/test.sh")
+		if err != nil {
+			t.Fatalf("ReadResource failed: %v", err)
+		}
+
+		if content != "#!/bin/bash\necho test" {
+			t.Errorf("unexpected content: %q", content)
+		}
+	})
+
+	// Test ReadResource path traversal protection
+	t.Run("ReadResource_path_traversal", func(t *testing.T) {
+		_, err := loader.ReadResource("test-skill", "../../../etc/passwd")
+		if err == nil {
+			t.Error("expected error for path traversal attempt")
+		}
+	})
+}
@@ -1,62 +1,96 @@
-# Skills
+# Skills System
 
-Reusable skills system for agents. Skills are packages of instructions, scripts, and resources that guide the agent through complex multi-step tasks.
+Reusable skills system for Matrix agents. Skills are packages of instructions, scripts, and resources that extend an agent's capabilities beyond the function-calling tools.
 
 ## Difference between Tools and Skills
 
-| | Tools | Skills |
-|---|---|---|
-| **Level** | Atomic function | Multi-step flow |
-| **Invocation** | LLM function calling | The agent searches for it and loads it on demand |
-| **Example** | `ssh_command`, `http_get` | "deploy-service", "log-analyzer" |
-| **Location** | `tools/<name>/` | `skills/<category>/<name>/` |
+- **Tools** (`tools/`): atomic functions that the LLM invokes via function calling (ssh_command, http_get, clock, etc.)
+- **Skills** (`skills/`): complete multi-step workflows that combine tools, conditional logic, and domain knowledge
+
+Example:
+- Tool: `ssh_command` runs an SSH command
+- Skill: `deploy-service` uses ssh_command, http_get, and logic to perform a complete deploy
 
 ## Structure of a skill

 ```
-skills/<category>/<name>/
-├── SKILL.md              ← required (YAML frontmatter + instructions)
-├── scripts/              ← optional, executable code
+skills/<category>/<skill-name>/
+├── SKILL.md              ← required (YAML frontmatter + markdown instructions)
+├── LICENSE.txt           ← optional
+├── scripts/              ← optional, executable code (bash, python, etc.)
 ├── references/           ← optional, reference docs
-├── templates/            ← optional, templates
-└── assets/               ← optional, static files
+├── templates/            ← optional, templates/assets
+└── assets/               ← optional, fonts, icons, etc.
 ```

-## SKILL.md format
+### SKILL.md format

 ```yaml
 ---
-name: skill-name
+name: skill-name
 description: >
-  Description of what it does and when to activate it.
+  Clear description of what the skill does and when it should be activated.
+  This description is the main triggering mechanism.
 ---
 
 # Instructions
 
-Markdown body with the complete instructions (ideally < 500 lines).
+Markdown body with the complete instructions.
+Ideally < 500 lines.
 ```

-## Progressive loading
+## Progressive loading (3 levels)
 
-1. **Metadata** (name + description): always in the agent's context
-2. **Instructions** (SKILL.md body): when the skill is activated
-3. **Resources** (scripts/, references/, etc.): on demand
+The system loads skills progressively to optimize use of the LLM context:
 
-## Categories
+1. **Metadata** (name + description): always in context (~100 words). The agent reads it to decide whether to activate the skill.
+2. **SKILL.md body**: loaded when the skill is activated. Main instructions.
+3. **Bundled resources** (scripts/, references/, etc.): loaded on demand. SKILL.md indicates when to read each file.
 
-| Category | Description |
-|-----------|-------------|
-| `devops/` | Operations, deploy, infrastructure |
-| `analysis/` | Data, log, and metrics analysis |
-| `communication/` | Notifications, reports, messaging |
-| `coding/` | Development, code review, refactoring |
-| `system/` | System administration, monitoring |
+## Optional folders
 
-## Creating a new skill
+| Folder | Purpose |
+|--------|---------|
+| `scripts/` | Executable code that the agent runs (bash, python). It can run them without loading them into context. |
+| `references/` | Extensive documentation, read only when relevant. If > 300 lines, add a TOC at the top. |
+| `templates/` | Templates that the skill uses as a base for generating outputs. |
+| `assets/` | Static files (fonts, icons, images). |
 
-1. Create the directory: `skills/<category>/<name>/`
-2. Create `SKILL.md` with YAML frontmatter (name + description) and a markdown body
-3. Optionally add scripts/, references/, templates/, assets/
-4. The skill will be automatically available to agents with `skills.enabled: true`
+## Skill categories
 
-See the full rule in `.claude/rules/create_skill.md` (pending).
+- **`devops/`**: operations and deploy
+- **`analysis/`**: data/log analysis
+- **`communication/`**: communication and notifications
+- **`coding/`**: development and code review
+- **`system/`**: system administration
+
+## Usage from agents
+
+Agents can interact with skills via function calling:
+
+1. **`skill_search`**: searches for relevant skills by query
+2. **`skill_load`**: loads the complete instructions of a skill
+3. **`skill_read_resource`**: reads a specific resource (script, reference, template)
+4. **`skill_run_script`**: runs a script from the skill with arguments
+
+## Configuration
+
+Skills are configured per agent in the configuration YAML:
+
+```yaml
+skills:
+  enabled: true
+  path: "skills/"
+  categories: ["devops", "system"]  # optional filter
+```
+
+## Security
+
+- Skill scripts are subject to the same restrictions as ssh_command
+- Allowlist of permitted interpreters (bash, python3, sh)
+- Mandatory timeout on execution
+- No direct access to secrets
+
+## Creating new skills
+
+See `.claude/rules/create_skill.md` for the complete guide to creating skills.
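Review note: the optional `categories` filter in the README's configuration section could behave roughly like the sketch below. This is an illustration only. The `SkillMeta` struct here mirrors just two fields, `filterByCategory` is a hypothetical name, the sample categories are made up, and treating an empty list as "no filter" is an assumption rather than documented behavior of `SkillsCfg`.

```go
package main

import "fmt"

// SkillMeta mirrors only the fields this sketch needs (hypothetical shape).
type SkillMeta struct {
	Name     string
	Category string
}

// filterByCategory keeps the metas whose category appears in allowed.
// An empty allowed list is treated as "no filter" (assumption).
func filterByCategory(metas []SkillMeta, allowed []string) []SkillMeta {
	if len(allowed) == 0 {
		return metas
	}
	set := make(map[string]struct{}, len(allowed))
	for _, c := range allowed {
		set[c] = struct{}{}
	}
	var out []SkillMeta
	for _, m := range metas {
		if _, ok := set[m.Category]; ok {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	// Illustrative data; the category assignments are not taken from the commit.
	metas := []SkillMeta{
		{Name: "deploy-service", Category: "devops"},
		{Name: "log-analyzer", Category: "analysis"},
		{Name: "health-check", Category: "system"},
	}
	for _, m := range filterByCategory(metas, []string{"devops", "system"}) {
		fmt.Println(m.Category + "/" + m.Name)
	}
}
```

Filtering on metadata alone keeps the check cheap, since level-1 metadata is already loaded for every skill.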
@@ -1,70 +1,123 @@
 ---
 name: log-analyzer
 description: >
-  Analyzes service logs to find errors, patterns, and anomalies.
-  Use this skill when the user asks to review logs, look for errors in a
-  service, diagnose problems, or understand what happened during a time period.
-  Works with journalctl, log files, and Docker logs.
+  Analyzes service logs looking for error patterns, warnings, and anomalies.
+  Generates a structured summary with metrics, frequent errors, and recommendations.
 ---
 
-# Log analysis
+# Log Analyzer Skill
 
-## Prerequisites
+This skill analyzes service logs and generates structured reports with findings and recommendations.
 
- Tool `ssh_command` enabled (for remote logs)
- Access to the server where the logs live
+## Use cases
 
-## Flow
+- Analyze the logs of a failing service
+- Look for recurring error patterns
+- Generate health metrics for a service
+- Detect anomalies in logs
 
-### 1. Identify the log source
+## Analysis process
 
-Ask, or infer from context:
- **Service**: name of the service or container
- **Period**: time range to analyze (default: last hour)
- **Server**: host where the logs are produced
- **Type**: systemd (journalctl), file (/var/log/...), or Docker
+### 1. Fetch the logs
 
-### 2. Fetch logs
+Options:
+- Via SSH: `ssh_command` with `journalctl` or `tail`
+- Via HTTP: `http_get` if the service exposes logs through an API
+- From a local file: `file_read` (if the agent has the tool)
 
-Depending on the source type:
-
-**Systemd (journalctl):**
+Example with journalctl:
 ```bash
-ssh_command: "journalctl -u <service> --since '<period>' --no-pager"
+journalctl -u <service-name> --since "1 hour ago" -n 1000 --no-pager
 ```
 
-**Log file:**
-```bash
-ssh_command: "tail -n 500 /var/log/<service>/<file>.log"
-# or with a time filter:
-ssh_command: "awk '/2024-01-15 14:00/,/2024-01-15 15:00/' /var/log/<service>.log"
+### 2. Parse the logs
+
+Identify the log format:
+- Structured JSON
+- systemd format
+- Plain logs with a timestamp
+
+Extract the key fields:
+- Timestamp
+- Log level (ERROR, WARN, INFO, DEBUG)
+- Message
+- Stack traces (if applicable)
+
+### 3. Analyze patterns
+
+Look for:
+- Recurring errors (group by similar message)
+- Activity spikes (timeframes with many log lines)
+- Critical errors (FATAL, PANIC, segfaults)
+- Timeouts and connection errors
+- Unhandled exceptions
+
+### 4. Generate metrics
+
+Compute:
+- Total lines analyzed
+- Count per level (ERROR, WARN, INFO)
+- Top 10 most frequent errors
+- Error timeline (distribution over time)
+- Error rate (errors per minute)
+
+### 5. Generate the report
+
+Report format:
+
+```markdown
+## Log Analysis Report
+
+**Service**: <service-name>
+**Period**: <start> - <end>
+**Total lines**: <N>
+
+### Metrics
+
+- Errors: <N> (<percentage>%)
+- Warnings: <N> (<percentage>%)
+- Info: <N> (<percentage>%)
+- Error rate: <N> errors/min
+
+### Top Errors
+
+1. <error-message> (<N> occurrences)
+2. <error-message> (<N> occurrences)
+...
+
+### Critical Issues
+
+- <description>
+- <description>
+
+### Recommendations
+
+- <recommendation>
+- <recommendation>
 ```

-**Docker:**
-```bash
-ssh_command: "docker logs --since <period> <container> 2>&1 | tail -500"
-```
+## Required parameters
 
-### 3. Analysis
+- `source`: "ssh", "http", or "file"
+- `service_name`: name of the service (if source=ssh)
+- `host`: server (if source=ssh)
+- `log_url`: URL of the logs (if source=http)
+- `file_path`: path to the file (if source=file)
+- `timeframe`: "1 hour", "24 hours", "7 days", etc.
 
-Look in the logs for:
- **Errors**: lines with ERROR, FATAL, panic, exception, traceback
- **Warnings**: lines with WARN, warning
- **Repeating patterns**: errors that recur (group by type)
- **Timestamps**: when the problems started
- **Correlations**: errors that occur together or in sequence
+Optional parameters:
+- `filter`: regex pattern for filtering lines
+- `max_lines`: maximum number of lines to analyze (default: 10000)
+- `output_format`: "markdown" or "json"
 
-### 4. Report
+## Usage example
 
-Present to the user:
-1. **Summary**: overall state (healthy / with problems / critical)
-2. **Errors found**: list grouped by type, with counts
-3. **Timeline**: when they started and whether they are still occurring
-4. **Probable cause**: if it can be inferred from context
-5. **Recommendation**: suggested action (restart, fix config, escalate, etc.)
+User: "Analyze the logs of myapp on prod-server-01 for the last hour"
 
-## Tips
-
- If the logs are very large, first get an error count and then the details of the most frequent ones
- Use `grep -c` to count before fetching full lines
- For large logs, use `tail` or narrow time ranges
+Agent:
+1. skill_search("analyze logs")
+2. skill_load("log-analyzer")
+3. ssh_command to fetch logs via journalctl
+4. Parse and analyze the logs
+5. Generate the markdown report
+6. Send the report to the user
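Review note: the metrics step of the log-analyzer skill (count per level, most frequent errors) can be sketched in a few lines of Go. This is a deliberately naive illustration, not the skill's implementation. It assumes plain-text lines where the level keyword appears literally, and `Summary`, `ErrorCount`, and `summarize` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ErrorCount pairs an error message with how often it was seen.
type ErrorCount struct {
	Message string
	Count   int
}

// Summary holds the aggregate metrics for a batch of log lines.
type Summary struct {
	Total  int
	Levels map[string]int
	Top    []ErrorCount
}

// Checked in this order; the first keyword found in a line wins (naive).
var levels = []string{"ERROR", "WARN", "INFO", "DEBUG"}

// summarize counts lines per level and ranks ERROR messages by frequency.
func summarize(lines []string, topN int) Summary {
	s := Summary{Levels: map[string]int{}}
	errs := map[string]int{}
	for _, line := range lines {
		if strings.TrimSpace(line) == "" {
			continue
		}
		s.Total++
		for _, lv := range levels {
			idx := strings.Index(line, lv)
			if idx < 0 {
				continue
			}
			s.Levels[lv]++
			if lv == "ERROR" {
				// Everything after the keyword is treated as the message.
				msg := strings.TrimLeft(strings.TrimSpace(line[idx+len(lv):]), ":] ")
				errs[msg]++
			}
			break
		}
	}
	for m, c := range errs {
		s.Top = append(s.Top, ErrorCount{Message: m, Count: c})
	}
	// Most frequent first; ties broken alphabetically for stable output.
	sort.Slice(s.Top, func(i, j int) bool {
		if s.Top[i].Count != s.Top[j].Count {
			return s.Top[i].Count > s.Top[j].Count
		}
		return s.Top[i].Message < s.Top[j].Message
	})
	if len(s.Top) > topN {
		s.Top = s.Top[:topN]
	}
	return s
}

func main() {
	lines := []string{
		"2024-01-15T14:00:01Z ERROR connection timeout",
		"2024-01-15T14:00:02Z INFO request served",
		"2024-01-15T14:00:03Z ERROR connection timeout",
		"2024-01-15T14:00:04Z WARN slow query",
	}
	s := summarize(lines, 10)
	fmt.Println("total:", s.Total, "levels:", s.Levels)
	for i, e := range s.Top {
		fmt.Printf("%d. %s (%d occurrences)\n", i+1, e.Message, e.Count)
	}
}
```

Structured JSON logs or messages that embed variable data (IDs, addresses) would need real parsing and message normalization before grouping.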
@@ -1,84 +1,166 @@
 ---
 name: daily-report
 description: >
-  Generates and sends a daily report with service status, key metrics,
-  and relevant events. Use this skill when the user asks for a summary of the
-  day, the overall state of the systems, or a periodic report. It can also
-  be triggered automatically via schedule.
+  Generates and sends a daily report with service metrics, health status,
+  recent incidents, and pending tasks. It can be sent via Matrix to a specific
+  room or saved as a file.
 ---
 
-# Daily report
+# Daily Report Skill
 
-## Prerequisites
+This skill generates automatic daily reports by consolidating information from multiple sources.
 
- Tool `ssh_command` to query service status
- Tool `http_get` for health checks
- Tool `matrix_send` to send the report to a room
+## Purpose
 
-## Flow
+- Provide daily visibility into service status
+- Consolidate metrics from different sources
+- Alert on anomalies or degradation
+- Track incidents and resolutions
 
-### 1. Collect data
+## Data sources
 
-Run in parallel when possible:
+The report can include data from:
+- Service status (via SSH + systemctl)
+- HTTP metrics (via health endpoints)
+- Log analysis (via the log-analyzer skill)
+- Resource usage (CPU, memory, disk via SSH)
+- Recent incidents (from a database or API)
 
-**Service status:**
-```bash
-ssh_command: "systemctl list-units --type=service --state=running --no-pager | grep -E '<servicios>'"
-ssh_command: "systemctl list-units --type=service --state=failed --no-pager"
-```
-
-**Resource usage:**
-```bash
-ssh_command: "df -h / | tail -1"
-ssh_command: "free -h | grep Mem"
-ssh_command: "uptime"
-```
-
-**Recent errors:**
-```bash
-ssh_command: "journalctl --priority=err --since '24 hours ago' --no-pager | wc -l"
-ssh_command: "journalctl --priority=err --since '24 hours ago' --no-pager | tail -10"
-```
-
-**HTTP health checks:**
-```
-http_get: "<url>/health" for each service with an endpoint
-```
-
-### 2. Format the report
-
-Generate markdown with:
+## Report structure

 ```markdown
-## Daily report - <date>
+# Daily Report - <date>
 
-### Service status
-| Service | Status | Uptime |
-|----------|--------|--------|
-| service-1 | OK | 5d 3h |
-| service-2 | FAILED | - |
+## Services Status
 
-### Resources
- **Disk**: 45% used (55GB free)
- **Memory**: 3.2GB / 8GB (40%)
- **Load average**: 0.5, 0.3, 0.2
+| Service | Host | Status | Uptime |
+|---------|------|--------|--------|
+| myapp   | prod-01 | running | 15d 3h |
+| worker  | prod-02 | running | 2d 8h  |
 
-### Errors (last 24h)
- Total: 15 errors
- Most frequent: "connection timeout" (8 times)
+## Health Metrics
 
-### Alerts
- service-2 has been down since 14:30
- Disk at 85% on /var/log (clean up)
+- Total requests: <N>
+- Error rate: <percentage>%
+- Avg response time: <N>ms
+- P99 latency: <N>ms
+
+## Incidents
+
+- [RESOLVED] Database connection timeout - 14:30 - Fixed by restarting pool
+- [OPEN] High memory usage on worker - Since 18:00
+
+## Warnings
+
+- Service X disk usage: 85%
+- Service Y error rate: 3.2% (threshold: 2%)
+
+## System Resources
+
+| Host | CPU | Memory | Disk |
+|------|-----|--------|------|
+| prod-01 | 45% | 62% | 71% |
+| prod-02 | 23% | 48% | 55% |
+
+## Recommendations
+
+- Investigate memory leak in worker service
+- Plan disk cleanup on prod-01
+
+---
+Generated by <agent-name> at <timestamp>
 ```

-### 3. Send
+## Generation process
 
-Send the formatted report to the configured room or to the room where it was requested.
+### 1. Collect service data
 
-## Customization
+For each configured service:
+```bash
+systemctl status <service> --no-pager
+```

-The report can be adapted based on:
- List of services to monitor (from the agent config)
- Servers to query (from ssh.allowed_targets)
- Alert thresholds (disk > 80%, memory > 90%, etc.)
+Extract: status, uptime, latest logs
+
+### 2. Check health endpoints
+
+If the service exposes /health or /metrics:
+```bash
+http_get http://<host>:<port>/health
+```
+
+### 3. Analyze recent logs
+
+Use the `log-analyzer` skill for each service:
+- Last 24h of logs
+- Error/warning counts
+- Critical errors
+
+### 4. Collect system metrics
+
+```bash
+# CPU and memory
+top -bn1 | head -20
+
+# Disk
+df -h
+```
+
+### 5. Consolidate and format
+
+- Generate the report markdown
+- Apply the template if it exists (templates/daily-report.md)
+- Include a timestamp and the agent's signature
+
+### 6. Send the report
+
+Options:
+- Send to a Matrix room (via send_message)
+- Save as a file (via file_write)
+- Send via email (if an email tool is available)
+
+## Configuration
+
+The agent must have the following configured:
+- List of services to monitor
+- Hosts where they run
+- Health endpoints (optional)
+- Destination room or file path for the report
+
+Example config (in the agent config YAML):
+```yaml
+daily_report:
+  services:
+    - name: myapp
+      host: prod-01
+      health_url: http://localhost:8080/health
+    - name: worker
+      host: prod-02
+  destination:
+    type: matrix
+    room_id: "!reportroom:matrix.org"
+  schedule: "0 9 * * *"  # daily at 9am
+```
+
+## Parameters
+
+Optional parameters when running manually:
+- `date`: report date (default: today)
+- `services`: list of services to include (default: all configured)
+- `destination`: override of the destination (room_id or file_path)
+- `include_recommendations`: true/false (default: true)
+
+## Usage example
+
+User: "Generate the daily report"
+
+Agent:
+1. skill_search("daily report")
+2. skill_load("daily-report")
+3. Collect data from all configured services
+4. Generate the report markdown
+5. Send to the configured room or show it to the user
+
+## Automation
+
+This skill is designed to run via cron. See `crons/daily-report/` for the automation configuration.
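Review note: the "Services Status" table in the daily-report structure is regular enough to be rendered mechanically. The sketch below is one hypothetical way to do it in Go. `ServiceStatus` and `renderServices` are illustrative names, and the column set simply copies the template in this SKILL.md.

```go
package main

import (
	"fmt"
	"strings"
)

// ServiceStatus is one row of the "Services Status" table (hypothetical type).
type ServiceStatus struct {
	Name, Host, Status, Uptime string
}

// renderServices renders the rows as a markdown table with a fixed header.
func renderServices(rows []ServiceStatus) string {
	var b strings.Builder
	b.WriteString("| Service | Host | Status | Uptime |\n")
	b.WriteString("|---------|------|--------|--------|\n")
	for _, r := range rows {
		fmt.Fprintf(&b, "| %s | %s | %s | %s |\n", r.Name, r.Host, r.Status, r.Uptime)
	}
	return b.String()
}

func main() {
	fmt.Print(renderServices([]ServiceStatus{
		{Name: "myapp", Host: "prod-01", Status: "running", Uptime: "15d 3h"},
		{Name: "worker", Host: "prod-02", Status: "running", Uptime: "2d 8h"},
	}))
}
```

Cells are not padded to equal width here. Markdown renderers do not require alignment, so the output differs cosmetically from the hand-aligned example in the template.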
@@ -1,89 +1,111 @@
 ---
 name: deploy-service
 description: >
-  Deploys a service to a remote server via SSH. Use this skill when
-  the user asks to deploy, update a service, or push changes to
-  production/staging. Covers: code pull, build, service restart,
-  and post-deploy verification.
+  Deploys a service via SSH to a remote server. Verifies that the service
+  is running, backs up the previous version, updates the binary,
+  restarts the service, and validates that it responds correctly.
 ---
 
-# Service deploy via SSH
+# Deploy Service Skill
+
+This skill guides the complete process of deploying a service to production via SSH.
 
 ## Prerequisites
 
- The agent must have the `ssh_command` tool enabled
- The target server must be configured in `ssh.allowed_targets`
- The user must have deploy permissions (check roles if RBAC is active)
+- SSH access to the target server
+- The service must be configured as a systemd unit
+- The compiled binary must be available locally or via URL
 
-## Flow
+## Deploy process
 
-### 1. Confirm parameters
+### 1. Check the service status
 
-Before running, confirm with the user:
- **Service**: name of the service to deploy
- **Server**: target host (must be in allowed_targets)
- **Branch/tag**: branch or tag to deploy (default: main)
- **Path**: service directory on the server
-
-### 2. Pre-checks
-
-Run on the remote server:
-```bash
-# Check connectivity
-ssh_command: "echo 'OK'" on the target host
-
-# Check that the directory exists
-ssh_command: "test -d /path/to/service && echo 'exists'"
-
-# Check the current service status
-ssh_command: "systemctl is-active service-name || true"
-```
-
-If any pre-check fails, inform the user and do not continue.
-
-### 3. Deploy
-
-Run sequentially:
-```bash
-# Pull changes
-ssh_command: "cd /path/to/service && git fetch origin && git checkout <branch> && git pull"
-
-# Build (if applicable)
-ssh_command: "cd /path/to/service && make build"
-# or: "cd /path/to/service && go build -o bin/service ./cmd/..."
-# or: "cd /path/to/service && docker-compose build"
-
-# Restart the service
-ssh_command: "sudo systemctl restart service-name"
-```
-
-### 4. Post-deploy verification
+Use `ssh_command` to check the current status of the service:

 ```bash
-# Wait 5 seconds for the service to start
-ssh_command: "sleep 5 && systemctl is-active service-name"
-
-# Check recent logs (look for errors)
-ssh_command: "journalctl -u service-name --since '1 min ago' --no-pager | tail -20"
-
-# HTTP health check if applicable
-http_get: "http://server:port/health"
+systemctl status <service-name>
 ```

-### 5. Report the result
+If the service does not exist, ask the user whether it should be created.
 
-Inform the user:
- Deploy status (succeeded/failed)
- Deployed version (commit hash or tag)
- Service status after the deploy
- Any warnings in the logs
+### 2. Back up the previous version
 
-## Rollback
-
-If the deploy fails or the service does not start:
 ```bash
-ssh_command: "cd /path/to/service && git checkout <previous-commit>"
-ssh_command: "sudo systemctl restart service-name"
+cp /path/to/service /path/to/service.backup.$(date +%Y%m%d-%H%M%S)
 ```

-Inform the user that a rollback was performed and why.
+### 3. Stop the service
+
+```bash
+systemctl stop <service-name>
+```
+
+### 4. Update the binary
+
+Options:
+- If the binary is local: use `scp` or `ssh_command` with a heredoc
+- If the binary is at a URL: use `ssh_command` with `wget` or `curl`
+
+```bash
+# Example with a URL
+wget -O /path/to/service <binary-url>
+chmod +x /path/to/service
+```
+
+### 5. Start the service again
+
+```bash
+systemctl start <service-name>
+```
+
+### 6. Verify that the service responds
+
+Wait 5 seconds and check:
+
+```bash
+systemctl is-active <service-name>
+```
+
+If the service exposes an HTTP endpoint, check that it responds, either with `http_get` or by running `curl` on the host via `ssh_command`:
+
+```bash
+curl -f http://localhost:<port>/health
+```
+
+### 7. Roll back on error
+
+If the service does not start or does not respond:
+
+1. Stop the service
+2. Restore the backup
+3. Restart with the previous version
+4. Notify the user of the error
+
+## Required parameters
+
+The user must provide:
+- `host`: target server (e.g. "prod-server-01")
+- `service_name`: name of the systemd unit (e.g. "myapp.service")
+- `service_path`: path to the binary on the server (e.g. "/usr/local/bin/myapp")
+- `binary_source`: "local" or the URL of the binary
+
+Optional parameters:
+- `health_endpoint`: HTTP endpoint for checking health (e.g. "http://localhost:8080/health")
+- `post_deploy_command`: additional command to run after the deploy
+
+## Security
+
+- Validate that the host is in the agent's SSH allowlist
+- Validate that the binary has the correct checksum (if one is provided)
+- Never run arbitrary commands without validating them
+
+## Usage example
+
+User: "Deploy myapp to prod-server-01"
+
+Agent:
+1. skill_search("deploy service")
+2. skill_load("deploy-service")
+3. Ask for missing parameters
+4. Run the process step by step
+5. Report the result
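Review note: the deploy-service flow (backup, stop, update, start, verify, roll back on failure) can be expressed as a small state-free routine, which makes the rollback path easy to test with a fake command runner. The sketch below is an illustration under assumptions. `Runner` is a hypothetical stand-in for the `ssh_command` tool, the backup uses a fixed `.backup` suffix instead of the timestamped name from the skill, the 5-second wait is omitted, and all inputs are assumed to be validated already, since they are concatenated into shell commands.

```go
package main

import (
	"errors"
	"fmt"
)

// Runner executes one command on the target host (hypothetical stand-in
// for the ssh_command tool).
type Runner func(cmd string) error

// errInactive is used by the demo runner in main to simulate a failed check.
var errInactive = errors.New("service inactive")

// rollback restores the backup and starts the previous version.
func rollback(run Runner, service, path, backup string) error {
	for _, cmd := range []string{
		"systemctl stop " + service,
		"cp " + backup + " " + path,
		"systemctl start " + service,
	} {
		if err := run(cmd); err != nil {
			return fmt.Errorf("rollback %q: %w", cmd, err)
		}
	}
	return nil
}

// deploy backs up the binary, replaces it from url, and verifies the service.
// Any failure after the service is stopped triggers a rollback.
func deploy(run Runner, service, path, url string) error {
	backup := path + ".backup"
	if err := run("cp " + path + " " + backup); err != nil {
		return fmt.Errorf("backup: %w", err)
	}
	if err := run("systemctl stop " + service); err != nil {
		return fmt.Errorf("stop: %w", err)
	}
	steps := []string{
		"wget -O " + path + " " + url,
		"chmod +x " + path,
		"systemctl start " + service,
		"systemctl is-active " + service,
	}
	for _, cmd := range steps {
		if err := run(cmd); err != nil {
			rbErr := rollback(run, service, path, backup)
			return errors.Join(fmt.Errorf("deploy step %q: %w", cmd, err), rbErr)
		}
	}
	return nil
}

func main() {
	// Demo runner: prints each command and fails the post-deploy check.
	run := func(cmd string) error {
		fmt.Println("$", cmd)
		if cmd == "systemctl is-active myapp" {
			return errInactive
		}
		return nil
	}
	err := deploy(run, "myapp", "/usr/local/bin/myapp", "https://example.invalid/myapp")
	fmt.Println("result:", err)
}
```

Injecting the runner keeps the sequencing logic independent of SSH, so the same routine can be driven by a recording fake in tests. `errors.Join` requires Go 1.20 or later.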
@@ -1,95 +1,187 @@
 ---
 name: health-check
 description: >
-  Checks the health of one or more services and servers. Use this skill
-  when the user asks whether something is working, whether a service is
-  up, or requests a general health check. Also useful after a
-  deploy or an incident.
+  Checks the health of services and systems. Validates that services are running,
+  HTTP endpoints respond, resource usage is within limits, and there are no
+  critical errors in recent logs.
 ---
 
-# Service health check
+# Health Check Skill
 
-## Prerequisites
+This skill performs complete health checks of services and systems.
 
- Tool `ssh_command` enabled
- Tool `http_get` for HTTP endpoints
- Servers in `ssh.allowed_targets`
+## Purpose
 
-## Flow
+- Verify that critical services are running
+- Validate that HTTP endpoints respond correctly
+- Detect excessive resource usage (CPU, memory, disk)
+- Identify critical errors in recent logs
+- Generate a health report with a score and recommendations
 
-### 1. Determine scope
+## Checks performed
 
-Infer from the user's context:
- **Specific service**: check only that service
- **Specific server**: check all services on that server
- **General**: check all known servers and services
+### 1. Service status (systemd)
 
-### 2. Basic checks per server
-
-For each server:
+For each configured service:
 ```bash
-# Connectivity
-ssh_command: "echo OK"
-
-# Uptime and load
-ssh_command: "uptime"
-
-# Disk
-ssh_command: "df -h / /var /tmp 2>/dev/null | tail -n +2"
-
-# Memory
-ssh_command: "free -m | grep -E 'Mem|Swap'"
-
-# Zombie processes or processes in D state
-ssh_command: "ps aux | awk '{if ($8 ~ /Z|D/) print}' | head -5"
+systemctl is-active <service-name>
 ```

-### 3. Checks per service
+Expected state: `active`
 
-For each service:
+### 2. HTTP health endpoints
+
+If the service exposes a health endpoint:
 ```bash
-# systemd status
-ssh_command: "systemctl is-active <service>"
-
-# Active since
-ssh_command: "systemctl show <service> --property=ActiveEnterTimestamp --value"
-
-# Open ports
-ssh_command: "ss -tlnp | grep <port>"
-
-# Latest errors
-ssh_command: "journalctl -u <service> --priority=err --since '1 hour ago' --no-pager | tail -5"
+http_get http://<host>:<port>/health
 ```

-**HTTP health check** (if it has an endpoint):
-```
-http_get: "http://<host>:<port>/health"
+Validations:
+- Status code: 200
+- Response time: < 1000ms
+- Body contains: `"status": "ok"` (or similar)
+
+### 3. System resources
+
+```bash
+# CPU usage
+top -bn1 | grep "Cpu(s)" | awk '{print $2}'
+
+# Memory usage
+free -m | awk 'NR==2{printf "%.0f", $3*100/$2}'
+
+# Disk usage
+df -h / | awk 'NR==2{print $5}' | sed 's/%//'
 ```

-### 4. Evaluate and report
+Thresholds:
+- CPU: warning >70%, critical >90%
+- Memory: warning >80%, critical >95%
+- Disk: warning >85%, critical >95%
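Review note: the warning/critical thresholds listed for the health-check skill map naturally onto a small classification function. The sketch below hard-codes the values from the list for illustration. `Level`, `Threshold`, and `classify` are hypothetical names, and treating a value exactly equal to a threshold as still below it is an assumption based on the ">" in the list.

```go
package main

import "fmt"

// Level is the outcome of comparing a metric against its thresholds.
type Level int

const (
	OK Level = iota
	Warning
	Critical
)

func (l Level) String() string {
	switch l {
	case Warning:
		return "WARNING"
	case Critical:
		return "CRITICAL"
	default:
		return "OK"
	}
}

// Threshold holds the warning and critical limits for one metric, in percent.
type Threshold struct {
	Warn, Crit float64
}

// Values copied from the thresholds list in the SKILL.md.
var thresholds = map[string]Threshold{
	"cpu":    {Warn: 70, Crit: 90},
	"memory": {Warn: 80, Crit: 95},
	"disk":   {Warn: 85, Crit: 95},
}

// classify returns Critical or Warning when v is strictly above the limit.
func classify(v float64, t Threshold) Level {
	switch {
	case v > t.Crit:
		return Critical
	case v > t.Warn:
		return Warning
	default:
		return OK
	}
}

func main() {
	usage := map[string]float64{"cpu": 45, "memory": 82, "disk": 96}
	for _, name := range []string{"cpu", "memory", "disk"} {
		fmt.Printf("%-6s %5.1f%% %s\n", name, usage[name], classify(usage[name], thresholds[name]))
	}
}
```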

-Classify each component:
+### 4. Recent logs (last 15 minutes)
 
-| Status | Criteria |
-|--------|----------|
-| OK | Service active, no recent errors, resources normal |
-| WARNING | Service active but with recent errors, or resources > 80% |
-| CRITICAL | Service down, disk full, or memory exhausted |
-
-Report to the user in a clear format:
-```
-Health Check - <date time>
-
-[OK] server-1: load 0.3, disk 45%, mem 40%
-  [OK] service-a: active (uptime 5d), 0 errors
-  [WARNING] service-b: active, 3 errors in the last hour
-[CRITICAL] server-2: SSH not responding
+```bash
+journalctl -u <service> --since "15 minutes ago" --no-pager | grep -iE "error|fatal|panic"
 ```

-### 5. Recomendaciones
+Validacion: sin errores criticos en los ultimos 15 minutos

-If there are problems, suggest actions:
- Service down → "Try a restart: `systemctl restart <servicio>`"
- Disk full → "Clean up old logs or expand the disk"
- High memory → "Review the processes with the highest consumption"
- Frequent errors → "Review the logs with the log-analyzer skill"
+### 5. Network connectivity (optional)
+
+If the service depends on external services:
+```bash
+# Connectivity test
+curl -f --max-time 5 http://<dependency-host>/health
+```
+
+## Report format
+
+```markdown
+# Health Check Report - <timestamp>
+
+## Overall Health: <HEALTHY|DEGRADED|CRITICAL>
+
+Score: <N>/100
+
+## Service Status
+
+| Service | Status | Health Endpoint | Response Time |
+|---------|--------|----------------|---------------|
+| myapp   | running | OK (200) | 45ms |
+| worker  | running | OK (200) | 32ms |
+
+## System Resources
+
+| Metric | Value | Status |
+|--------|-------|--------|
+| CPU Usage | 45% | OK |
+| Memory Usage | 62% | OK |
+| Disk Usage | 71% | OK |
+
+## Issues Found
+
+- None
+
+## Warnings
+
+- Disk usage on / is at 71%, approaching the 85% warning threshold
+
+## Recommendations
+
+- Monitor disk usage trend
+- Consider log rotation policy
+
+---
+Next check: <timestamp>
+```
+
+## Score calculation
+
+Total score (0-100):
+- Services running: 40 points (split evenly across services)
+- Health endpoints OK: 30 points (split evenly across endpoints)
+- Resources within limits: 20 points
+- No critical errors in logs: 10 points
+
+Overall status:
+- HEALTHY: score >= 90
+- DEGRADED: score >= 70 && < 90
+- CRITICAL: score < 70
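The weighting and the status bands above can be sketched in Go as follows. `CheckResult`, `Score`, and `Status` are hypothetical names; the sketch also assumes integer division for the per-service split and awards full points for a category with nothing to check, neither of which the skill specifies.

```go
package main

import "fmt"

// CheckResult is a hypothetical summary of one health-check run.
type CheckResult struct {
	ServicesTotal   int
	ServicesRunning int
	EndpointsTotal  int
	EndpointsOK     int
	ResourcesOK     bool
	LogsClean       bool
}

// share splits weight evenly across total items and credits the ok ones.
// Assumption: when there is nothing to check, the full weight is awarded.
func share(weight, ok, total int) int {
	if total == 0 {
		return weight
	}
	return weight * ok / total
}

// Score applies the 40/30/20/10 weighting described above.
func Score(r CheckResult) int {
	score := share(40, r.ServicesRunning, r.ServicesTotal)
	score += share(30, r.EndpointsOK, r.EndpointsTotal)
	if r.ResourcesOK {
		score += 20
	}
	if r.LogsClean {
		score += 10
	}
	return score
}

// Status maps a score to the overall health label.
func Status(score int) string {
	switch {
	case score >= 90:
		return "HEALTHY"
	case score >= 70:
		return "DEGRADED"
	default:
		return "CRITICAL"
	}
}

func main() {
	// One of two services is down; everything else is fine.
	r := CheckResult{
		ServicesTotal: 2, ServicesRunning: 1,
		EndpointsTotal: 2, EndpointsOK: 2,
		ResourcesOK: true, LogsClean: true,
	}
	s := Score(r)
	fmt.Printf("score=%d status=%s\n", s, Status(s))
}
```

With these weights, a single stopped service out of two already drops the score to 80, which lands in DEGRADED.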
+
+## Parameters
+
+Required parameters:
+- `services`: list of services to check (default: all configured)
+
+Optional parameters:
+- `include_resources`: check system resources (default: true)
+- `include_logs`: check recent logs (default: true)
+- `log_timeframe`: log window to check (default: "15 minutes ago")
+- `output_format`: "markdown" or "json" (default: "markdown")
+
+## Configuration
+
+Example configuration in the agent YAML:
+
+```yaml
+health_check:
+  services:
+    - name: myapp
+      host: localhost
+      health_url: http://localhost:8080/health
+      dependencies:
+        - http://db.example.com:5432
+    - name: worker
+      host: localhost
+      health_url: http://localhost:8081/health
+  thresholds:
+    cpu_warning: 70
+    cpu_critical: 90
+    memory_warning: 80
+    memory_critical: 95
+    disk_warning: 85
+    disk_critical: 95
+  check_interval: "5m"
+```
+
+## Usage example
+
+User: "Check the health of all services"
+
+Agent:
+1. skill_search("health check")
+2. skill_load("health-check")
+3. Run the checks in order
+4. Calculate the score
+5. Generate the report
+6. Send it to the user
+
+## Automatic alerts
+
+This skill can be configured to run periodically via cron and alert only if:
+- Score < 90 (DEGRADED or CRITICAL)
+- Any service is down
+- Resources exceed the critical threshold
+
+See `crons/health-check/` for the automation configuration.
@@ -0,0 +1,195 @@
+package skilltools
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"strings"
+
+	"github.com/enmanuel/agents/pkg/skills"
+	shellskills "github.com/enmanuel/agents/shell/skills"
+	"github.com/enmanuel/agents/tools"
+)
+
+// NewSkillSearch creates a skill_search tool that finds relevant skills.
+func NewSkillSearch(loader *shellskills.Loader, categories []string) tools.Tool {
+	return tools.Tool{
+		Def: tools.Def{
+			Name:        "skill_search",
+			Description: "Search for skills relevant to a query. Returns a list of skills with their names, descriptions, and relevance scores. Use this when you need to find a skill to help with a task.",
+			Parameters: []tools.Param{
+				{Name: "query", Type: "string", Description: "Search query describing the task or capability needed", Required: true},
+			},
+		},
+		Exec: func(ctx context.Context, args map[string]any) tools.Result {
+			query := tools.GetString(args, "query")
+			if query == "" {
+				return tools.Result{Err: fmt.Errorf("query is required")}
+			}
+
+			// Load all skill metadata
+			metas, err := loader.LoadMeta()
+			if err != nil {
+				return tools.Result{Err: fmt.Errorf("load skills metadata: %w", err)}
+			}
+
+			// Filter by categories if configured
+			metas = skills.FilterByCategory(metas, categories)
+
+			// Match skills to query
+			matches := skills.Match(query, metas)
+
+			if len(matches) == 0 {
+				return tools.Result{Output: "No skills found matching the query."}
+			}
+
+			// Format output, limited to the top 5 matches
+			const maxResults = 5
+			shown := matches
+			if len(shown) > maxResults {
+				shown = shown[:maxResults]
+			}
+			var lines []string
+			lines = append(lines, fmt.Sprintf("Found %d relevant skill(s), showing %d:\n", len(matches), len(shown)))
+			for i, match := range shown {
+				lines = append(lines, fmt.Sprintf("%d. **%s** (category: %s, confidence: %.2f)",
+					i+1, match.Skill.Name, match.Skill.Category, match.Confidence))
+				lines = append(lines, fmt.Sprintf("   %s\n", match.Skill.Description))
+			}
+
+			return tools.Result{Output: strings.Join(lines, "\n")}
+		},
+	}
+}
+
+// NewSkillLoad creates a skill_load tool that loads full instructions for a skill.
+func NewSkillLoad(loader *shellskills.Loader) tools.Tool {
+	return tools.Tool{
+		Def: tools.Def{
+			Name:        "skill_load",
+			Description: "Load the complete instructions for a skill. This returns the full markdown content of the skill, which you should follow to complete the task. Use this after finding a skill with skill_search.",
+			Parameters: []tools.Param{
+				{Name: "skill_name", Type: "string", Description: "Name of the skill to load", Required: true},
+			},
+		},
+		Exec: func(ctx context.Context, args map[string]any) tools.Result {
+			skillName := tools.GetString(args, "skill_name")
+			if skillName == "" {
+				return tools.Result{Err: fmt.Errorf("skill_name is required")}
+			}
+
+			skill, err := loader.LoadSkill(skillName)
+			if err != nil {
+				return tools.Result{Err: fmt.Errorf("load skill: %w", err)}
+			}
+
+			// Format output with metadata + instructions
+			var output strings.Builder
+			fmt.Fprintf(&output, "# Skill: %s\n\n", skill.Meta.Name)
+			fmt.Fprintf(&output, "**Category**: %s\n\n", skill.Meta.Category)
+			fmt.Fprintf(&output, "**Description**: %s\n\n", skill.Meta.Description)
+
+			if len(skill.Scripts) > 0 {
+				fmt.Fprintf(&output, "**Scripts available**: %s\n", strings.Join(skill.Scripts, ", "))
+			}
+			if len(skill.References) > 0 {
+				fmt.Fprintf(&output, "**References available**: %s\n", strings.Join(skill.References, ", "))
+			}
+			if len(skill.Templates) > 0 {
+				fmt.Fprintf(&output, "**Templates available**: %s\n", strings.Join(skill.Templates, ", "))
+			}
+
+			output.WriteString("\n---\n\n")
+			output.WriteString(skill.Instructions)
+
+			return tools.Result{Output: output.String()}
+		},
+	}
+}
+
+// NewSkillReadResource creates a skill_read_resource tool that reads a specific resource.
+func NewSkillReadResource(loader *shellskills.Loader) tools.Tool {
+	return tools.Tool{
+		Def: tools.Def{
+			Name:        "skill_read_resource",
+			Description: "Read a specific resource file from a skill (script, reference doc, template, or asset). Use this to load additional documentation or code referenced in the skill instructions.",
+			Parameters: []tools.Param{
+				{Name: "skill_name", Type: "string", Description: "Name of the skill", Required: true},
+				{Name: "resource_path", Type: "string", Description: "Path to the resource relative to the skill directory (e.g., 'scripts/deploy.sh', 'references/api.md')", Required: true},
+			},
+		},
+		Exec: func(ctx context.Context, args map[string]any) tools.Result {
+			skillName := tools.GetString(args, "skill_name")
+			resourcePath := tools.GetString(args, "resource_path")
+
+			if skillName == "" || resourcePath == "" {
+				return tools.Result{Err: fmt.Errorf("skill_name and resource_path are required")}
+			}
+
+			content, err := loader.ReadResource(skillName, resourcePath)
+			if err != nil {
+				return tools.Result{Err: fmt.Errorf("read resource: %w", err)}
+			}
+
+			return tools.Result{Output: content}
+		},
+	}
+}
+
+// NewSkillRunScript creates a skill_run_script tool that executes a skill script.
+func NewSkillRunScript(loader *shellskills.Loader, executor *shellskills.Executor) tools.Tool {
+	return tools.Tool{
+		Def: tools.Def{
+			Name:        "skill_run_script",
+			Description: "Execute a script from a skill with the given arguments. The script must be in the skill's scripts/ directory and use an allowed interpreter. Returns the script output.",
+			Parameters: []tools.Param{
+				{Name: "skill_name", Type: "string", Description: "Name of the skill", Required: true},
+				{Name: "script_name", Type: "string", Description: "Name of the script file (e.g., 'deploy.sh')", Required: true},
+				{Name: "args", Type: "array", Description: "Array of arguments to pass to the script", Required: false},
+			},
+		},
+		Exec: func(ctx context.Context, args map[string]any) tools.Result {
+			skillName := tools.GetString(args, "skill_name")
+			scriptName := tools.GetString(args, "script_name")
+
+			if skillName == "" || scriptName == "" {
+				return tools.Result{Err: fmt.Errorf("skill_name and script_name are required")}
+			}
+
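+			// Parse args array; reject anything that is not an array of strings
+			var scriptArgs []string
+			if argsRaw, ok := args["args"]; ok && argsRaw != nil {
+				argsJSON, err := json.Marshal(argsRaw)
+				if err != nil {
+					return tools.Result{Err: fmt.Errorf("encode args: %w", err)}
+				}
+				if err := json.Unmarshal(argsJSON, &scriptArgs); err != nil {
+					return tools.Result{Err: fmt.Errorf("args must be an array of strings: %w", err)}
+				}
+			}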
+
+			// Load skill to get base path
+			skill, err := loader.LoadSkill(skillName)
+			if err != nil {
+				return tools.Result{Err: fmt.Errorf("load skill: %w", err)}
+			}
+
+			// Verify script exists
+			scriptFound := false
+			for _, s := range skill.Scripts {
+				if s == scriptName {
+					scriptFound = true
+					break
+				}
+			}
+			if !scriptFound {
+				return tools.Result{Err: fmt.Errorf("script not found in skill: %s", scriptName)}
+			}
+
+			// Execute script
+			scriptPath := fmt.Sprintf("%s/scripts/%s", skill.BasePath, scriptName)
+			output, err := executor.Run(ctx, scriptPath, scriptArgs)
+			if err != nil {
+				return tools.Result{
+					Output: output,
+					Err:    fmt.Errorf("script execution failed: %w", err),
+				}
+			}
+
+			return tools.Result{Output: output}
+		},
+	}
+}