---
name: services_api
lang: go
domain: infra
version: 0.1.0
description: "HTTP backend that monitors apps tagged 'service' on the local PC and on remote PCs via SSH. Check loop every 15s, persistence in operations.db. Dedicated frontend: C++ app services_monitor. Issue 0106."
tags: [service, monitoring, sqlite, ssh, systemd, http]
uses_functions:
  - sqlite_apply_versioned_migrations_go_infra
  - ssh_exec_go_infra
uses_types:
  - ssh_conn_go_infra
framework: "net/http"
entry_point: "main.go"
dir_path: "apps/services_api"
service:
  port: 8485
  health_endpoint: /api/health
  health_timeout_s: 3
  systemd_unit: services_api.service
  systemd_scope: user
  restart_policy: always
  runtime: systemd-user
  pc_targets:
    - aurgi-pc
    - home-wsl
  is_local_only: false
---

# services_api

## What it does

Cross-PC monitor for the services in the registry (issue 0106).

1. Reads `registry.db` (read-only) to collect the apps with `tag: service` and their `service_targets`.
2. Runs a loop every 15s: for each (app, pc) pair it checks systemd, port listening, and HTTP health.
3. Local (pc == self): `systemctl is-active`, TCP dial, `http.Client`.
4. Remote: the same set of commands through `ssh_exec_go_infra`. If there is no SSH route, the status is `no-route`.
5. Persists to `operations.db`: `service_state` (current state) and `service_transition` (changes).

## When to use it

- See at a glance which services are alive on each PC.
- Detect silent outages (e.g. `sqlite_api` was down for 20h without an alert on 2026-05-17).
- Audit per-PC coverage before a deploy.

## Running

```bash
cd apps/services_api
CGO_ENABLED=1 go build -tags fts5 -o services_api .
./services_api --bind 127.0.0.1:8485
# or ./services_api --once for a single cycle (smoke test)
```

Open `http://127.0.0.1:8485` in the browser.

## Endpoints

| Route | Returns |
|---|---|
| `GET /api/health` | `{"status":"ok","self_pc":"..."}` |
| `GET /api/services` | Full snapshot (`services[]`) + `self_pc` + `ts` |
| `POST /api/check` | Forces a check cycle; blocks until it completes |
| `GET /api/pcs` | Lists PCs with `services_count` |

## Gotchas

- **Remote PCs without an entry in `~/.ssh/config`** stay in `no-route`/`unknown`. Add an alias before expecting a real status (e.g. `aurgi-pc` is not in the config today).
- **registry.db is read-only**; the app never writes to it. Migrations apply only to the app's own operations.db (`apps/services_api/operations.db`, tables `service_state` and `service_transition`).
- **`overall`** is computed per runtime:
  - `systemd-*` / `stdio`: requires `systemctl is-active == active`. If a port is declared, TCP + HTTP 2xx/3xx are required as well.
  - `docker-compose` / `manual`: port + HTTP only.
- **Targets reload** every 5 min (in the background) without restarting the process. Changes to `app.md` require running `./fn index` first.
- **No mutating tools are exposed** in v1 (start/stop/restart). The app only alerts. Issue 0106 places auto-fix behind a feature flag for v2.

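The per-runtime rule for `overall` can be read as a small decision function. The sketch below is one possible reading, not the app's implementation: the `checkResult` type and `isHealthy` name are hypothetical, the result is reduced to a boolean because the actual `overall` values are not listed in this document, and two behaviours are assumptions (a `docker-compose`/`manual` target with no port and an unrecognised runtime both count as not healthy).

```go
package main

import (
	"fmt"
	"strings"
)

// checkResult holds the raw probe outcomes for one (app, pc) pair.
type checkResult struct {
	Runtime       string // e.g. "systemd-user", "stdio", "docker-compose", "manual"
	Port          int    // 0 means the app declares no port
	SystemdActive bool   // `systemctl is-active` returned "active"
	TCPOpen       bool   // TCP dial to the port succeeded
	HTTPStatus    int    // status code from the health endpoint, 0 if none
}

// httpOK reports whether status is in the 2xx or 3xx range.
func httpOK(status int) bool {
	return status >= 200 && status < 400
}

// isHealthy applies the per-runtime rule described in the gotcha above.
func isHealthy(c checkResult) bool {
	portOK := c.TCPOpen && httpOK(c.HTTPStatus)
	switch {
	case strings.HasPrefix(c.Runtime, "systemd-"), c.Runtime == "stdio":
		if !c.SystemdActive {
			return false
		}
		if c.Port == 0 {
			return true // no port declared: systemd state alone decides
		}
		return portOK
	case c.Runtime == "docker-compose", c.Runtime == "manual":
		if c.Port == 0 {
			return false // assumption: nothing could be verified
		}
		return portOK
	default:
		return false // assumption: unknown runtimes are never healthy
	}
}

func main() {
	samples := []checkResult{
		{Runtime: "systemd-user", Port: 8485, SystemdActive: true, TCPOpen: true, HTTPStatus: 200},
		{Runtime: "systemd-user", Port: 8485, SystemdActive: false, TCPOpen: true, HTTPStatus: 200},
		{Runtime: "manual", Port: 9000, TCPOpen: true, HTTPStatus: 503},
	}
	for _, s := range samples {
		fmt.Printf("%-14s port=%d healthy=%v\n", s.Runtime, s.Port, isHealthy(s))
	}
}
```

Keeping the rule in a pure function like this makes it easy to unit-test without touching systemd, the network, or SSH.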
## Validation

```bash
./services_api --once && \
sqlite3 operations.db \
"SELECT app_id, pc_id, overall FROM service_state ORDER BY app_id, pc_id;"
```

## Capability growth log

One line per SemVer bump. The bump type follows `.claude/commands/version.md`:

- `major`: observable breaking change (CLI args, the app's own DB schema, wire format).
- `minor`: additive feature (new panel, endpoint, option).
- `patch`: bugfix with no observable change.

- v0.1.0 (2026-05-18): baseline.