Files
Egutierrez 52999ecb86 feat: media_analytics — ETL PC+VPS → ClickHouse + Grafana
2 ETLs cada 5min suben snapshots (Jellyfin, *arr, Prowlarr, gnula,
popelis users/mylist/events) a ClickHouse en el VPS, visualizado en
Grafana (grafana.datardos.com). Ingesta PC via tunel SSH; popelis
via ETL local en el VPS. Usa clickhouse_insert_rows_py_infra.
2026-05-30 14:55:48 +02:00

66 lines
3.9 KiB
Markdown

---
name: media_analytics
lang: py
domain: infra
version: 0.1.0
description: "Analitica del media stack: 2 ETLs cada 5min suben snapshots (Jellyfin, *arr, Prowlarr, gnula, popelis users/mylist/events) a ClickHouse en el VPS, visualizado en Grafana. Ingesta PC via tunel SSH; popelis via ETL local en VPS."
tags: [analytics, clickhouse, grafana, etl, media, popelis, jellyfin, service]
uses_functions:
- clickhouse_insert_rows_py_infra
uses_types: []
framework: ""
entry_point: "etl_pc.py"
dir_path: "apps/media_analytics"
repo_url: ""
---
## Arquitectura
```
PC (Docker Desktop) VPS datardos (coolify net)
───────────────────── ──────────────────────────
Jellyfin :8096 ─┐ ClickHouse (interno + 127.0.0.1:8123)
Radarr/Sonarr ─┤ etl_pc.py (5min) ──ssh──► analytics.* (11 tablas snapshot)
Prowlarr ─┤ tunel SSH 18123→8123 ▲
gnula_catalog ──┘ │ etl_vps.py (5min, root)
popelis-db (Postgres) ── users/mylist/events
Grafana :3000 ──► grafana.datardos.com (Traefik+LE)
```
## Componentes
| Pieza | Dónde | Qué hace |
|---|---|---|
| `etl_pc.py` | PC, systemd-user `media-analytics-etl.timer` (5min) | extrae Jellyfin (items/users/user_items/sessions), Radarr/Sonarr (history/queue), Prowlarr (indexers), gnula SQLite → push a CH via túnel SSH. Usa `clickhouse_insert_rows_py_infra`. |
| `etl_vps.py` | VPS, systemd `media-analytics-vps.timer` (5min, root) | lee popelis-db (users, mylist snapshot; events incremental por id) → CH HTTP local. Standalone (VPS sin registry). |
| `deploy/docker-compose.yml` | VPS `/opt/analytics` | ClickHouse (interno coolify + 127.0.0.1:8123) + Grafana (Traefik grafana.datardos.com). |
| `deploy/clickhouse/schema.sql` | VPS | 11 tablas: jellyfin_{items,users,user_items,sessions}, arr_{history,queue}, prowlarr_indexers, gnula_movies, popelis_{users,mylist,events}. |
| `deploy/grafana/provisioning/` | VPS | datasource ClickHouse (uid `clickhouse`) + dashboard `Media Stack Analytics` (12 paneles). |
## Secretos
- `pass datardos-vps/clickhouse` (user analytics) · `pass datardos-vps/grafana` (admin).
- PC: `~/.config/popelis/analytics.env` (chmod600; CH pass + JF/arr keys — el timer no usa GPG).
- VPS: `/opt/analytics/.env` (chmod600; CH_PASSWORD, GF_PASSWORD).
## Ejecutar manual
```bash
# PC ETL
/home/lucas/fn_registry/python/.venv/bin/python3 apps/media_analytics/etl_pc.py # real
/home/lucas/fn_registry/python/.venv/bin/python3 apps/media_analytics/etl_pc.py --dry # solo extrae
# VPS ETL
ssh datardos 'sudo python3 /opt/analytics/etl_vps.py'
# Redeploy infra VPS
rsync -az apps/media_analytics/deploy/ datardos:/opt/analytics/ && ssh datardos 'cd /opt/analytics && sudo docker compose up -d'
```
## Visualización
https://grafana.datardos.com (admin / `pass datardos-vps/grafana`). Dashboard "Media Stack Analytics".
## Gotchas
- **Eventos "play" NO van por popelis** (la reproducción es directa a Jellyfin `/jf`): se capturan del lado Jellyfin (`jellyfin_sessions` + `jellyfin_user_items.play_count`). `popelis_events` cubre login/logout/mylist/user_created (instrumentado en popelis-api).
- ClickHouse HTTP escucha **solo 127.0.0.1 del VPS** (no público). El PC entra por túnel SSH efímero (`ssh -N -L 18123:127.0.0.1:8123`). Grafana usa el nativo :9000 por la red coolify.
- Snapshots son **append con snapshot_ts** → análisis temporal del estado. Eventos son hechos (event_ts) con dedup `ReplacingMergeTree(event_id)`.
- Int64 de ClickHouse vuelve como **string** en JSON (gotcha de `clickhouse_query`/Grafana).
- El timer del PC necesita `ssh datardos` sin passphrase (key sin passphrase o agente cargado).