---
id: "0166"
title: "Deploy TURN for LiveKit (coturn or integrated)"
status: done
type: infra
domain:
  - matrix
scope: app:element_matrix_chat
priority: high
depends: []
blocks: []
related: ["0167", "0168"]
created: 2026-05-24
updated: 2026-05-24
tags: [matrix, livekit, webrtc, turn, nat]
---

# 0166: Deploy TURN for LiveKit (coturn or integrated)

**Status:** done
**Created:** 2026-05-24
**Type:** infra
**Priority:** high
**Domain:** matrix
**Scope:** app:element_matrix_chat
**Depends:** none
**Blocks:** none

## Problem

LiveKit runs without TURN (`turn.enabled: false` in `configs/livekit/livekit.yaml`). Users behind symmetric NAT (mobile 4G/5G CGNAT, corporate networks with strict firewalls, hotel WiFi) cannot establish a call: WebRTC ICE fails with direct and server-reflexive candidates, and there is no relay to fall back on. Calls fail silently for roughly 10-20% of users.

## Goal

Calls work on any network. Element X on mobile over 4G CGNAT completes the handshake.

## Plan

1. Decide between standalone coturn and LiveKit's integrated TURN (recommended: integrated, fewer moving parts).
2. Add the subdomain `turn.organic-machine.com` with a Let's Encrypt certificate (Traefik).
3. Enable the `turn:` block in `livekit.yaml`:
```yaml
turn:
  enabled: true
  domain: "turn.organic-machine.com"
  tls_port: 5349
  udp_port: 443
  external_tls: true
```
4. Open the ports on the VPS firewall: TCP+UDP 443 (best practice, since it gets through corporate firewalls) and TCP 5349.
5. Rotate the TURN shared secret.
6. Test: browser on a corporate network with the `force-tcp` flag → call established.

## Acceptance

- [ ] `nc -vz turn.organic-machine.com 443` succeeds over both UDP and TCP.
- [ ] Test call from Element Web behind symmetric NAT (mobile hotspot tethering) → audio/video flows.
- [ ] LiveKit logs show `TURN allocation` requests being served.
- [ ] `.well-known/matrix/client` still points to the correct JWT `livekit_service_url`.

## Definition of Done

- [ ] Repeatability: 5 consecutive calls from 5 different networks (including CGNAT) without failure.
- [ ] Observability: the LiveKit dashboard shows the TURN vs direct ratio.
- [ ] User-facing: a mobile 4G user starts a call → connects in < 3 s.

## Notes

UDP 443 is a well-known trick: most corporate firewalls only allow 443 (HTTPS), so TURN over UDP 443 gets through without requiring a TCP relay, which adds latency.

Alternative, if LiveKit's integrated TURN has management gaps: standalone coturn via `docker run -d coturn/coturn`, with a config that shares LiveKit's shared secret.
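
If the standalone coturn route were taken, sharing a secret usually means coturn's `use-auth-secret` mode, where clients receive short-lived credentials derived from that secret. The following is a minimal sketch of that derivation, assuming `openssl` and `base64` are available. The function names are hypothetical, and it is not confirmed here that LiveKit's integrated TURN derives its credentials the same way.

```shell
#!/usr/bin/env bash
# Sketch of time-limited TURN credentials as used by coturn's use-auth-secret mode.
# Function names are illustrative; they are not part of coturn or LiveKit.

# turn_password SECRET USERNAME -> base64(HMAC-SHA1(SECRET, USERNAME))
turn_password() {
  printf '%s' "$2" | openssl dgst -sha1 -hmac "$1" -binary | base64
}

# turn_credentials SECRET TTL_SECONDS [USER_ID] -> "USERNAME PASSWORD"
# USERNAME is the Unix expiry time, optionally followed by ":USER_ID".
turn_credentials() {
  local secret=$1 ttl=$2 user_id=${3:-}
  local expiry=$(( $(date +%s) + ttl ))
  local username="${expiry}${user_id:+:$user_id}"
  printf '%s %s\n' "$username" "$(turn_password "$secret" "$username")"
}
```

For example, `turn_credentials "$SHARED_SECRET" 3600 alice` prints a username and password valid for one hour. The server recomputes the same HMAC from the username and rejects the pair once the embedded timestamp has passed.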

## Implementation 2026-05-25

**Decision taken: integrated TURN** (single container, shares the API key/secret with LiveKit, no additional moving parts).

**Final ports:**
- UDP 3478 (standard TURN-UDP), **NOT UDP 443**: that port is taken by Traefik HTTP/3 (`coolify-proxy`).
- TCP 5349 (standard TURN-TLS), which was free.
- TLS cert: wildcard `*.organic-machine.com` extracted from Traefik's `acme.json` (Let's Encrypt, DNS-01).

**Subdomain:** `turn-matrix-rtc-320bd4.organic-machine.com` (covered by wildcard DNS + wildcard cert; no manual DNS needed).

**Changes:**
- VPS repo `egutierrez/element_matrix_chat`, commit `f7f5303`: `docker-compose.livekit.yml` exposes the TURN ports and mounts the certs.
- `configs/livekit/livekit.yaml` (gitignored): `turn:` block with `enabled: true`, `external_tls: false`, and `cert_file`/`key_file` pointing to `/etc/livekit/certs/`.
- `configs/livekit/certs/{turn-cert.pem,turn-key.pem}` (gitignored): extracted from `/data/coolify/proxy/acme.json` via `jq | base64 -d`.
- UFW: `3478/udp` + `5349/tcp` ALLOW.
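
Because the PEM files are pulled out of `acme.json` by hand, it may be worth confirming that the extracted certificate and key actually belong together before LiveKit loads them. This is a minimal sketch that assumes `openssl` is installed; `cert_matches_key` is a hypothetical helper name, not something in the repo.

```shell
#!/usr/bin/env bash
# cert_matches_key CERT_PEM KEY_PEM
# Returns 0 when the certificate's public key equals the one derived from the
# private key, 1 when they differ, 2 when either file cannot be parsed.
# If CERT_PEM holds a chain, openssl x509 reads only the first certificate.
cert_matches_key() {
  local cert=$1 key=$2 cert_pub key_pub
  cert_pub=$(openssl x509 -in "$cert" -noout -pubkey 2>/dev/null) || return 2
  key_pub=$(openssl pkey -in "$key" -pubout 2>/dev/null) || return 2
  [ "$cert_pub" = "$key_pub" ]
}
```

A possible call, run from the repo root: `cert_matches_key configs/livekit/certs/turn-cert.pem configs/livekit/certs/turn-key.pem || echo "cert/key mismatch"`.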

**Verification:**
- `nc -vz organic-machine.com 5349` -> succeeded
- `nc -vzu organic-machine.com 3478` -> succeeded
- `openssl s_client -connect turn-matrix-rtc-320bd4.organic-machine.com:5349` -> Verify return code: 0 (ok), wildcard cert served
- `docker logs livekit` -> `Starting TURN server {portTLS: 5349, portUDP: 3478, externalTLS: false}`
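
The network checks above could be bundled into a repeatable script along these lines. This is a sketch, not the procedure that was actually run: `run_check`, `check_tls` and `verify_turn` are hypothetical names, and the `-w 5` timeout is an added assumption. Note that a UDP "succeeded" from `nc` is weak evidence, since UDP is connectionless and `nc` only reports failure when an ICMP unreachable comes back.

```shell
#!/usr/bin/env bash
# run_check LABEL CMD...: run a command quietly and print PASS or FAIL.
run_check() {
  local label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS $label"
  else
    echo "FAIL $label"
    return 1
  fi
}

# check_tls HOST PORT: fail if the TLS handshake or certificate verification fails.
check_tls() {
  openssl s_client -connect "$1:$2" -servername "$1" -verify_return_error </dev/null
}

# verify_turn [HOST]: run the three network checks; returns 1 if any of them fails.
verify_turn() {
  local host=${1:-turn-matrix-rtc-320bd4.organic-machine.com}
  local failed=0
  run_check "TURN-TLS tcp/5349 reachable" nc -vz -w 5 "$host" 5349 || failed=1
  run_check "TURN-UDP udp/3478 reachable" nc -vzu -w 5 "$host" 3478 || failed=1
  run_check "TLS certificate verifies" check_tls "$host" 5349 || failed=1
  return "$failed"
}
```

Calling `verify_turn` with no arguments checks the subdomain listed above and prints one PASS/FAIL line per check.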

**Operator TODO (follow-up, does not block closing):**

1. **Cert rotation**: Traefik renews the wildcard automatically, but the PEM files extracted to `configs/livekit/certs/` go stale. Add a (monthly) cron job or a post-renew hook that re-extracts them from `acme.json` and runs `docker compose restart livekit`. Suggested script:
```bash
#!/bin/bash
# Re-extract the TURN cert/key from Traefik's acme.json and restart LiveKit.
set -euo pipefail
ACME=/data/coolify/proxy/acme.json
APP=/home/ubuntu/CodeProyects/element_matrix_chat
DEST=$APP/configs/livekit/certs
# Certificates[0] assumes the wildcard cert is the first entry in acme.json.
sudo jq -r '.letsencrypt.Certificates[0].certificate' "$ACME" | base64 -d > "$DEST/turn-cert.pem"
sudo jq -r '.letsencrypt.Certificates[0].key' "$ACME" | base64 -d > "$DEST/turn-key.pem"
chmod 644 "$DEST/turn-cert.pem"
chmod 600 "$DEST/turn-key.pem"
docker compose -f "$APP/docker-compose.yml" -f "$APP/docker-compose.livekit.yml" restart livekit
```

2. **Real-usage DoD** (layer 3, DoD Quality): testing from mobile CGNAT plus 5 different networks is still pending. Acceptance items 1-2 can only be verified with real calls. Item 3 (TURN allocation logs) can be verified after the first call from a client behind symmetric NAT.

3. **No separate TURN shared secret**: LiveKit's integrated TURN reuses `LIVEKIT_API_KEY`/`LIVEKIT_API_SECRET` (HMAC-SHA1 with time-based credentials). No rotation is needed beyond that of the API key. To separate them, add a `turn_servers:` block with explicit credentials in `livekit.yaml`.

4. **UDP relay range 30000-40000**: LiveKit advertises this range at startup (`turn.relay_range_start/end`). It is currently NOT exposed in docker-compose. It works because LiveKit under bridge networking reuses the existing ICE range (50000-50500) via SO_REUSEPORT for relayed traffic. If relays misbehave, expose 30000-40000/udp.
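
For the cert-rotation item, a cron job that restarts LiveKit on every run would likely interrupt calls even when the certificate has not changed. One option is to extract to a temporary file first and only install and restart when the content differs. This is a minimal sketch; `install_if_changed` and the `$tmp` directory are hypothetical and are not part of the suggested script.

```shell
#!/usr/bin/env bash
# install_if_changed NEW_FILE TARGET MODE
# Copies NEW_FILE over TARGET with the given permissions only when the contents
# differ or TARGET is missing.
# Returns 0 if TARGET was updated, 1 if unchanged, 2 if the copy failed.
install_if_changed() {
  local new=$1 target=$2 mode=$3
  if [ -f "$target" ] && cmp -s "$new" "$target"; then
    return 1
  fi
  install -m "$mode" "$new" "$target" || return 2
}

# Intended use inside a rotation script ($tmp is a hypothetical scratch directory
# holding the freshly extracted PEM files; $DEST is the certs directory):
#   changed=0
#   if install_if_changed "$tmp/turn-cert.pem" "$DEST/turn-cert.pem" 644; then changed=1; fi
#   if install_if_changed "$tmp/turn-key.pem" "$DEST/turn-key.pem" 600; then changed=1; fi
#   # restart livekit only when changed=1
```

With this shape the job can run daily rather than monthly, since most runs become no-ops.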

**Backups:** `configs/livekit/livekit.yaml.bak.20260524_224254` + `docker-compose.livekit.yml.bak.20260524_224254` on the VPS.