chore: sync from fn-registry agent
@@ -0,0 +1,40 @@
# JUPYTER ENABLED IN THIS ANALYSIS

## MANDATORY rules for Claude

### 1. IMMUTABLE CODE: NEVER MODIFY EXISTING CELLS
- **FORBIDDEN** to use NotebookEdit to replace existing cells
- **ALWAYS** add NEW cells at the end of the notebook
- If a cell has an error, create a new cell with the fix
- The work history must stay intact for traceability

### 2. FUNCTIONAL PROGRAMMING IS MANDATORY
- **Pure functions**: no side effects, same input -> same output
- **Immutability**: never mutate data, create transformed copies
- **Composition**: small functions that combine
- Prefer: `map`, `filter`, `reduce`, list comprehensions
- Avoid: loops with mutation, `global`, modifying arguments in place
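
A minimal sketch of the functional rules above, under stated assumptions: the sample records and the names `normalize_phone`, `with_clean_phone` and `total_duration` are hypothetical illustrations, not functions from the registry.

```python
from functools import reduce


def normalize_phone(raw: str) -> str:
    """Pure: returns a new string and touches nothing outside its arguments."""
    return raw.strip().replace("+34", "").replace(" ", "")


def with_clean_phone(record: dict) -> dict:
    """Immutability: build a transformed copy instead of mutating `record`."""
    return {**record, "phone_clean": normalize_phone(record["source"])}


def total_duration(records: list[dict]) -> int:
    """reduce replaces a loop that would mutate an accumulator."""
    return reduce(lambda acc, r: acc + r["duration"], records, 0)


# Hypothetical sample data, for illustration only.
calls = [
    {"source": "+34 623 122 972", "duration": 40},
    {"source": " 611000222", "duration": 0},
]

cleaned = [with_clean_phone(c) for c in calls]                   # map
answered = list(filter(lambda c: c["duration"] > 0, cleaned))    # filter
```

The original `calls` list is left untouched, so an earlier notebook cell that produced it can be re-read later without surprises.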

### 3. ALWAYS use MCP jupyter to run Python code
- Executions are visible in real time in the user's Jupyter Lab
- We share variables and kernel state
- **NEVER use bash to run Python in this analysis**

### 4. Check that Jupyter is running BEFORE executing
- If it is not running: ask the user to run `./run-jupyter-lab.sh`

### 5. Notebook management
- Notebooks live in the `notebooks/` folder or its subfolders
- If a notebook has >50 cells, create a new one
- Use descriptive names: `01_exploracion.ipynb`, `02_limpieza.ipynb`
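
The notebook rules above can be sketched as two small pure functions. This is only an illustration: `needs_new_notebook` and `next_notebook_name` are hypothetical helper names, and the numbering scheme is inferred from the example file names.

```python
import re


def needs_new_notebook(cell_count: int, limit: int = 50) -> bool:
    """True when the notebook has more than `limit` cells."""
    return cell_count > limit


def next_notebook_name(existing: list[str], slug: str) -> str:
    """Return the next numbered name, e.g. 03_<slug>.ipynb after 01_ and 02_."""
    numbers = [int(m.group(1)) for name in existing
               if (m := re.match(r"(\d+)_", name))]
    return f"{max(numbers, default=0) + 1:02d}_{slug}.ipynb"
```

For example, `next_notebook_name(["01_exploracion.ipynb", "02_limpieza.ipynb"], "eda_combinado")` yields `03_eda_combinado.ipynb`.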

### 6. Python management
- **ALWAYS use `uv`** to manage dependencies
- Add packages with `uv add package_name`

### 7. Access to fn_registry
- `FN_REGISTRY_ROOT` points to the registry root
- To import Python functions: `sys.path.insert(0, os.path.join(os.environ["FN_REGISTRY_ROOT"], "python", "functions"))`
- To query registry.db: the `sqlite3` CLI or `import sqlite3` with the path `$FN_REGISTRY_ROOT/registry.db`
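
A hedged sketch of querying `registry.db` from Python, following the access rules above. The helper names `registry_db_path` and `query_registry` are hypothetical, and the `functions` table with `id`, `description` and `domain` columns is an assumption taken from the docstring examples in the kernel startup script of this same commit.

```python
import os
import sqlite3
from pathlib import Path


def registry_db_path(env: dict) -> Path:
    """Build the registry.db path from FN_REGISTRY_ROOT in the given mapping."""
    return Path(env["FN_REGISTRY_ROOT"]) / "registry.db"


def query_registry(db_path: Path, sql: str, params: tuple = ()) -> list[dict]:
    """Run a read query and return the rows as plain dicts."""
    con = sqlite3.connect(str(db_path))
    con.row_factory = sqlite3.Row
    try:
        return [dict(row) for row in con.execute(sql, params).fetchall()]
    finally:
        con.close()


# Typical call inside the analysis (requires FN_REGISTRY_ROOT to be set):
# rows = query_registry(
#     registry_db_path(os.environ),
#     "SELECT id, description FROM functions WHERE domain = ?",
#     ("datascience",),
# )
```

Passing the environment mapping in as an argument, rather than reading `os.environ` inside the function, keeps the path helper pure and easy to test.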
@@ -0,0 +1,40 @@
# Python venv (can be regenerated with uv sync)
.venv/

# Secrets
.env
.env.*

# Local data (per-PC, not uploaded)
data/

# Jupyter runtime (per-PC)
.jupyter/
.jupyter-port
.jupyter_ystore.db
.mcp.json

# IPython runtime: keeps startup/ (registry helpers), ignores the rest
.ipython/profile_default/history.sqlite
.ipython/profile_default/log/
.ipython/profile_default/db/
.ipython/profile_default/security/

# Python bytecode
__pycache__/
*.pyc
*.pyo

# Jupyter checkpoints
.ipynb_checkpoints/
**/.ipynb_checkpoints/

# Operations DB
operations.db
operations.db-shm
operations.db-wal
operations.db-journal

# OS
.DS_Store
Thumbs.db
@@ -0,0 +1,100 @@
"""
fn_registry kernel startup
Automatically configures registry access in every notebook.
Generated by write_jupyter_registry_kernel (fn_registry).
"""
import os
import sys
import sqlite3
from pathlib import Path

# ── FN_REGISTRY_ROOT ────────────────────────────────────────
# Priority: env var > hardcoded path > automatic discovery
def _discover_registry_root():
    if os.environ.get("FN_REGISTRY_ROOT"):
        return Path(os.environ["FN_REGISTRY_ROOT"]).resolve()
    hardcoded = Path("/home/egutierrez/fn_registry")
    if (hardcoded / "registry.db").exists():
        return hardcoded
    # Walk up from the CWD (at most 10 levels) until registry.db is found
    p = Path.cwd()
    for _ in range(10):
        if (p / "registry.db").exists():
            return p
        if p.parent == p:
            break
        p = p.parent
    # Fallback: return the hardcoded path even if it does not exist
    return hardcoded

FN_REGISTRY_ROOT = _discover_registry_root()
os.environ["FN_REGISTRY_ROOT"] = str(FN_REGISTRY_ROOT)

# ── sys.path: import Python functions from the registry ─────
_python_functions = FN_REGISTRY_ROOT / "python" / "functions"
for _domain in sorted(_python_functions.iterdir()) if _python_functions.exists() else []:
    if _domain.is_dir() and not _domain.name.startswith("_"):
        _path = str(_domain)
        if _path not in sys.path:
            sys.path.insert(0, _path)

# Also the parent directory, for per-domain imports: from core import filter_list
_pf = str(_python_functions)
if _pf not in sys.path:
    sys.path.insert(0, _pf)

# ── fn_query: query registry.db from the notebook ───────────
_REGISTRY_DB = FN_REGISTRY_ROOT / "registry.db"

def fn_query(sql, params=()):
    """Run a SQL query against registry.db and return the rows as dicts.

    Examples:
        fn_query("SELECT id, description FROM functions WHERE domain = ?", ("finance",))
        fn_query("SELECT id FROM functions_fts WHERE functions_fts MATCH ?", ("slice*",))
    """
    if not _REGISTRY_DB.exists():
        raise FileNotFoundError(f"registry.db not found at {_REGISTRY_DB}")
    con = sqlite3.connect(str(_REGISTRY_DB))
    con.row_factory = sqlite3.Row
    try:
        rows = con.execute(sql, params).fetchall()
        return [dict(r) for r in rows]
    finally:
        con.close()

def fn_search(term):
    """Search functions and types in the registry by name or description.

    Example:
        fn_search("slice")
        fn_search("finance")
    """
    # Prefix match on both the name and description FTS columns
    fts_term = f"name:{term}* OR description:{term}*"
    functions = fn_query(
        "SELECT id, kind, purity, lang, description FROM functions "
        "WHERE id IN (SELECT id FROM functions_fts WHERE functions_fts MATCH ?) "
        "ORDER BY name", (fts_term,)
    )
    types = fn_query(
        "SELECT id, algebraic, lang, description FROM types "
        "WHERE id IN (SELECT id FROM types_fts WHERE types_fts MATCH ?) "
        "ORDER BY name", (fts_term,)
    )
    return {"functions": functions, "types": types}

def fn_code(function_id):
    """Return the source code of a registry function.

    Example:
        print(fn_code("filter_list_py_core"))
    """
    rows = fn_query("SELECT code FROM functions WHERE id = ?", (function_id,))
    if not rows:
        raise KeyError(f"Function not found: {function_id}")
    return rows[0]["code"]

# ── Welcome message ─────────────────────────────────────────
print(f"fn_registry connected: {FN_REGISTRY_ROOT}")
print(f"  registry.db: {'OK' if _REGISTRY_DB.exists() else 'NOT FOUND'}")
print(f"  Python functions: {_pf}")
print("  Helpers: fn_query(), fn_search(), fn_code()")
@@ -0,0 +1 @@
3.13
@@ -0,0 +1,17 @@
---
name: happy_robot_calls
lang: py
domain: datascience
description: "EDA of run_events (Happy Robot AI agent) and call_transactions (call center), cross analysis"
tags: [happy_robot, call_center, mutua, eda]
uses_functions: []
uses_types: []
framework: "jupyterlab"
entry_point: "notebooks/main.ipynb"
dir_path: "projects/aurgi/analysis/happy_robot_calls"
repo_url: ""
---

## Notes

EDA of run_events (Happy Robot AI agent) and call_transactions (call center), cross analysis
@@ -0,0 +1,6 @@
def main():
    print("Hello from happy-robot-calls!")


if __name__ == "__main__":
    main()
File diff suppressed because one or more lines are too long
File diff suppressed because it is too large
@@ -0,0 +1,895 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "a0b1c2d3",
   "metadata": {},
   "source": [
    "# 03 - Combined EDA: Happy Robot + Call Transactions\n",
    "\n",
    "Cross analysis between the calls handled by the AI agent (Happy Robot) and the call center transactions.\n",
    "\n",
    "**Tables:**\n",
    "- `happy_robot_publicpublic.run_events` (72 rows) - AI agent events for auto glass (lunas) operations\n",
    "- `psql_dcpublic.call_transactions` (24.7M rows) - call center transactions\n",
    "\n",
    "**Join bridges:**\n",
    "1. Phone: `source` <-> `telefonos` / `orig` (normalizing +34, spaces, asterisks)\n",
    "2. License plate: `license_plate` <-> `cliente`\n",
    "3. Time window: `time` <-> `date_time` (proximity in minutes)\n",
    "4. Campaign filter: Mutua campaigns in call_transactions"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1c2d3e4",
   "metadata": {},
   "source": [
    "## 1. Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c2d3e4f5",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import pandas as pd\n",
    "import plotly.express as px\n",
    "import plotly.graph_objects as go\n",
    "from dotenv import load_dotenv\n",
    "load_dotenv()\n",
    "\n",
    "import sys\n",
    "sys.path.insert(0, os.path.join(os.environ[\"FN_REGISTRY_ROOT\"], \"python\", \"functions\"))\n",
    "from metabase.client import MetabaseClient\n",
    "\n",
    "client = MetabaseClient(os.environ[\"METABASE_URL\"], os.environ[\"METABASE_API_KEY\"])\n",
    "\n",
    "def query_to_df(sql: str) -> pd.DataFrame:\n",
    "    \"\"\"Execute SQL against BigQuery via Metabase API, return DataFrame.\"\"\"\n",
    "    result = client.request('POST', '/api/dataset', json={\n",
    "        'database': 6, 'type': 'native', 'native': {'query': sql}\n",
    "    })\n",
    "    cols = [c['name'] for c in result['data']['cols']]\n",
    "    return pd.DataFrame(result['data']['rows'], columns=cols)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3e4f5a6",
   "metadata": {},
   "source": [
    "## 2. Loading run_events"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e4f5a6b7",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_re = query_to_df(\"\"\"\n",
    "SELECT *\n",
    "FROM `happy_robot_publicpublic.run_events`\n",
    "ORDER BY time\n",
    "\"\"\")\n",
    "\n",
    "print(f\"run_events: {len(df_re)} rows, {len(df_re.columns)} columns\")\n",
    "df_re.dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f5a6b7c8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Parse datetimes and numeric columns\n",
    "df_re['time'] = pd.to_datetime(df_re['time'])\n",
    "if 'duration' in df_re.columns:\n",
    "    df_re['duration'] = pd.to_numeric(df_re['duration'], errors='coerce')\n",
    "\n",
    "print(f\"Time range: {df_re['time'].min()} -> {df_re['time'].max()}\")\n",
    "print(f\"Classifications: {df_re['classification'].value_counts().to_dict()}\")\n",
    "df_re.head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a6b7c8d9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Statistical summary of run_events\n",
    "print(\"Available columns:\", list(df_re.columns))\n",
    "print(f\"\\nUnique sources (phones): {df_re['source'].nunique()}\")\n",
    "print(f\"Unique license plates: {df_re['license_plate'].nunique()}\")\n",
    "print(f\"Nulls in source: {df_re['source'].isna().sum()}\")\n",
    "print(f\"Nulls in license_plate: {df_re['license_plate'].isna().sum()}\")\n",
    "df_re.describe(include='all')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b7c8d9e0",
   "metadata": {},
   "source": [
    "## 3. Identify Mutua campaigns in call_transactions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c8d9e0f1",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_mutua_campaigns = query_to_df(\"\"\"\n",
    "SELECT\n",
    "    campaign_name,\n",
    "    COUNT(*) as total_calls,\n",
    "    MIN(date_time) as first_call,\n",
    "    MAX(date_time) as last_call\n",
    "FROM `psql_dcpublic.call_transactions`\n",
    "WHERE LOWER(campaign_name) LIKE '%mutua%'\n",
    "GROUP BY campaign_name\n",
    "ORDER BY total_calls DESC\n",
    "\"\"\")\n",
    "\n",
    "print(\"Mutua campaigns found:\")\n",
    "df_mutua_campaigns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d9e0f1a2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Recent volume of Mutua campaigns (since April 2026, the Happy Robot period)\n",
    "df_mutua_recent = query_to_df(\"\"\"\n",
    "SELECT\n",
    "    campaign_name,\n",
    "    DATE(date_time) as dia,\n",
    "    COUNT(*) as calls\n",
    "FROM `psql_dcpublic.call_transactions`\n",
    "WHERE LOWER(campaign_name) LIKE '%mutua%'\n",
    "    AND date_time >= '2026-04-01'\n",
    "GROUP BY campaign_name, DATE(date_time)\n",
    "ORDER BY dia\n",
    "\"\"\")\n",
    "\n",
    "df_mutua_recent['calls'] = pd.to_numeric(df_mutua_recent['calls'])\n",
    "\n",
    "fig = px.bar(df_mutua_recent, x='dia', y='calls', color='campaign_name',\n",
    "             title='Mutua calls per day (since April 2026)',\n",
    "             labels={'calls': 'Calls', 'dia': 'Date', 'campaign_name': 'Campaign'})\n",
    "fig.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0f1a2b3",
   "metadata": {},
   "source": [
    "## 4. Join by phone\n",
    "\n",
    "Normalization:\n",
    "- `run_events.source`: format `+34623122972` -> extract `623122972`\n",
    "- `call_transactions.telefonos`: format ` 623122972` (leading space) -> strip\n",
    "- `call_transactions.orig`: format ` *34623122972` -> strip `*34`\n",
    "\n",
    "Join: normalized phone + same day"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f1a2b3c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_phone_join = query_to_df(\"\"\"\n",
    "WITH re AS (\n",
    "    SELECT *,\n",
    "        REPLACE(REPLACE(source, '+34', ''), ' ', '') AS phone_clean\n",
    "    FROM `happy_robot_publicpublic.run_events`\n",
    "),\n",
    "ct AS (\n",
    "    SELECT *,\n",
    "        REPLACE(REPLACE(REPLACE(TRIM(telefonos), '*34', ''), '+34', ''), ' ', '') AS phone_clean\n",
    "    FROM `psql_dcpublic.call_transactions`\n",
    "    WHERE LOWER(campaign_name) LIKE '%mutua%'\n",
    "        AND date_time >= '2026-04-01'\n",
    ")\n",
    "SELECT\n",
    "    re.id AS re_id,\n",
    "    re.run_id,\n",
    "    re.time AS re_time,\n",
    "    re.classification,\n",
    "    ct.id AS ct_id,\n",
    "    ct.date_time AS ct_time,\n",
    "    ct.campaign_name,\n",
    "    ct.description AS ct_resolution,\n",
    "    ct.agente,\n",
    "    ct.t_convers\n",
    "FROM re\n",
    "JOIN ct ON re.phone_clean = ct.phone_clean\n",
    "    AND DATE(re.time) = DATE(ct.date_time)\n",
    "ORDER BY re.time\n",
    "\"\"\")\n",
    "\n",
    "print(f\"Matches by phone + same day: {len(df_phone_join)}\")\n",
    "print(f\"Distinct run_events with a match: {df_phone_join['re_id'].nunique()}\")\n",
    "df_phone_join.head(20)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a2b3c4d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Also try the join through the orig field (format *34...)\n",
    "df_phone_join_orig = query_to_df(\"\"\"\n",
    "WITH re AS (\n",
    "    SELECT *,\n",
    "        REPLACE(REPLACE(source, '+34', ''), ' ', '') AS phone_clean\n",
    "    FROM `happy_robot_publicpublic.run_events`\n",
    "),\n",
    "ct AS (\n",
    "    SELECT *,\n",
    "        REPLACE(REPLACE(REPLACE(TRIM(orig), '*34', ''), '+34', ''), ' ', '') AS phone_clean\n",
    "    FROM `psql_dcpublic.call_transactions`\n",
    "    WHERE LOWER(campaign_name) LIKE '%mutua%'\n",
    "        AND date_time >= '2026-04-01'\n",
    ")\n",
    "SELECT\n",
    "    re.id AS re_id,\n",
    "    re.run_id,\n",
    "    re.time AS re_time,\n",
    "    re.classification,\n",
    "    ct.id AS ct_id,\n",
    "    ct.date_time AS ct_time,\n",
    "    ct.campaign_name,\n",
    "    ct.description AS ct_resolution,\n",
    "    ct.agente,\n",
    "    ct.t_convers\n",
    "FROM re\n",
    "JOIN ct ON re.phone_clean = ct.phone_clean\n",
    "    AND DATE(re.time) = DATE(ct.date_time)\n",
    "ORDER BY re.time\n",
    "\"\"\")\n",
    "\n",
    "print(f\"Matches by orig + same day: {len(df_phone_join_orig)}\")\n",
    "print(f\"Distinct run_events with a match (orig): {df_phone_join_orig['re_id'].nunique()}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b3c4d5e6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Combine both phone joins (telefonos + orig), dedup by (re_id, ct_id) pair\n",
    "df_phone_all = pd.concat([df_phone_join, df_phone_join_orig]).drop_duplicates(subset=['re_id', 'ct_id'])\n",
    "print(f\"Combined phone matches (dedup): {len(df_phone_all)}\")\n",
    "print(f\"Distinct run_events with a phone match: {df_phone_all['re_id'].nunique()} of {len(df_re)} total\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c4d5e6f7",
   "metadata": {},
   "source": [
    "## 5. Join by license plate\n",
    "\n",
    "- `run_events.license_plate` <-> `call_transactions.cliente`\n",
    "- Both hold license plates (e.g. `0000MHG`, `9928MKH`)\n",
    "- Direct join + same day + Mutua campaigns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d5e6f7a8",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_plate_join = query_to_df(\"\"\"\n",
    "WITH re AS (\n",
    "    SELECT *\n",
    "    FROM `happy_robot_publicpublic.run_events`\n",
    "    WHERE license_plate IS NOT NULL AND TRIM(license_plate) != ''\n",
    "),\n",
    "ct AS (\n",
    "    SELECT *\n",
    "    FROM `psql_dcpublic.call_transactions`\n",
    "    WHERE LOWER(campaign_name) LIKE '%mutua%'\n",
    "        AND date_time >= '2026-04-01'\n",
    "        AND cliente IS NOT NULL AND TRIM(cliente) != ''\n",
    ")\n",
    "SELECT\n",
    "    re.id AS re_id,\n",
    "    re.run_id,\n",
    "    re.time AS re_time,\n",
    "    re.license_plate,\n",
    "    re.classification,\n",
    "    ct.id AS ct_id,\n",
    "    ct.date_time AS ct_time,\n",
    "    ct.campaign_name,\n",
    "    ct.cliente,\n",
    "    ct.description AS ct_resolution,\n",
    "    ct.agente,\n",
    "    ct.t_convers\n",
    "FROM re\n",
    "JOIN ct ON UPPER(TRIM(re.license_plate)) = UPPER(TRIM(ct.cliente))\n",
    "    AND DATE(re.time) = DATE(ct.date_time)\n",
    "ORDER BY re.time\n",
    "\"\"\")\n",
    "\n",
    "print(f\"Matches by license plate + same day: {len(df_plate_join)}\")\n",
    "print(f\"Distinct run_events with a plate match: {df_plate_join['re_id'].nunique()}\")\n",
    "df_plate_join.head(20)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e6f7a8b9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Unmatched plates: which license_plate values never appear in cliente?\n",
    "df_no_match_plates = query_to_df(\"\"\"\n",
    "WITH re AS (\n",
    "    SELECT DISTINCT UPPER(TRIM(license_plate)) AS plate\n",
    "    FROM `happy_robot_publicpublic.run_events`\n",
    "    WHERE license_plate IS NOT NULL AND TRIM(license_plate) != ''\n",
    "),\n",
    "ct AS (\n",
    "    SELECT DISTINCT UPPER(TRIM(cliente)) AS plate\n",
    "    FROM `psql_dcpublic.call_transactions`\n",
    "    WHERE LOWER(campaign_name) LIKE '%mutua%'\n",
    "        AND date_time >= '2026-04-01'\n",
    "        AND cliente IS NOT NULL AND TRIM(cliente) != ''\n",
    ")\n",
    "SELECT re.plate,\n",
    "    CASE WHEN ct.plate IS NOT NULL THEN 'SI' ELSE 'NO' END AS existe_en_ct\n",
    "FROM re\n",
    "LEFT JOIN ct ON re.plate = ct.plate\n",
    "ORDER BY existe_en_ct, re.plate\n",
    "\"\"\")\n",
    "\n",
    "print(\"run_events plates and whether they exist in call_transactions (Mutua, April onwards):\")\n",
    "df_no_match_plates"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f7a8b9c0",
   "metadata": {},
   "source": [
    "## 6. Coverage analysis\n",
    "\n",
    "- What % of run_events have a match in call_transactions?\n",
    "- What % of recent Mutua call_transactions have a match in run_events?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a8b9c0d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Coverage combining both joins (phone + license plate)\n",
    "re_ids_phone = set(df_phone_all['re_id'].unique())\n",
    "re_ids_plate = set(df_plate_join['re_id'].unique())\n",
    "re_ids_any = re_ids_phone | re_ids_plate\n",
    "re_ids_both = re_ids_phone & re_ids_plate\n",
    "\n",
    "total_re = len(df_re)\n",
    "\n",
    "print(\"=== run_events coverage ===\")\n",
    "print(f\"Total run_events:  {total_re}\")\n",
    "print(f\"Match by phone:    {len(re_ids_phone)} ({100*len(re_ids_phone)/total_re:.1f}%)\")\n",
    "print(f\"Match by plate:    {len(re_ids_plate)} ({100*len(re_ids_plate)/total_re:.1f}%)\")\n",
    "print(f\"Match by either:   {len(re_ids_any)} ({100*len(re_ids_any)/total_re:.1f}%)\")\n",
    "print(f\"Match by both:     {len(re_ids_both)} ({100*len(re_ids_both)/total_re:.1f}%)\")\n",
    "print(f\"No match:          {total_re - len(re_ids_any)} ({100*(total_re - len(re_ids_any))/total_re:.1f}%)\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b9c0d1e2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Inverse coverage: what % of recent Mutua call_transactions have a match\n",
    "# (matched on phone or plate only; no same-day condition is applied here)\n",
    "df_ct_coverage = query_to_df(\"\"\"\n",
    "WITH re_phones AS (\n",
    "    SELECT DISTINCT REPLACE(REPLACE(source, '+34', ''), ' ', '') AS phone_clean\n",
    "    FROM `happy_robot_publicpublic.run_events`\n",
    "    WHERE source IS NOT NULL\n",
    "),\n",
    "re_plates AS (\n",
    "    SELECT DISTINCT UPPER(TRIM(license_plate)) AS plate\n",
    "    FROM `happy_robot_publicpublic.run_events`\n",
    "    WHERE license_plate IS NOT NULL AND TRIM(license_plate) != ''\n",
    "),\n",
    "ct AS (\n",
    "    SELECT *,\n",
    "        REPLACE(REPLACE(REPLACE(TRIM(telefonos), '*34', ''), '+34', ''), ' ', '') AS phone_clean,\n",
    "        UPPER(TRIM(cliente)) AS plate_clean\n",
    "    FROM `psql_dcpublic.call_transactions`\n",
    "    WHERE LOWER(campaign_name) LIKE '%mutua%'\n",
    "        AND date_time >= '2026-04-01'\n",
    ")\n",
    "SELECT\n",
    "    COUNT(*) AS total_ct_mutua,\n",
    "    COUNTIF(rp.phone_clean IS NOT NULL OR rpl.plate IS NOT NULL) AS matched,\n",
    "    COUNTIF(rp.phone_clean IS NULL AND rpl.plate IS NULL) AS unmatched\n",
    "FROM ct\n",
    "LEFT JOIN re_phones rp ON ct.phone_clean = rp.phone_clean\n",
    "LEFT JOIN re_plates rpl ON ct.plate_clean = rpl.plate\n",
    "\"\"\")\n",
    "\n",
    "print(\"=== Coverage of Mutua call_transactions (April onwards) ===\")\n",
    "df_ct_coverage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0d1e2f3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Coverage visualization\n",
    "coverage_data = pd.DataFrame({\n",
    "    'Method': ['Phone only', 'Plate only', 'Both', 'No match'],\n",
    "    'Count': [\n",
    "        len(re_ids_phone - re_ids_plate),\n",
    "        len(re_ids_plate - re_ids_phone),\n",
    "        len(re_ids_both),\n",
    "        total_re - len(re_ids_any)\n",
    "    ]\n",
    "})\n",
    "\n",
    "fig = px.pie(coverage_data, values='Count', names='Method',\n",
    "             title=f'run_events coverage ({total_re} total)',\n",
    "             color_discrete_sequence=px.colors.qualitative.Set2)\n",
    "fig.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d1e2f3a4",
   "metadata": {},
   "source": [
    "## 7. Comparing outcomes\n",
    "\n",
    "For the matched pairs: compare the Happy Robot classification with the call center resolution.\n",
    "Are calls handled by the AI resolved differently?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2f3a4b5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# More complete join using both bridges, for the outcome analysis\n",
    "df_matched = query_to_df(\"\"\"\n",
    "WITH re AS (\n",
    "    SELECT *,\n",
    "        REPLACE(REPLACE(source, '+34', ''), ' ', '') AS phone_clean\n",
    "    FROM `happy_robot_publicpublic.run_events`\n",
    "),\n",
    "ct AS (\n",
    "    SELECT *,\n",
    "        REPLACE(REPLACE(REPLACE(TRIM(telefonos), '*34', ''), '+34', ''), ' ', '') AS phone_clean,\n",
    "        UPPER(TRIM(cliente)) AS plate_clean\n",
    "    FROM `psql_dcpublic.call_transactions`\n",
    "    WHERE LOWER(campaign_name) LIKE '%mutua%'\n",
    "        AND date_time >= '2026-04-01'\n",
    ")\n",
    "SELECT DISTINCT\n",
    "    re.id AS re_id,\n",
    "    re.run_id,\n",
    "    re.time AS re_time,\n",
    "    re.classification AS hr_classification,\n",
    "    re.source AS hr_phone,\n",
    "    re.license_plate AS hr_plate,\n",
    "    ct.id AS ct_id,\n",
    "    ct.date_time AS ct_time,\n",
    "    ct.campaign_name,\n",
    "    ct.description AS ct_resolution,\n",
    "    ct.agente AS ct_agente,\n",
    "    ct.t_convers AS ct_duration,\n",
    "    CASE\n",
    "        WHEN re.phone_clean = ct.phone_clean THEN 'telefono'\n",
    "        WHEN UPPER(TRIM(re.license_plate)) = ct.plate_clean THEN 'matricula'\n",
    "    END AS join_method\n",
    "FROM re\n",
    "JOIN ct ON (\n",
    "        (re.phone_clean = ct.phone_clean)\n",
    "        OR (UPPER(TRIM(re.license_plate)) = ct.plate_clean AND re.license_plate IS NOT NULL AND TRIM(re.license_plate) != '')\n",
    "    )\n",
    "    AND DATE(re.time) = DATE(ct.date_time)\n",
    "ORDER BY re.time\n",
    "\"\"\")\n",
    "\n",
    "print(f\"Matched pairs (dedup): {len(df_matched)}\")\n",
    "print(f\"\\nHappy Robot classifications in matches:\")\n",
    "print(df_matched['hr_classification'].value_counts())\n",
    "print(f\"\\nCall center resolutions in matches:\")\n",
    "print(df_matched['ct_resolution'].value_counts())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f3a4b5c6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Confusion matrix: HR classification vs CT resolution\n",
    "if len(df_matched) > 0:\n",
    "    confusion = pd.crosstab(\n",
    "        df_matched['hr_classification'],\n",
    "        df_matched['ct_resolution'],\n",
    "        margins=True\n",
    "    )\n",
    "    print(\"Matrix: HR classification (rows) vs CT resolution (columns)\")\n",
    "    display(confusion)\n",
    "else:\n",
    "    print(\"No matched pairs to compare\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a4b5c6d7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Heatmap of the comparison\n",
    "if len(df_matched) > 0:\n",
    "    cross = pd.crosstab(df_matched['hr_classification'], df_matched['ct_resolution'])\n",
    "    fig = px.imshow(cross,\n",
    "                    labels=dict(x='Call Center Resolution', y='Happy Robot Classification', color='Count'),\n",
    "                    title='Happy Robot Classification vs Call Center Resolution',\n",
    "                    text_auto=True,\n",
    "                    color_continuous_scale='Blues')\n",
    "    fig.update_layout(width=900, height=500)\n",
    "    fig.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b5c6d7e8",
   "metadata": {},
   "source": [
    "## 8. Timeline of a call\n",
    "\n",
    "For one matched pair, reconstruct the full lifecycle:\n",
    "when it entered call_transactions, when Happy Robot processed it, and what the outcome was in each system."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c6d7e8f9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Select a sample pair (the first matched row; completeness is not checked)\n",
    "if len(df_matched) > 0:\n",
    "    sample = df_matched.iloc[0]\n",
    "    print(\"=== Call timeline ===\")\n",
    "    print(f\"run_event ID: {sample['re_id']}\")\n",
    "    print(f\"call_transaction ID: {sample['ct_id']}\")\n",
    "    print(f\"Join method: {sample['join_method']}\")\n",
    "    print()\n",
    "    print(\"--- Happy Robot ---\")\n",
    "    print(f\"  Time: {sample['re_time']}\")\n",
    "    print(f\"  Phone: {sample['hr_phone']}\")\n",
    "    print(f\"  License plate: {sample['hr_plate']}\")\n",
    "    print(f\"  Classification: {sample['hr_classification']}\")\n",
    "    print(f\"  Run ID: {sample['run_id']}\")\n",
    "    print()\n",
    "    print(\"--- Call Center ---\")\n",
    "    print(f\"  Time: {sample['ct_time']}\")\n",
    "    print(f\"  Campaign: {sample['campaign_name']}\")\n",
    "    print(f\"  Resolution: {sample['ct_resolution']}\")\n",
    "    print(f\"  Agent: {sample['ct_agente']}\")\n",
    "    print(f\"  Conversation duration: {sample['ct_duration']}s\")\n",
    "else:\n",
    "    print(\"No matched pairs\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d7e8f9a0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Full detail of the selected run_event, fetched from the database\n",
    "if len(df_matched) > 0:\n",
    "    sample_re_id = df_matched.iloc[0]['re_id']\n",
    "    df_re_detail = query_to_df(f\"\"\"\n",
    "        SELECT *\n",
    "        FROM `happy_robot_publicpublic.run_events`\n",
    "        WHERE id = '{sample_re_id}'\n",
    "    \"\"\")\n",
    "    # Guard against an empty result before indexing the first row\n",
    "    if df_re_detail.empty:\n",
    "        print(f\"run_event {sample_re_id} not found\")\n",
    "    else:\n",
    "        print(\"Full run_event detail:\")\n",
    "        for col in df_re_detail.columns:\n",
    "            print(f\"  {col}: {df_re_detail.iloc[0][col]}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e8f9a0b1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Visual timeline: all matches in chronological order\n",
    "if len(df_matched) > 0:\n",
    "    df_timeline = df_matched.copy()\n",
    "    df_timeline['re_time'] = pd.to_datetime(df_timeline['re_time'])\n",
    "    df_timeline['ct_time'] = pd.to_datetime(df_timeline['ct_time'])\n",
    "    df_timeline['diff_minutes'] = (df_timeline['re_time'] - df_timeline['ct_time']).dt.total_seconds() / 60\n",
    "\n",
    "    fig = px.scatter(df_timeline, x='re_time', y='diff_minutes',\n",
    "                     color='hr_classification',\n",
    "                     hover_data=['re_id', 'ct_id', 'hr_classification', 'ct_resolution'],\n",
    "                     title='Time difference: Happy Robot vs Call Center (minutes)',\n",
    "                     labels={'re_time': 'Happy Robot time', 'diff_minutes': 'Difference (min)'})\n",
    "    fig.add_hline(y=0, line_dash='dash', line_color='gray')\n",
    "    fig.show()\n",
    "\n",
    "    print(f\"\\nMean time difference: {df_timeline['diff_minutes'].mean():.1f} min\")\n",
    "    print(f\"Median time difference: {df_timeline['diff_minutes'].median():.1f} min\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f9a0b1c2",
   "metadata": {},
   "source": [
    "## 9. Cross-system metrics\n",
    "\n",
    "For matched calls:\n",
    "- Duration comparison: `run_events.duration` vs `call_transactions.t_convers`\n",
    "- Success/resolution rate in each system"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a0b1c2d3a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Compare durations where both are available\n",
    "df_durations = query_to_df(\"\"\"\n",
    "WITH re AS (\n",
    "    SELECT *,\n",
    "        REPLACE(REPLACE(source, '+34', ''), ' ', '') AS phone_clean\n",
    "    FROM `happy_robot_publicpublic.run_events`\n",
    "),\n",
    "ct AS (\n",
    "    SELECT *,\n",
    "        REPLACE(REPLACE(REPLACE(TRIM(telefonos), '*34', ''), '+34', ''), ' ', '') AS phone_clean\n",
    "    FROM `psql_dcpublic.call_transactions`\n",
    "    WHERE LOWER(campaign_name) LIKE '%mutua%'\n",
    "        AND date_time >= '2026-04-01'\n",
    ")\n",
    "SELECT\n",
    "    re.id AS re_id,\n",
    "    re.duration AS hr_duration,\n",
    "    ct.t_convers AS ct_duration,\n",
    "    re.classification AS hr_classification,\n",
    "    ct.description AS ct_resolution\n",
    "FROM re\n",
    "JOIN ct ON re.phone_clean = ct.phone_clean\n",
    "    AND DATE(re.time) = DATE(ct.date_time)\n",
    "WHERE re.duration IS NOT NULL AND ct.t_convers IS NOT NULL\n",
    "ORDER BY re.time\n",
    "\"\"\")\n",
    "\n",
    "if len(df_durations) > 0:\n",
    "    df_durations['hr_duration'] = pd.to_numeric(df_durations['hr_duration'], errors='coerce')\n",
    "    df_durations['ct_duration'] = pd.to_numeric(df_durations['ct_duration'], errors='coerce')\n",
    "\n",
    "    print(f\"Pairs with a duration in both systems: {len(df_durations)}\")\n",
    "    print(\"\\nHappy Robot duration (s):\")\n",
    "    print(df_durations['hr_duration'].describe())\n",
    "    print(\"\\nCall Center duration (s):\")\n",
    "    print(df_durations['ct_duration'].describe())\n",
    "else:\n",
    "    print(\"No pairs with a duration in both systems\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1c2d3e4a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Scatter: HR duration vs CT duration\n",
    "if len(df_durations) > 0 and df_durations['hr_duration'].notna().sum() > 0:\n",
    "    fig = px.scatter(df_durations, x='hr_duration', y='ct_duration',\n",
    "                     color='hr_classification',\n",
    "                     hover_data=['re_id', 'ct_resolution'],\n",
    "                     title='Duration: Happy Robot vs Call Center (seconds)',\n",
    "                     labels={'hr_duration': 'Happy Robot (s)', 'ct_duration': 'Call Center (s)'})\n",
    "    # Diagonal line (equal duration); the pandas max skips NaN values\n",
    "    max_dur = df_durations[['hr_duration', 'ct_duration']].max().max()\n",
    "    fig.add_trace(go.Scatter(x=[0, max_dur], y=[0, max_dur],\n",
    "                             mode='lines', line=dict(dash='dash', color='gray'),\n",
    "                             name='Equal duration'))\n",
    "    fig.show()\n",
    "else:\n",
    "    print(\"Not enough duration data for the scatter plot\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c2d3e4f5a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Outcome distribution per system\n",
    "if len(df_matched) > 0:\n",
    "    print(\"=== Resolution rates in matched pairs ===\")\n",
    "    print(\"\\nHappy Robot - classifications:\")\n",
    "    hr_counts = df_matched['hr_classification'].value_counts()\n",
    "    for cls, count in hr_counts.items():\n",
    "        print(f\"  {cls}: {count} ({100*count/len(df_matched):.1f}%)\")\n",
    "\n",
    "    print(\"\\nCall Center - resolutions:\")\n",
    "    ct_counts = df_matched['ct_resolution'].value_counts()\n",
    "    for res, count in ct_counts.items():\n",
    "        print(f\"  {res}: {count} ({100*count/len(df_matched):.1f}%)\")\n",
    "else:\n",
    "    print(\"No matched pairs\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3e4f5a6a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Side-by-side distribution of classifications and resolutions\n",
    "if len(df_matched) > 0:\n",
    "    fig = go.Figure()\n",
    "\n",
    "    hr_vc = df_matched['hr_classification'].value_counts()\n",
    "    ct_vc = df_matched['ct_resolution'].value_counts().head(10)\n",
    "\n",
    "    fig.add_trace(go.Bar(name='Happy Robot', x=hr_vc.index, y=hr_vc.values))\n",
    "\n",
    "    fig.update_layout(\n",
    "        title='Classification distribution - Happy Robot (matched pairs)',\n",
    "        xaxis_title='Classification', yaxis_title='Count',\n",
    "        barmode='group'\n",
    "    )\n",
    "    fig.show()\n",
    "\n",
    "    fig2 = go.Figure()\n",
    "    fig2.add_trace(go.Bar(name='Call Center', x=ct_vc.index, y=ct_vc.values,\n",
    "                          marker_color='coral'))\n",
    "    fig2.update_layout(\n",
    "        title='Resolution distribution - Call Center (matched pairs, top 10)',\n",
    "        xaxis_title='Resolution', yaxis_title='Count'\n",
    "    )\n",
    "    fig2.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e4f5a6b7a",
   "metadata": {},
   "source": [
    "## 10. Summary and findings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f5a6b7c8a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Final quantitative summary\n",
    "print(\"=\"*60)\n",
    "print(\"SUMMARY: Combined EDA, Happy Robot + Call Transactions\")\n",
    "print(\"=\"*60)\n",
    "print(\"\\n1. VOLUMES\")\n",
    "print(f\"   run_events (Happy Robot): {total_re} rows\")\n",
    "print(\"   call_transactions Mutua (April onward): see cell 3\")\n",
    "print()\n",
    "print(\"2. COVERAGE\")\n",
    "print(f\"   Match by phone: {len(re_ids_phone)} run_events ({100*len(re_ids_phone)/total_re:.1f}%)\")\n",
    "print(f\"   Match by license plate: {len(re_ids_plate)} run_events ({100*len(re_ids_plate)/total_re:.1f}%)\")\n",
    "print(f\"   Match by any method: {len(re_ids_any)} run_events ({100*len(re_ids_any)/total_re:.1f}%)\")\n",
    "print(f\"   No match: {total_re - len(re_ids_any)} run_events ({100*(total_re - len(re_ids_any))/total_re:.1f}%)\")\n",
    "print()\n",
    "print(\"3. JOIN BRIDGES\")\n",
    "print(\"   Phone: source(+34...) <-> telefonos/orig, normalized, same day\")\n",
    "print(\"   License plate: license_plate <-> cliente, same day\")\n",
    "print(\"   Filter: Mutua campaigns only in call_transactions\")\n",
    "print()\n",
    "print(f\"4. MATCHED PAIRS: {len(df_matched)}\")\n",
    "if len(df_matched) > 0:\n",
    "    print(f\"   HR classifications: {df_matched['hr_classification'].value_counts().to_dict()}\")\n",
    "    print(f\"   CT resolutions (top 5): {df_matched['ct_resolution'].value_counts().head(5).to_dict()}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a6b7c8d9a",
   "metadata": {},
   "source": [
    "### Key findings\n",
    "\n",
    "**To be filled in after execution:**\n",
    "\n",
    "1. **Match rate**: X% of Happy Robot calls are found in call_transactions\n",
    "2. **Most effective join method**: phone vs license plate\n",
    "3. **Consistency of results**: how the AI agent's classifications compare with the human resolutions\n",
    "4. **Time difference**: how much time passes between the record in each system\n",
    "5. **Duration comparison**: are the AI agent's calls shorter or longer than the human ones?\n",
    "6. **Relevant campaigns**: which specific Mutua campaigns overlap with Happy Robot\n",
    "\n",
    "### Next steps\n",
    "\n",
    "- Widen the time window if matching is low (try +/- 1 day instead of same day)\n",
    "- Analyze the unmatched calls: why do they not appear in call_transactions?\n",
    "- Compare against non-Mutua campaigns in case some calls are misclassified\n",
    "- Assess whether Happy Robot reduces the call center load or generates duplicate calls"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
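
The notebook's closing notes propose trying a +/- 1 day window instead of a same-day match when coverage is low. A minimal pandas sketch of that idea follows, assuming both tables have already been loaded into DataFrames with a normalized `phone_clean` column. The helper name `match_within_days` and the toy rows are hypothetical and only illustrate the filter; they are not part of the notebook.

```python
import pandas as pd


def match_within_days(re_df: pd.DataFrame, ct_df: pd.DataFrame, max_days: int = 1) -> pd.DataFrame:
    """Join on phone_clean and keep pairs whose calendar dates differ by at most max_days.

    Hypothetical helper: returns a new DataFrame and does not modify its inputs.
    """
    pairs = re_df.merge(ct_df, on="phone_clean", how="inner")
    # Compare calendar dates (midnight-normalized), mirroring DATE(re.time) vs DATE(ct.date_time)
    day_gap = (pairs["re_time"].dt.normalize() - pairs["ct_time"].dt.normalize()).abs()
    return pairs.loc[day_gap <= pd.Timedelta(days=max_days)].reset_index(drop=True)


# Toy data (invented for illustration only)
re_df = pd.DataFrame({
    "re_id": ["r1", "r2"],
    "phone_clean": ["600111222", "600333444"],
    "re_time": pd.to_datetime(["2026-04-02 23:50", "2026-04-05 10:00"]),
})
ct_df = pd.DataFrame({
    "ct_id": ["c1", "c2"],
    "phone_clean": ["600111222", "600333444"],
    "ct_time": pd.to_datetime(["2026-04-03 00:05", "2026-04-09 10:00"]),
})

same_day = match_within_days(re_df, ct_df, max_days=0)
one_day = match_within_days(re_df, ct_df, max_days=1)
print(len(same_day), len(one_day))
```

In this toy data the first pair straddles midnight, so a same-day rule drops it while the +/- 1 day rule keeps it. A wider window can also admit unrelated calls from the same number, so the extra matches would still need a manual check.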
@@ -0,0 +1,17 @@
[project]
name = "happy-robot-calls"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = [
    "httpx>=0.28.1",
    "jupyter>=1.1.1",
    "jupyter-collaboration>=4.3.0",
    "jupyter-mcp-server>=0.4.0",
    "jupyterlab>=4.5.6",
    "matplotlib>=3.10.8",
    "numpy>=2.4.4",
    "pandas>=3.0.2",
    "plotly>=6.7.0",
]
Executable
+50
@@ -0,0 +1,50 @@
#!/bin/bash
# Jupyter Lab: collaborative mode with automatic port detection
# Generated by write_jupyter_launcher (fn_registry)

find_free_port() {
    for port in 8888 8889 8890 8891 8892 8893 8894 8895 8896 8897 8898 8899; do
        if ! ss -tln 2>/dev/null | grep -q ":${port} " && \
           ! lsof -i:"$port" >/dev/null 2>&1; then
            echo "$port"
            return
        fi
    done
    # Fallback when none of the candidate ports looked free
    echo 8888
}

PORT=${1:-$(find_free_port)}
cd "$(dirname "$0")" || exit 1

echo "$PORT" > .jupyter-port

source .venv/bin/activate 2>/dev/null || true

# IPython startup: load the local .ipython/ (FN_REGISTRY_ROOT, helpers, sys.path)
if [ -d "$(pwd)/.ipython" ]; then
    export IPYTHONDIR="$(pwd)/.ipython"
fi

if ! python -c "import jupyter_collaboration" 2>/dev/null; then
    echo "ERROR: jupyter-collaboration is not installed"
    echo "Install with: uv add jupyter-collaboration"
    exit 1
fi

echo "════════════════════════════════════════════════"
echo "  Jupyter Lab + collaboration on port $PORT"
echo "════════════════════════════════════════════════"
echo ""
echo "  Open: http://localhost:$PORT"
echo "  Ctrl+C to stop"
echo ""

jupyter lab \
    --port="$PORT" \
    --no-browser \
    --ServerApp.token='' \
    --ServerApp.password='' \
    --ServerApp.disable_check_xsrf=True \
    --ServerApp.allow_origin='*' \
    --ServerApp.root_dir="$(pwd)" \
    --collaborative