118 lines
6.5 KiB
Plaintext
118 lines
6.5 KiB
Plaintext
{
|
|
"cells": [
|
|
{"cell_type": "markdown", "metadata": {}, "source": [
|
|
"# 01 — Panorama del registry por dominio\n",
|
|
"\n",
|
|
"**Objetivo**: ver cuantas funciones tenemos por dominio, pureza/test/pipelines, y listar las **mas interesantes** (por reutilizacion y signature) de cada dominio.\n",
|
|
"\n",
|
|
"**Fuente**: `registry.db` (FTS5 indexado por `fn index`).\n",
|
|
"\n",
|
|
"**Secciones**\n",
|
|
"1. Conteo por dominio + cuota de puras y testeadas\n",
|
|
"2. Top funciones por dominio (curado a mano tras revisar names+desc)\n",
|
|
"3. Conclusiones"
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"import os, sqlite3\n",
|
|
"import pandas as pd\n",
|
|
"ROOT = os.environ['FN_REGISTRY_ROOT']\n",
|
|
"conn = sqlite3.connect(f'file:{ROOT}/registry.db?mode=ro', uri=True)\n",
|
|
"pd.set_option('display.max_colwidth', 110)"
|
|
]},
|
|
{"cell_type": "markdown", "metadata": {}, "source": ["## 1. Cuenta por dominio"]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"q = '''\n",
|
|
"SELECT domain,\n",
|
|
" COUNT(*) AS total,\n",
|
|
" SUM(CASE WHEN purity='pure' THEN 1 ELSE 0 END) AS pure,\n",
|
|
" SUM(CASE WHEN tested=1 THEN 1 ELSE 0 END) AS tested,\n",
|
|
" SUM(CASE WHEN kind='pipeline' THEN 1 ELSE 0 END) AS pipelines\n",
|
|
" FROM functions\n",
|
|
" GROUP BY domain\n",
|
|
" ORDER BY total DESC;\n",
|
|
"'''\n",
|
|
"df = pd.read_sql_query(q, conn)\n",
|
|
"df['pure_pct'] = (100*df['pure']/df['total']).round(1)\n",
|
|
"df['tested_pct'] = (100*df['tested']/df['total']).round(1)\n",
|
|
"df"
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"import matplotlib.pyplot as plt\n",
|
|
"ax = df.set_index('domain')[['pure','total']].plot.bar(figsize=(11,4))\n",
|
|
"ax.set_title('Funciones por dominio (totales y puras)'); plt.tight_layout(); plt.show()"
|
|
]},
|
|
{"cell_type": "markdown", "metadata": {}, "source": [
|
|
"## 2. Top funciones interesantes por dominio\n",
|
|
"\n",
|
|
"Seleccion manual de las funciones mas reutilizables/expresivas de cada bloque (no es ranking automatico — el FTS no captura 'interesante')."
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"def top(domain, ids):\n",
|
|
" qmarks = ','.join('?'*len(ids))\n",
|
|
" df = pd.read_sql_query(\n",
|
|
" f\"SELECT id, lang, purity, signature, description FROM functions WHERE id IN ({qmarks})\",\n",
|
|
" conn, params=ids)\n",
|
|
" print(f'=== {domain} ({len(df)}) ===')\n",
|
|
" return df"
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"top('finance', [\n",
|
|
" 'fetch_ohlcv_go_finance','tick_to_ohlcv_go_finance','stream_ticks_go_finance',\n",
|
|
" 'sma_go_finance','ema_go_finance','rsi_go_finance','vwap_go_finance',\n",
|
|
" 'bollinger_bands_go_finance','sharpe_ratio_go_finance','max_drawdown_go_finance',\n",
|
|
" 'avellaneda_stoikov_quotes_py_finance','hawkes_intensity_py_finance','generate_gbm_prices_py_finance',\n",
|
|
" 'write_ohlcv_to_parquet_go_finance','load_ohlcv_from_duckdb_go_finance'])"
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"top('browser (CDP)', [\n",
|
|
" 'chrome_launch_go_browser','cdp_connect_go_browser','cdp_navigate_go_browser',\n",
|
|
" 'cdp_evaluate_go_browser','cdp_get_html_go_browser','cdp_screenshot_go_browser',\n",
|
|
" 'cdp_click_go_browser','cdp_click_text_go_browser','cdp_find_by_text_go_browser',\n",
|
|
" 'cdp_type_text_go_browser','cdp_wait_element_go_browser','cdp_wait_load_go_browser',\n",
|
|
" 'cdp_har_record_go_browser','cdp_set_cookie_go_browser','cdp_new_tab_go_browser'])"
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"top('infra (HTTP/WS/SSE)', [\n",
|
|
" 'http_get_json_go_infra','http_post_json_go_infra','http_router_go_infra','http_serve_go_infra',\n",
|
|
" 'http_download_file_go_infra','http_cors_middleware_go_infra','http_logger_middleware_go_infra',\n",
|
|
" 'rate_limit_middleware_go_infra','jwt_middleware_go_infra','sse_handler_go_infra',\n",
|
|
" 'sse_send_go_infra','sse_keepalive_go_infra','ws_handler_go_infra','ws_upgrader_go_infra',\n",
|
|
" 'health_check_http_go_infra'])"
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"top('datascience', [r[0] for r in conn.execute(\n",
|
|
" \"SELECT id FROM functions WHERE domain='datascience' AND purity='pure' ORDER BY name LIMIT 15\")])"
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"top('cybersecurity', [r[0] for r in conn.execute(\n",
|
|
" \"SELECT id FROM functions WHERE domain='cybersecurity' ORDER BY tested DESC, name LIMIT 12\")])"
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"top('ml', [r[0] for r in conn.execute(\n",
|
|
" \"SELECT id FROM functions WHERE domain='ml' ORDER BY name LIMIT 15\")])"
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"top('pipelines', [r[0] for r in conn.execute(\n",
|
|
" \"SELECT id FROM functions WHERE domain='pipelines' ORDER BY name LIMIT 15\")])"
|
|
]},
|
|
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
|
|
"top('gamedev (audio)', ['audio_engine_cpp_gamedev','audio_play_cpp_gamedev'])"
|
|
]},
|
|
{"cell_type": "markdown", "metadata": {}, "source": [
|
|
"## 3. Conclusiones\n",
|
|
"\n",
|
|
"- **infra (496)** y **core (240)** dominan: middleware HTTP, SSE/WS, SQLite, helpers Go puros.\n",
|
|
"- **finance (28)** ya tiene un mini stack de trading + market-making: indicadores, OHLCV, simulador Avellaneda-Stoikov + Hawkes + GBM.\n",
|
|
"- **browser (17)** = CDP completo en Go puro (sin chromedp). Base solida para scraping y RPA.\n",
|
|
"- **ml (25)** son casi todos **tipos** (`image_generator`, `model_ref`, `lora_ref`, `generation_config`) — el contrato esta definido, las **funciones de ejecucion estan vacias**.\n",
|
|
"- **audio**: solo playback (miniaudio en `gamedev`). 0 generacion, 0 STT/TTS, 0 voice conversion.\n",
|
|
"- **LLM/text**: 0 clientes — solo tipos `message/part/tool_part` en core."
|
|
]}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {"display_name": "Python 3", "language": "python", "name": "python3"},
|
|
"language_info": {"name": "python", "version": "3.12"}
|
|
},
|
|
"nbformat": 4, "nbformat_minor": 5
|
|
}
|