{ "cells": [ { "cell_type": "markdown", "id": "3451405a", "metadata": {}, "source": [ "# Spike 01: SD Turbo via diffusers backend\n", "\n", "**Goal:** validate the `GenerationConfig_py_ml` contract + the `diffusers_generate_py_ml` backend with SD Turbo. Sanity check before investing in sd.cpp / FLUX.\n", "\n", "**PASS criteria:**\n", "- 4 images generated, deterministic seeds, same seed → identical image across runs.\n", "- `vram_peak_mb` < 6000 MB.\n", "- `duration_ms` < 5000 ms/image.\n", "- `image_grid` 2x2 visible.\n", "\n", "**FAIL criteria:** any of the above is not met → rethink the contract or the backend.\n", "\n", "**NOT validated here:** LoRA loading, comparison vs sd.cpp, samplers != euler_a (this spike only runs SD Turbo in its 1-step `euler_a` configuration).\n", "\n", "**Registry functions used:**\n", "- `cuda_available_py_ml`, `gpu_info_py_ml`, `vram_budget_py_ml`\n", "- `diffusers_load_pipeline_py_ml`, `diffusers_generate_py_ml`, `diffusers_unload_py_ml`\n", "- `image_grid_py_ml`, `image_save_png_py_ml`\n", "- Types: `GenerationConfig`, `ModelRef`, `ImageGenResult`" ] }, { "cell_type": "markdown", "id": "4b43e6a2", "metadata": {}, "source": [ "## 1. Hardware check\n", "\n", "Verify CUDA + GPU before starting. Without a GPU, the rest of the notebook will run on CPU (slow)." 
] }, { "cell_type": "code", "execution_count": 1, "id": "5f395460", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CUDA: {'available': True, 'device_count': 1, 'devices': ['NVIDIA GeForce RTX 3070'], 'torch_version': '2.11.0+cu130', 'cuda_version': '13.0'}\n", "GPUs: [{'index': 0, 'name': 'NVIDIA GeForce RTX 3070', 'vram_total_mb': 8192, 'vram_free_mb': 4404, 'driver_version': '591.86', 'cuda_version': '8.6'}]\n" ] } ], "source": [ "import sys, os\n", "FN_ROOT = os.environ.get(\"FN_REGISTRY_ROOT\", \"/home/lucas/fn_registry\")\n", "sys.path.insert(0, os.path.join(FN_ROOT, \"python/functions/ml\"))\n", "\n", "from cuda_available import cuda_available\n", "from gpu_info import gpu_info\n", "from vram_budget import vram_budget\n", "\n", "cuda = cuda_available()\n", "gpus = gpu_info()\n", "print(\"CUDA:\", cuda)\n", "print(\"GPUs:\", gpus)" ] }, { "cell_type": "markdown", "id": "6b7c5272", "metadata": {}, "source": [ "## 2. VRAM hypothesis\n", "\n", "SD Turbo is distilled from SD 2.1 rather than SD 1.5, so `sd15/fp16` (~1.5GB of fp16 weights) is only a rough proxy when calling `vram_budget`. We expect `fits=True` and `required_mb` of roughly 2000-4000." ] }, { "cell_type": "code", "execution_count": 2, "id": "0b7eb904", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "VRAM total: 8192 MB\n", "Budget: {'required_mb': 6196, 'fits': True, 'headroom_mb': 1996, 'warning': None}\n" ] } ], "source": [ "total_mb = gpus[0][\"vram_total_mb\"] if gpus else 8000  # fallback for CPU/sim\n", "budget = vram_budget(\n", "    gpu_vram_total_mb=total_mb,\n", "    model_type=\"sd15\",\n", "    quantization=\"fp16\",\n", "    width=512, height=512,\n", ")\n", "print(\"VRAM total:\", total_mb, \"MB\")\n", "print(\"Budget:\", budget)" ] }, { "cell_type": "markdown", "id": "a89fec8c", "metadata": {}, "source": [ "## 3. Build 4 `GenerationConfig` objects\n", "\n", "Fixed seeds: 42, 123, 7, 999. Varied prompts. SD Turbo: `steps=1`, `cfg_scale=0.0` (no guidance), `sampler=euler_a`." 
] }, { "cell_type": "code", "execution_count": 3, "id": "b210dfc6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "42 a cinematic shot of a baby racoon wearing a tiny crown\n", "123 watercolor of a cozy library with floating books\n", "7 isometric pixel art of a robot fishing on a pier\n", "999 oil painting of a fox playing chess against a cat\n" ] } ], "source": [ "# Force clean imports to avoid a pydantic double-import\n", "for _mod in [\"sampler_name\", \"model_ref\", \"lora_ref\", \"generation_config\", \"image_gen_result\"]:\n", "    sys.modules.pop(_mod, None)\n", "\n", "from model_ref import ModelRef\n", "from generation_config import GenerationConfig\n", "\n", "VAULT = \"/home/lucas/vaults/imagegen_models/diffusers/sd-turbo\"\n", "\n", "model = ModelRef(\n", "    name=\"stabilityai/sd-turbo\",\n", "    model_type=\"sd15\",\n", "    quantization=\"fp16\",\n", "    path=VAULT,\n", ")\n", "\n", "PROMPTS = [\n", "    (\"a cinematic shot of a baby racoon wearing a tiny crown\", 42),\n", "    (\"watercolor of a cozy library with floating books\", 123),\n", "    (\"isometric pixel art of a robot fishing on a pier\", 7),\n", "    (\"oil painting of a fox playing chess against a cat\", 999),\n", "]\n", "\n", "configs = [\n", "    GenerationConfig(\n", "        prompt=p,\n", "        seed=s,\n", "        steps=1,\n", "        cfg_scale=0.0,\n", "        sampler=\"euler_a\",\n", "        width=512,\n", "        height=512,\n", "        model=model,\n", "    )\n", "    for p, s in PROMPTS\n", "]\n", "for c in configs:\n", "    print(c.seed, c.prompt[:60])" ] }, { "cell_type": "markdown", "id": "743b6bb2", "metadata": {}, "source": [ "## 4. Load the SD Turbo pipeline\n", "\n", "First load takes ~10-30s (moves the weights into VRAM). Subsequent calls are cached." 
] }, { "cell_type": "code", "execution_count": 4, "id": "de0f03b2", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "[transformers] `CLIPImageProcessor` requires torchvision (not installed); falling back to `CLIPImageProcessorPil` for backward compatibility. Install torchvision to use the default backend, or import `CLIPImageProcessorPil` directly to silence this warning.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[transformers] `SiglipImageProcessor` requires torchvision (not installed); falling back to `SiglipImageProcessorPil` for backward compatibility. Install torchvision to use the default backend, or import `SiglipImageProcessorPil` directly to silence this warning.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[transformers] `Siglip2ImageProcessorFast` is deprecated. The `Fast` suffix for image processors has been removed; use `Siglip2ImageProcessor` instead.\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b08c215efc5b4af7aab91bc7aeebc2af", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Loading pipeline components...: 0%| | 0/5 [00:00<?, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. 
For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Pipeline: StableDiffusionPipeline\n", "Scheduler: EulerDiscreteScheduler\n", "Device: cuda:0\n" ] } ], "source": [ "from diffusers_load_pipeline import diffusers_load_pipeline\n", "\n", "pipe = diffusers_load_pipeline(model=model, device=\"auto\", dtype=\"fp16\")\n", "print(\"Pipeline:\", type(pipe).__name__)\n", "print(\"Scheduler:\", type(pipe.scheduler).__name__)\n", "print(\"Device:\", next(pipe.unet.parameters()).device)" ] }, { "cell_type": "markdown", "id": "a328f117", "metadata": {}, "source": [ "## 5. Generate 4 images\n", "\n", "One at a time. Captures `duration_ms` + `vram_peak_mb` per image. SD Turbo 1-step: expected <2s on a modern NVIDIA GPU." ] }, { "cell_type": "code", "execution_count": 5, "id": "d1780d91", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6fdcc6415e644121bf9e10a2e46d2bbc", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/1 [00:00" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from image_grid import image_grid\n", "\n", "labels = [f\"seed={r.meta.get('seed', '?')} | {c.prompt[:40]}...\" for c, r in zip(configs2, results)]\n", "grid = image_grid(\n", " images=[r.image for r in results],\n", " cols=2,\n", " labels=labels,\n", " gap_px=12,\n", ")\n", "grid" ] }, { "cell_type": "markdown", "id": "ae4e3184", "metadata": {}, "source": [ "## 7. Save grid + configs to the vault\n", "\n", "PNG with embedded metadata (prompt/seed) in `~/vaults/imagegen_models/outputs/`. Canonical JSON in `configs/`." 
] }, { "cell_type": "code", "execution_count": 7, "id": "f7ca8a69", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "grid: /home/lucas/vaults/imagegen_models/outputs/spike01_grid_20260513_003047.png\n", " [0] /home/lucas/vaults/imagegen_models/outputs/spike01_seed42_20260513_003047.png\n", " /home/lucas/vaults/imagegen_models/configs/spike01_seed42_20260513_003047.json\n", " [1] /home/lucas/vaults/imagegen_models/outputs/spike01_seed123_20260513_003047.png\n", " /home/lucas/vaults/imagegen_models/configs/spike01_seed123_20260513_003047.json\n", " [2] /home/lucas/vaults/imagegen_models/outputs/spike01_seed7_20260513_003047.png\n", " /home/lucas/vaults/imagegen_models/configs/spike01_seed7_20260513_003047.json\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " [3] /home/lucas/vaults/imagegen_models/outputs/spike01_seed999_20260513_003047.png\n", " /home/lucas/vaults/imagegen_models/configs/spike01_seed999_20260513_003047.json\n" ] } ], "source": [ "from image_save_png import image_save_png\n", "from genconfig_save_json import genconfig_save_json\n", "import datetime, os as _os\n", "\n", "OUT_DIR = \"/home/lucas/vaults/imagegen_models/outputs\"\n", "CFG_DIR = \"/home/lucas/vaults/imagegen_models/configs\"\n", "stamp = datetime.datetime.now().strftime(\"%Y%m%d_%H%M%S\")\n", "\n", "grid_path = image_save_png(\n", " grid,\n", " _os.path.join(OUT_DIR, f\"spike01_grid_{stamp}.png\"),\n", " metadata={\"experiment\": \"spike01_sd_turbo\", \"model\": model2.name, \"n\": str(len(results))},\n", ")\n", "print(\"grid:\", grid_path)\n", "\n", "for i, (c, r) in enumerate(zip(configs2, results)):\n", " img_path = image_save_png(\n", " r.image,\n", " _os.path.join(OUT_DIR, f\"spike01_seed{c.seed}_{stamp}.png\"),\n", " metadata={\"prompt\": c.prompt, \"seed\": str(c.seed), \"steps\": str(c.steps), \"sampler\": c.sampler},\n", " )\n", " cfg_path = genconfig_save_json(c, _os.path.join(CFG_DIR, 
f\"spike01_seed{c.seed}_{stamp}.json\"))\n", " print(f\" [{i}] {img_path}\")\n", " print(f\" {cfg_path}\")" ] }, { "cell_type": "markdown", "id": "071982d6", "metadata": {}, "source": [ "## 8. Summary table + verdict" ] }, { "cell_type": "code", "execution_count": 8, "id": "4a4a18c4", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr><th></th><th>seed</th><th>prompt</th><th>duration_ms</th><th>vram_peak_mb</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>42</td><td>a cinematic shot of a baby racoon wearing a ti...</td><td>1615</td><td>3097</td></tr>\n",
 "    <tr><th>1</th><td>123</td><td>watercolor of a cozy library with floating boo...</td><td>192</td><td>3097</td></tr>\n",
 "    <tr><th>2</th><td>7</td><td>isometric pixel art of a robot fishing on a pi...</td><td>189</td><td>3097</td></tr>\n",
 "    <tr><th>3</th><td>999</td><td>oil painting of a fox playing chess against a ...</td><td>197</td><td>3097</td></tr>\n",
 "  </tbody>\n",
 "</table>
\n", "" ], "text/plain": [ " seed prompt duration_ms \\\n", "0 42 a cinematic shot of a baby racoon wearing a ti... 1615 \n", "1 123 watercolor of a cozy library with floating boo... 192 \n", "2 7 isometric pixel art of a robot fishing on a pi... 189 \n", "3 999 oil painting of a fox playing chess against a ... 197 \n", "\n", " vram_peak_mb \n", "0 3097 \n", "1 3097 \n", "2 3097 \n", "3 3097 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Max duration: 1615 ms (target <5000) -> OK\n", "Max VRAM peak: 3097 MB (target <6000) -> OK\n", "\n", "VEREDICTO: PASS\n" ] } ], "source": [ "rows = []\n", "for c, r in zip(configs2, results):\n", "    rows.append({\n", "        \"seed\": c.seed,\n", "        \"prompt\": c.prompt[:50] + \"...\",\n", "        \"duration_ms\": r.duration_ms,\n", "        \"vram_peak_mb\": r.vram_peak_mb,\n", "    })\n", "\n", "try:\n", "    import pandas as pd\n", "    df = pd.DataFrame(rows)\n", "    display(df)\n", "except ImportError:\n", "    for row in rows:\n", "        print(row)\n", "\n", "max_dur = max(r.duration_ms for r in results)\n", "max_vram = max((r.vram_peak_mb or 0) for r in results)\n", "pass_dur = max_dur < 5000\n", "pass_vram = max_vram < 6000\n", "verdict = \"PASS\" if (pass_dur and pass_vram) else \"FAIL\"\n", "print(f\"\\nMax duration: {max_dur} ms (target <5000) -> {'OK' if pass_dur else 'FAIL'}\")\n", "print(f\"Max VRAM peak: {max_vram} MB (target <6000) -> {'OK' if pass_vram else 'FAIL'}\")\n", "print(f\"\\nVEREDICTO: {verdict}\")" ] }, { "cell_type": "markdown", "id": "19701ac7", "metadata": {}, "source": [ "## 9. VRAM cleanup" ] }, { "cell_type": "code", "execution_count": 9, "id": "d25cfb9c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "unloaded\n" ] } ], "source": [ "from diffusers_unload import diffusers_unload\n", "diffusers_unload(pipe)\n", "diffusers_unload(None)  # clear the global cache\n", "print(\"unloaded\")" ] }, { "cell_type": "markdown", "id": "f331dc00", "metadata": {}, "source": [ "## 10. Next steps\n", "\n", "If PASS:\n", "- Notebook `02_seed_reproducibility.ipynb`: same seed, two runs, verify identical hash.\n", "- Notebook `03_sdxl_turbo.ipynb` once it is downloaded (~6.5GB).\n", "- Launch Wave 3.B (sd.cpp Python bindings) + download FLUX schnell GGUF for cross-validation.\n", "\n", "If FAIL:\n", "- Review the `vram_budget` table if the VRAM peak exceeds the target.\n", "- Review `diffusers_generate` if duration exceeds the target (offloading, wrong dtype).\n", "- Rethink the `GenerationConfig` contract if the signature does not fit." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.7" } }, "nbformat": 4, "nbformat_minor": 5 }