---
id: "0027"
title: "C++ gl_compute_shader + gl_pingpong_fbo + DAG node Compute"
status: pendiente
type: feature
domain:
  - cpp-stack
scope: multi-app
priority: alta
depends: []
blocks: []
related: []
created: 2026-05-17
updated: 2026-05-17
tags: []
---
# 0027 — C++ gl_compute_shader + gl_pingpong_fbo + DAG node Compute

## APP Metadata

| Field | Value |
|-------|-------|
| **ID** | 0027 |
| **Status** | pending (`pendiente`) |
| **Priority** | high (`alta`) |
| **Type** | feature: C++ gfx (cpp/functions/gfx) |

## Dependencies

Recommended reading first: `gl_loader_cpp_gfx`, `gl_shader_cpp_gfx`, `gl_framebuffer_cpp_gfx`, `dag_catalog_cpp_gfx`, `dag_compile_cpp_gfx`. No blockers.

**Unblocks:** GPGPU simulations (particles, fluids), feedback loops (reaction-diffusion, multipass blur, bloom), and any shader that needs state across frames. Triples the scope of the visual DAG in `shaders_lab`.

---

## Goal

Three complementary additions:

1. **`gl_compute_shader_cpp_gfx`**: loading and dispatch of compute shaders (GLSL `#version 430 core` + `layout(local_size_*) in`). The API mirrors `gl_shader`.
2. **`gl_pingpong_fbo_cpp_gfx`**: a wrapper around **two** `gl_framebuffer` objects with an A↔B swap after each step. It lets the output of frame N-1 serve as the input of frame N (feedback).
3. **DAG kind `Compute`**: a new `DagNodeKind::Compute` in `dag_catalog`/`dag_compile`. The node runs a compute shader over the current image, with a configurable number of iterations N.

Demo in `primitives_gallery`: reaction-diffusion (Gray-Scott) using a ping-pong FBO plus a compute shader.

## Context

`gl_shader` only supports fragment shaders. The current DAG is strictly a fragment pipeline (gen → op → blend). There is no way to:
- Persist state across frames (required for feedback effects).
- Do true GPGPU work (compute with buffers / images).
- Implement effects such as fluid simulation, particles, or multipass blur efficiently.

OpenGL compute shaders require `#version 430 core` (GL 4.3). The project already requires GL 3.3+; raising the minimum only for compute nodes is acceptable, with graceful degradation if the GPU does not support it.

## Architecture

```
cpp/functions/gfx/
├── gl_compute_shader.h          # NEW
├── gl_compute_shader.cpp        # NEW
├── gl_compute_shader.md         # NEW
├── gl_pingpong_fbo.h            # NEW
├── gl_pingpong_fbo.cpp          # NEW
├── gl_pingpong_fbo.md           # NEW
├── dag_types.h                  # MOD: add DagNodeKind::Compute
├── dag_catalog.cpp/.h           # MOD: register compute nodes (e.g. ReactionDiffusion)
├── dag_compile.cpp/.h           # MOD: emit compute step in the pipeline
└── dag_uniforms.cpp/.h          # MOD if compute nodes have params
cpp/apps/primitives_gallery/
├── demos_compute.cpp            # NEW
├── demos.h                      # MOD
├── main.cpp                     # MOD
└── CMakeLists.txt               # MOD
```

### Pure core / impure shell

- `gl_compute_shader` and `gl_pingpong_fbo`: both `purity: impure`, `kind: function`.
- `dag_catalog`/`dag_compile`: already `pure`; keep them that way.
- `dag_node_editor` / `dag_panel`: already `impure`; not touched in this issue (compute nodes show up automatically once they are added to the catalog).

### Proposed API

```cpp
namespace fn {

// gl_compute_shader.h
struct GlComputeProgram {
    GLuint program = 0;
    int local_x = 1, local_y = 1, local_z = 1;
    bool ok() const { return program != 0; }
};

GlComputeProgram gl_compute_compile(const char* glsl_body);  // body without the #version line
void gl_compute_destroy(GlComputeProgram&);

// Dispatch: bind images, set uniforms (optional callback), glDispatchCompute, memory barrier.
void gl_compute_dispatch(const GlComputeProgram&, int groups_x, int groups_y, int groups_z = 1);
// access is one of GL_READ_ONLY, GL_WRITE_ONLY, GL_READ_WRITE.
void gl_compute_bind_image(int unit, GLuint texture, GLenum access, GLenum format = GL_RGBA8);

const char* gl_compute_last_error();

// gl_pingpong_fbo.h: two same-sized RGBA8 FBOs with A/B swap.
struct PingpongFbo {
    Framebuffer a, b;
    bool a_is_front = true;  // front = last written
};

PingpongFbo pingpong_create(int w, int h);
void pingpong_destroy(PingpongFbo&);
void pingpong_resize(PingpongFbo&, int w, int h);
void pingpong_swap(PingpongFbo&);
const Framebuffer& pingpong_front(const PingpongFbo&);  // last written; read source
const Framebuffer& pingpong_back (const PingpongFbo&);  // next write target
}
```
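
The front/back convention in the API above can be illustrated without a GL context. The following is a minimal model under stated assumptions: `FakeFramebuffer`, `PingpongModel`, and the `model_*` functions are hypothetical stand-ins introduced only for this sketch, not the proposed implementation.

```cpp
#include <cassert>

// Hypothetical stand-in for fn::Framebuffer so the sketch runs without OpenGL.
struct FakeFramebuffer {
    unsigned color_tex = 0;
};

// Mirrors the proposed PingpongFbo layout: front = last written.
struct PingpongModel {
    FakeFramebuffer a, b;
    bool a_is_front = true;
};

// Front buffer: holds the most recent result and is the read source.
inline const FakeFramebuffer& model_front(const PingpongModel& p) {
    return p.a_is_front ? p.a : p.b;
}

// Back buffer: the next write target.
inline const FakeFramebuffer& model_back(const PingpongModel& p) {
    return p.a_is_front ? p.b : p.a;
}

// Called after a pass has written into the back buffer: it becomes the front.
inline void model_swap(PingpongModel& p) {
    p.a_is_front = !p.a_is_front;
    assert(&model_front(p) != &model_back(p));  // the two roles never alias
}
```

Keeping the swap as a single boolean flip means no texture is copied between passes; only the roles of the two buffers change.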

### DAG: new Compute kind

`dag_catalog` registers at least one example node: `compute_blur_separable` or `compute_reaction_diffusion`. Adding more nodes stays open after this issue (each one is a new `DagNode`).

`dag_compile`: when it emits a `Compute` step, it decides the bindings (image read = ping-front, image write = pong-back) and returns metadata so that the host can perform `dispatch + swap` correctly.
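
One way to picture the metadata a `Compute` step could return is a per-pass binding plan. The sketch below is an assumption-laden illustration: `ComputeBindingPlan` and `plan_compute_passes` are hypothetical names, and the real `dag_compile` output may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-pass metadata; field names are assumptions for this sketch.
struct ComputeBindingPlan {
    bool read_from_a;  // true: the read image is buffer A, false: buffer B
    bool write_to_a;   // true: the write image is buffer A, false: buffer B
};

// Plans `iterations` passes, given which buffer is currently the front.
// Each pass reads the front, writes the back, and then the roles swap.
inline std::vector<ComputeBindingPlan> plan_compute_passes(int iterations,
                                                           bool a_is_front) {
    assert(iterations >= 0);
    std::vector<ComputeBindingPlan> plan;
    plan.reserve(static_cast<std::size_t>(iterations));
    for (int i = 0; i < iterations; ++i) {
        plan.push_back({a_is_front, !a_is_front});
        a_is_front = !a_is_front;  // the buffer just written becomes the front
    }
    return plan;
}
```

Because the plan is plain data, it can be produced by pure code and checked in a test without a GPU, which fits the pure core / impure shell split.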

## Tasks

### Phase 1: gl_compute_shader

- 1.1 Implement a minimal wrapper. Detect GL version >= 4.3 at initialization; if it is lower, return `program == 0` and store the error "compute shaders require OpenGL 4.3+".
- 1.2 Helper `gl_compute_bind_image` built on `glBindImageTexture`.
- 1.3 `.md` with frontmatter (`purity: impure`, `kind: function`, `error_type`).
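
The version gate in 1.1 and the "body without version" contract can be sketched as pure helpers. This is a minimal sketch, assuming the caller obtains the major and minor version from the context (for example via `glGetIntegerv` with `GL_MAJOR_VERSION` and `GL_MINOR_VERSION`); `supports_compute`, `compute_support_error`, and `build_compute_source` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when the context version is at least 4.3.
inline bool supports_compute(int major, int minor) {
    assert(major >= 0 && minor >= 0);  // version numbers are never negative
    return major > 4 || (major == 4 && minor >= 3);
}

// Hypothetical helper: nullptr when compute is supported, otherwise the
// error text that the issue asks the wrapper to store.
inline const char* compute_support_error(int major, int minor) {
    return supports_compute(major, minor)
               ? nullptr
               : "compute shaders require OpenGL 4.3+";
}

// Hypothetical helper: prepend the version directive to a body that omits it.
inline std::string build_compute_source(const std::string& glsl_body) {
    return "#version 430 core\n" + glsl_body;
}
```

Separating the check from the GL calls keeps the failure path testable on machines without a 4.3 context.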

### Phase 2: gl_pingpong_fbo

- 2.1 Implement a wrapper that reuses the existing `gl_framebuffer`. Resize propagates to both buffers.
- 2.2 `.md` with frontmatter.

### Phase 3: DAG Compute kind

- 3.1 Add `DagNodeKind::Compute` to `dag_types.h`, plus serialization (JSON I/O).
- 3.2 `dag_catalog`: register the example node `compute_blur_2pass` (separable Gaussian blur, 2 dispatches selected via a `direction` uniform flag).
- 3.3 `dag_compile`: emit a compute step with enough metadata for the host (a new helper) to perform the dispatch.
- 3.4 Helper `dag_compute_run_step()`: takes a compute `DagStep`, a `PingpongFbo`, and uniforms, then performs `dispatch + swap`.

### Phase 4: Gallery demo

- 4.1 `demos_compute.cpp` with `demo_compute_reaction_diffusion()`: a 512×512 `PingpongFbo` plus a Gray-Scott compute shader with sliders (feed, kill, dt). Uses `ImGui::Image` to show the front buffer every frame. A "reset" button refills the buffer with the seed.
- 4.2 Register it in the gallery.
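
For reference, the per-cell update that a Gray-Scott shader performs can be written on the CPU. The sketch below uses the standard Gray-Scott equations with an explicit Euler step; `feed`, `kill`, and `dt` correspond to the sliders, while the diffusion rates `diff_u` and `diff_v` and all type and function names are assumptions that the issue does not specify.

```cpp
#include <cassert>

// Concentrations of the two chemicals at one cell.
struct GrayScottCell {
    float u, v;
};

// feed, kill, dt map to the demo sliders; diff_u and diff_v are assumed extras.
struct GrayScottParams {
    float feed, kill, dt;
    float diff_u, diff_v;
};

// One explicit Euler step for a single cell.
// lap_u and lap_v are the discrete Laplacians of u and v at this cell.
inline GrayScottCell gray_scott_step(GrayScottCell c, float lap_u, float lap_v,
                                     const GrayScottParams& p) {
    assert(p.dt > 0.0f);
    const float reaction = c.u * c.v * c.v;  // u + 2v -> 3v
    const float du = p.diff_u * lap_u - reaction + p.feed * (1.0f - c.u);
    const float dv = p.diff_v * lap_v + reaction - (p.feed + p.kill) * c.v;
    return {c.u + p.dt * du, c.v + p.dt * dv};
}
```

In the GPU version, the Laplacians would be computed from neighboring texels of the front buffer and the result stored into the back buffer, so each cell only ever reads the previous state.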

### Phase 5: Tests + docs

- 5.1 Test (Linux with a GPU available, opt-in via env var `FN_GPU_TESTS=1`): compiles a simple compute shader that writes `vec4(1,0,0,1)` and verifies the result via `glReadPixels`.
- 5.2 Pure test of `dag_compile` with a pipeline that mixes `Gen` + `Compute` nodes.
- 5.3 `./fn index` + `./fn show *_cpp_gfx` for the new entries.

## Usage example

```cpp
// Compile a compute shader that inverts the RGB channels of an image.
auto cs = fn::gl_compute_compile(R"glsl(
  layout(local_size_x=8, local_size_y=8) in;
  layout(rgba8, binding=0) uniform image2D src;
  layout(rgba8, binding=1) uniform image2D dst;
  void main() {
    ivec2 p = ivec2(gl_GlobalInvocationID.xy);
    vec4 c = imageLoad(src, p);
    imageStore(dst, p, vec4(1.0 - c.rgb, 1.0));
  }
)glsl");

auto pp = fn::pingpong_create(512, 512);
glUseProgram(cs.program);
// Read the last result (front) and write into the other buffer (back).
fn::gl_compute_bind_image(0, fn::pingpong_front(pp).color_tex, GL_READ_ONLY);
fn::gl_compute_bind_image(1, fn::pingpong_back (pp).color_tex, GL_WRITE_ONLY);
fn::gl_compute_dispatch(cs, 64, 64);  // 512 / 8 = 64 groups per axis
fn::pingpong_swap(pp);                // back becomes the new front
```
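
The literal `64` in the example comes from dividing the 512-pixel side by the local size of 8. For sizes that are not a multiple of the local size, the group count has to be rounded up. A minimal sketch, assuming a hypothetical helper named `group_count`:

```cpp
#include <cassert>

// Ceiling division: number of work groups needed so that
// groups * local_size covers at least `size` invocations.
inline int group_count(int size, int local_size) {
    assert(size > 0 && local_size > 0);
    return (size + local_size - 1) / local_size;
}
```

When the count is rounded up, some invocations fall outside the image, so the shader would also need a bounds check against `imageSize(dst)` before calling `imageStore`.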

## Design decisions

- **Raise the minimum to GL 4.3 only for Compute nodes**: the rest of the pipeline keeps working with 3.3. `gl_compute_compile` fails with a clear error on older GPUs.
- **Ping-pong outside the FBO**: a separate wrapper, not merged into `gl_framebuffer` (primitives stay atomic).
- **Open DAG catalog**: a single compute node in this issue. Others (fluid, particles) are added in later issues, reusing the wrappers.

## Risks

- **macOS does not support compute shaders**: its OpenGL implementation stops at 4.1, below the 4.3 that compute requires. Document the limitation. On macOS the Compute path stays inactive.
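- **Memory barriers**: forgetting `glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT)` produces garbage data. Encapsulate it in `gl_compute_dispatch`. If the result is later sampled as a regular texture (for example through `ImGui::Image`), `GL_TEXTURE_FETCH_BARRIER_BIT` is needed as well.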
- **Large dispatches block the UI**: document that dispatches over 1M pixels should be split up or use a smaller local_size.