perf(viz): graph_renderer Tier 1 (RGBA8 + orphan + frustum cull) + force_layout auto-pause helper

Issue 0049c. Three internal optimizations in graph_renderer.cpp plus a pure helper in graph_force_layout to detect convergence. The public API is unchanged: only the internal buffer layout, the shader, and the per-frame costs change.

1. RGBA8 color packing
   - The node instance buffer goes from (x,y,size,r,g,b,a) at 28B to (x,y,size,color_u32) at 16B (-43%). Edges: 24B → 12B/vertex (-50%).
   - Shaders unpack with bit shifts (compatible with GL 3.30+; no need for unpackUnorm4x8, which is 4.20+).
   - Exposed helpers: pack_rgba8 / unpack_rgba8 / modulate_alpha_rgba8 in graph_renderer.h.
   GraphNode.color and the palette already had the correct layout (R in the LSB), so the CPU now passes the uint32 straight through instead of converting to 4 floats per node per frame.

2. Capacity-tracked streaming buffers
   - Replaces the previous double glBufferData with:
       glBufferData(NULL, capacity, STREAM_DRAW)   // orphan + reserve
       glBufferSubData(0, used_bytes, data)        // only the bytes in use
   - capacity doubles when needed (initial 4096 nodes / 8192 edge vertices) → O(log N) reallocations.
   - CPU staging (NodeInstance* / EdgeVertex*) is reused across frames with realloc instead of malloc/free per frame.

3. Frustum cull (CPU-side)
   - Viewport AABB in world coords with a 10% margin.
   - Edges: skipped if the segment's AABB does not intersect the viewport.
   - Nodes: only the visible ones enter the instance buffer; visible_count is the N passed to glDrawArraysInstanced. Pop-in at the border is mitigated by the margin.

4. graph_force_layout_should_pause(low_frames, min_consecutive)
   - Pure helper: the caller maintains the counter, the function only decides whether to stop. Replaces the inline branch in demos_graph.cpp.
   - Catch2 test with synthetic sequences.

Tests: test_graph_pack_rgba8 (16401 asserts, 4 cases: exhaustive roundtrip + alpha modulation + clamp). test_graph_should_pause (3 cases, 14 asserts). All 29 tests in cpp/tests/ stay green (including test_visual with goldens).
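A minimal sketch of the CPU-side frustum cull from item 3, for illustration only. The camera model is an assumption not confirmed by this commit: (cam_x, cam_y) is taken to be the world-space point at the viewport center and cam_zoom to be pixels per world unit. The names Aabb, viewport_aabb, point_visible and segment_maybe_visible are hypothetical, and the 10% margin is assumed to widen each half-extent.

```cpp
#include <algorithm>
#include <cassert>

// Axis-aligned bounding box in world coordinates (hypothetical type).
struct Aabb { float min_x, min_y, max_x, max_y; };

// Assumed camera model: (cam_x, cam_y) is the viewport center in world space
// and cam_zoom is pixels per world unit. The margin widens each half-extent.
inline Aabb viewport_aabb(float cam_x, float cam_y, float cam_zoom,
                          int width, int height, float margin = 0.10f) {
    assert(cam_zoom > 0.0f);
    const float half_w = 0.5f * (float)width  / cam_zoom * (1.0f + margin);
    const float half_h = 0.5f * (float)height / cam_zoom * (1.0f + margin);
    return { cam_x - half_w, cam_y - half_h, cam_x + half_w, cam_y + half_h };
}

// Node test: the node center must fall inside the (already widened) viewport.
inline bool point_visible(const Aabb& v, float x, float y) {
    return x >= v.min_x && x <= v.max_x && y >= v.min_y && y <= v.max_y;
}

// Edge test: conservative, compares the segment's bounding box with the
// viewport. It may keep a diagonal edge that only clips the box corner, but
// it never drops an edge that actually crosses the viewport.
inline bool segment_maybe_visible(const Aabb& v,
                                  float x0, float y0, float x1, float y1) {
    const float lo_x = std::min(x0, x1), hi_x = std::max(x0, x1);
    const float lo_y = std::min(y0, y1), hi_y = std::max(y0, y1);
    return hi_x >= v.min_x && lo_x <= v.max_x &&
           hi_y >= v.min_y && lo_y <= v.max_y;
}
```

Under these assumptions, the renderer would copy a node into the instance buffer only when point_visible returns true, and emit an edge's two vertices only when segment_maybe_visible returns true.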
Version bumps:
- graph_renderer 1.1.0 → 1.2.0
- graph_force_layout 1.0.0 → 1.1.0 (tested: true via should_pause test)
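The auto-pause helper described in item 4 can be pictured roughly as follows. This is a sketch under assumptions, not the committed code: the commit message gives only the name and the two parameter names, so the int types, the bool return, and the >= comparison are guesses. should_pause_sketch and update_low_frames are hypothetical names, and "energy below a threshold" is just one possible definition of a low frame.

```cpp
#include <cassert>

// Hypothetical stand-in for graph_force_layout_should_pause. Pure function:
// it holds no state and only compares the caller's counter to the threshold.
inline bool should_pause_sketch(int low_frames, int min_consecutive) {
    assert(min_consecutive > 0);
    return low_frames >= min_consecutive;
}

// Caller-side bookkeeping (assumed): count consecutive frames whose motion
// metric is under the threshold, and reset as soon as one frame is not.
inline int update_low_frames(int low_frames, float energy, float threshold) {
    return energy < threshold ? low_frames + 1 : 0;
}
```

Keeping the counter in the caller is what makes the helper trivially testable with artificial sequences: a test only needs to feed integers, with no layout state or GL context involved.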
2026-04-29 22:17:13 +02:00
parent 0e6a013937
commit 02b4141cc1
12 changed files with 437 additions and 146 deletions
@@ -26,3 +26,33 @@ void graph_renderer_resize(GraphRenderer* r, int width, int height);
 // Returns OpenGL texture ID suitable for ImGui::Image().
 unsigned int graph_renderer_draw(GraphRenderer* r, const GraphData& graph,
                                  float cam_x, float cam_y, float cam_zoom);
+
+// ---------------------------------------------------------------------------
+// RGBA8 packing helpers
+// ---------------------------------------------------------------------------
+// Layout: byte 0 (LSB) = R, byte 1 = G, byte 2 = B, byte 3 (MSB) = A.
+// On a little-endian host this matches GLSL's `unpackUnorm4x8(uint)` which
+// returns vec4(byte0, byte1, byte2, byte3) / 255 — so the GPU reads it as
+// (R, G, B, A) without any swizzle.
+inline uint32_t pack_rgba8(uint8_t r, uint8_t g, uint8_t b, uint8_t a) {
+    return  (uint32_t)r
+          | ((uint32_t)g << 8)
+          | ((uint32_t)b << 16)
+          | ((uint32_t)a << 24);
+}
+
+inline void unpack_rgba8(uint32_t c, uint8_t& r, uint8_t& g, uint8_t& b, uint8_t& a) {
+    r = (uint8_t)( c        & 0xFF);
+    g = (uint8_t)((c >> 8 ) & 0xFF);
+    b = (uint8_t)((c >> 16) & 0xFF);
+    a = (uint8_t)((c >> 24) & 0xFF);
+}
+
+// Multiply alpha by `scale` (nominally [0..1]), rounding to nearest and clamping to [0, 255].
+inline uint32_t modulate_alpha_rgba8(uint32_t c, float scale) {
+    uint32_t a = (c >> 24) & 0xFFu;
+    float    af = (float)a * scale + 0.5f;
+    if (af < 0.0f) af = 0.0f;
+    if (af > 255.0f) af = 255.0f;
+    return (c & 0x00FFFFFFu) | ((uint32_t)af << 24);
+}
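
A short usage sketch for the three helpers added in this hunk. The helper bodies are copied from the diff so the block compiles on its own; half_alpha_of_opaque is a hypothetical function added purely for illustration and is not part of the commit.

```cpp
#include <cassert>
#include <cstdint>

// Helpers copied from the diff above.
inline uint32_t pack_rgba8(uint8_t r, uint8_t g, uint8_t b, uint8_t a) {
    return  (uint32_t)r
          | ((uint32_t)g << 8)
          | ((uint32_t)b << 16)
          | ((uint32_t)a << 24);
}

inline void unpack_rgba8(uint32_t c, uint8_t& r, uint8_t& g, uint8_t& b, uint8_t& a) {
    r = (uint8_t)( c        & 0xFF);
    g = (uint8_t)((c >> 8 ) & 0xFF);
    b = (uint8_t)((c >> 16) & 0xFF);
    a = (uint8_t)((c >> 24) & 0xFF);
}

inline uint32_t modulate_alpha_rgba8(uint32_t c, float scale) {
    uint32_t a = (c >> 24) & 0xFFu;
    float    af = (float)a * scale + 0.5f;
    if (af < 0.0f) af = 0.0f;
    if (af > 255.0f) af = 255.0f;
    return (c & 0x00FFFFFFu) | ((uint32_t)af << 24);
}

// Hypothetical usage: pack an opaque color, halve its alpha, and read the
// channels back. Returns the resulting alpha byte.
inline uint8_t half_alpha_of_opaque() {
    const uint32_t opaque = pack_rgba8(0x11, 0x22, 0x33, 0xFF);
    assert(opaque == 0xFF332211u);            // R sits in the low byte, A in the high byte

    const uint32_t faded = modulate_alpha_rgba8(opaque, 0.5f);
    uint8_t r, g, b, a;
    unpack_rgba8(faded, r, g, b, a);
    assert(r == 0x11 && g == 0x22 && b == 0x33);  // color bytes are untouched
    return a;                                 // 255 * 0.5 + 0.5 = 128.0, truncated to 128
}
```

Out-of-range scales are clamped rather than wrapped: a scale of 2.0 leaves an opaque color at alpha 255, and a negative scale yields alpha 0.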