---
id: "0030"
title: "C++ reactive audio (capture + FFT + uniform feed + viz)"
status: pendiente
type: feature
domain:
  - cpp-stack
  - frontend
scope: multi-app
priority: media
depends: []
blocks: []
related: []
created: 2026-05-17
updated: 2026-05-17
tags: []
---

# 0030: C++ reactive audio (capture + FFT + uniform feed + viz)

## APP Metadata

| Field | Value |
|-------|-------|
| **ID** | 0030 |
| **Status** | pendiente (pending) |
| **Priority** | media (medium) |
| **Type** | feature: C++ multi-domain (core, gfx, viz) |

## Dependencies

`gl_loader_cpp_gfx`, `time_series_buffer_cpp_core`. Independent of the other issues.

**Unblocks:** audio-reactive shaders in `shaders_lab`, live dashboards with waveform/spectrum views, and visual debugging of audio pipelines.

---

## Goal

Five coherent primitives:

1. **`audio_capture_cpp_core`**: mic/loopback input via [miniaudio](https://github.com/mackron/miniaudio) (single-header, public domain). Lock-free ring buffer.
2. **`audio_fft_cpp_core`**: real FFT → magnitudes + smoothing (pure). Configurable size (256/512/1024/2048).
3. **`audio_uniform_feed_cpp_gfx`**: uploads the spectrum to the active shader as a `sampler1D` or as a 1×N 2D texture (`sampler2D`).
4. **`waveform_view_cpp_viz`**: draws raw samples as an ImPlot line plot.
5. **`spectrum_view_cpp_viz`**: draws FFT magnitudes as a logarithmic bar plot.

Demo in `primitives_gallery`: live input → FFT → dual visualization, plus an option to bind the spectrum to a reactive fullscreen shader.

## Context

`shaders_lab` has no dynamic external input: its visuals react only to `u_time` and `u_mouse`. Audio reactivity is one of the most common creative uses of shaders, and it enables striking demos with very little added code.

## Architecture

```
cpp/
├── vendor/miniaudio/
│   └── miniaudio.h                    # NEW (~80k LOC, public domain)
├── functions/core/
│   ├── audio_capture.h/.cpp/.md       # NEW (impure)
│   └── audio_fft.h/.cpp/.md           # NEW (pure)
├── functions/gfx/
│   └── audio_uniform_feed.h/.cpp/.md  # NEW (impure)
└── functions/viz/
    ├── waveform_view.h/.cpp/.md       # NEW (pure component)
    └── spectrum_view.h/.cpp/.md       # NEW (pure component)
cpp/apps/primitives_gallery/
├── demos_audio.cpp                    # NEW
├── demos.h                            # MOD
├── main.cpp                           # MOD
└── CMakeLists.txt                     # MOD
cpp/CMakeLists.txt                     # MOD
```

### Proposed API

```cpp
#include <vector>
// GLuint and ImVec2 come from the project's GL loader and Dear ImGui headers.

namespace fn {

// audio_capture (impure)
struct AudioCapture;
struct AudioCaptureConfig {
    int  sample_rate    = 48000;
    int  channels       = 1;      // mixed down to mono
    int  buffer_samples = 8192;   // ring buffer capacity
    bool loopback       = false;  // Windows / Linux (PulseAudio): capture the system output
};
AudioCapture* audio_capture_start(const AudioCaptureConfig&);
void          audio_capture_stop(AudioCapture*);
// Drains up to `out_capacity` float samples in [-1, 1]. Returns the number actually read.
int           audio_capture_read(AudioCapture*, float* out, int out_capacity);
const char*   audio_capture_last_error();

// audio_fft (pure)
struct FftResult {
    std::vector<float> magnitudes;  // size = fft_size / 2
    float dc = 0.f;
    float peak_freq_hz = 0.f;
};
// `prev` is only read (for frame-to-frame smoothing), so it is taken as const.
FftResult audio_fft_compute(const float* samples, int n, int sample_rate,
                            float smoothing = 0.7f, const FftResult* prev = nullptr);

// audio_uniform_feed (impure)
struct AudioTexture { GLuint tex = 0; int n = 0; };
AudioTexture audio_texture_create(int n);
void         audio_texture_destroy(AudioTexture&);
void         audio_texture_upload(AudioTexture&, const float* magnitudes, int n);
void         audio_texture_bind_uniform(GLuint program, const char* name, const AudioTexture&, int unit);

// waveform_view (pure component)
void waveform_view(const char* id, const float* samples, int n, ImVec2 size = {-1, 100});
// spectrum_view (pure component)
void spectrum_view(const char* id, const float* magnitudes, int n, int sample_rate, ImVec2 size = {-1, 150});

}  // namespace fn
```

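
The `channels = 1` default implies that multi-channel input is mixed down to mono before it reaches the ring buffer. A minimal sketch of that step, assuming interleaved float frames (the layout miniaudio hands to its data callback). The helper name `mixdown_to_mono` is hypothetical and not part of the proposed API:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the proposed API): averages each
// interleaved frame of `channels` samples into a single mono sample.
// `in` holds frame_count * channels floats; `out` receives frame_count floats.
inline void mixdown_to_mono(const float* in, std::size_t frame_count,
                            int channels, float* out) {
    assert(channels > 0);
    const float scale = 1.0f / static_cast<float>(channels);
    for (std::size_t f = 0; f < frame_count; ++f) {
        float sum = 0.0f;
        for (int c = 0; c < channels; ++c) {
            sum += in[f * static_cast<std::size_t>(channels) + c];
        }
        out[f] = sum * scale;  // averaging keeps the result inside [-1, 1]
    }
}
```

Averaging (rather than summing) is one possible choice; it avoids clipping at the cost of lowering the level when only one channel carries signal.
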
## Tasks

### Phase 1: Vendor

- 1.1 Add `cpp/vendor/miniaudio/miniaudio.h` (pinned to a commit). Create `cpp/vendor/miniaudio/miniaudio_impl.cpp` with `#define MINIAUDIO_IMPLEMENTATION` followed by `#include "miniaudio.h"`.
- 1.2 Add the source file to the CMakeLists.

### Phase 2: audio_capture

- 2.1 Implement the wrapper. Default backend per OS (ALSA/PulseAudio on Linux, WASAPI on Windows). Lock-free ring buffer (single-producer/single-consumer) that accumulates samples from the callback.
- 2.2 `loopback`: PulseAudio monitor source on Linux; WASAPI loopback on Windows. Document the limitations.
- 2.3 `.md` (`kind: function`, `purity: impure`, `error_type`).

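
A minimal sketch of the single-producer/single-consumer ring buffer from task 2.1, under a few stated assumptions: the class name `SpscRing` is hypothetical, the capacity is a power of two, and samples that do not fit are dropped (drop-newest). The issue leaves the actual drop policy open.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative SPSC ring buffer; `SpscRing` is a hypothetical name.
// Capacity must be a power of two so indices can wrap with a bit mask.
class SpscRing {
public:
    explicit SpscRing(std::size_t capacity_pow2)
        : buf_(capacity_pow2), mask_(capacity_pow2 - 1) {
        assert(capacity_pow2 >= 2 && (capacity_pow2 & mask_) == 0);
    }

    // Producer side (audio callback). Writes as many samples as fit and
    // returns that count; the rest are dropped (assumed drop-newest policy).
    std::size_t write(const float* in, std::size_t n) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        const std::size_t free_slots = buf_.size() - (head - tail);
        if (n > free_slots) n = free_slots;
        for (std::size_t i = 0; i < n; ++i) buf_[(head + i) & mask_] = in[i];
        head_.store(head + n, std::memory_order_release);
        return n;
    }

    // Consumer side (render thread). Drains up to `capacity` samples.
    std::size_t read(float* out, std::size_t capacity) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        std::size_t n = head - tail;
        if (n > capacity) n = capacity;
        for (std::size_t i = 0; i < n; ++i) out[i] = buf_[(tail + i) & mask_];
        tail_.store(tail + n, std::memory_order_release);
        return n;
    }

private:
    std::vector<float> buf_;
    std::size_t mask_;
    std::atomic<std::size_t> head_{0};  // total samples ever written
    std::atomic<std::size_t> tail_{0};  // total samples ever read
};
```

Each index is written by exactly one thread, so an acquire/release pair per operation is enough and no lock is taken inside the audio callback.
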
### Phase 3: audio_fft (pure)

- 3.1 Use a minimal FFT implementation (radix-2 Cooley-Tukey) or vendor [pffft](https://github.com/marton78/pffft) (BSD). Decide and document.
- 3.2 Apply a Hann window before the FFT.
- 3.3 Frame-to-frame exponential smoothing using `prev`.
- 3.4 Tests with a pure sine at a known frequency; `peak_freq_hz` must match to within one bin (`sample_rate / fft_size`).
- 3.5 `.md`.

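
The steps in this phase (Hann window, radix-2 Cooley-Tukey, magnitudes, exponential smoothing, peak frequency) can be sketched as follows. This is an illustration, not the final `audio_fft_compute`: the names `SpectrumSketch`, `fft_radix2` and `spectrum_sketch` are hypothetical, and both the `2 / n` magnitude normalisation and the smoothing rule `out = smoothing * prev + (1 - smoothing) * current` are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch only: type and function names are hypothetical, not the final API.
struct SpectrumSketch {
    std::vector<float> magnitudes;  // n / 2 bins; bin k sits at k * sample_rate / n
    float peak_freq_hz = 0.0f;
};

constexpr float kPi = 3.14159265358979323846f;

// In-place iterative radix-2 Cooley-Tukey FFT. a.size() must be a power of two.
inline void fft_radix2(std::vector<std::complex<float>>& a) {
    const std::size_t n = a.size();
    assert(n >= 2 && (n & (n - 1)) == 0);
    // Bit-reversal permutation.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }
    // Butterflies over doubling block lengths.
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const float ang = -2.0f * kPi / static_cast<float>(len);
        for (std::size_t i = 0; i < n; i += len) {
            for (std::size_t k = 0; k < len / 2; ++k) {
                const std::complex<float> w = std::polar(1.0f, ang * static_cast<float>(k));
                const std::complex<float> u = a[i + k];
                const std::complex<float> v = a[i + k + len / 2] * w;
                a[i + k] = u + v;
                a[i + k + len / 2] = u - v;
            }
        }
    }
}

// Hann window -> FFT -> normalised magnitudes -> optional exponential smoothing.
inline SpectrumSketch spectrum_sketch(const float* samples, std::size_t n, int sample_rate,
                                      float smoothing = 0.0f,
                                      const SpectrumSketch* prev = nullptr) {
    assert(n >= 4);
    std::vector<std::complex<float>> a(n);
    for (std::size_t i = 0; i < n; ++i) {
        const float hann = 0.5f - 0.5f * std::cos(2.0f * kPi * static_cast<float>(i) /
                                                   static_cast<float>(n - 1));
        a[i] = std::complex<float>(samples[i] * hann, 0.0f);
    }
    fft_radix2(a);

    SpectrumSketch out;
    out.magnitudes.resize(n / 2);
    const bool smooth = prev != nullptr && prev->magnitudes.size() == n / 2;
    for (std::size_t k = 0; k < n / 2; ++k) {
        float m = std::abs(a[k]) * 2.0f / static_cast<float>(n);
        if (smooth) m = smoothing * prev->magnitudes[k] + (1.0f - smoothing) * m;
        out.magnitudes[k] = m;
    }
    // Peak search skips bin 0 (DC).
    std::size_t peak = 1;
    for (std::size_t k = 2; k < n / 2; ++k) {
        if (out.magnitudes[k] > out.magnitudes[peak]) peak = k;
    }
    out.peak_freq_hz = static_cast<float>(peak) * static_cast<float>(sample_rate) /
                       static_cast<float>(n);
    return out;
}
```

For the test in 3.4, a sine whose frequency falls exactly on a bin centre (for example 3000 Hz at 48 kHz with 1024 samples, which is bin 64) makes the expected peak unambiguous.
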
### Phase 4: audio_uniform_feed

- 4.1 Create a 1D texture (or a 1×N 2D texture) with internal format `GL_R32F`. Upload with `glTexSubImage1D`/`glTexSubImage2D`.
- 4.2 `.md`.

### Phase 5: waveform_view + spectrum_view

- 5.1 `waveform_view`: line drawn with `ImPlot::PlotLine`. Y axis fixed to [-1, 1]. X axis = sample index.
- 5.2 `spectrum_view`: bar plot with a logarithmic x axis (Hz) and a y axis in dB (`20*log10(magnitude + eps)`).
- 5.3 `.md` for each.

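
The axis mapping in 5.2 reduces to three small pure functions. A sketch with hypothetical helper names (`magnitude_to_db`, `bin_to_hz`, `log_axis_position`); the default `eps` of `1e-6` is an assumption that puts the display floor at -120 dB:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for spectrum_view; names are illustrative only.

// Converts a linear magnitude to decibels. `eps` avoids log10(0).
inline float magnitude_to_db(float magnitude, float eps = 1e-6f) {
    return 20.0f * std::log10(magnitude + eps);
}

// Centre frequency in Hz of FFT bin `k` for an `fft_size`-point transform.
inline float bin_to_hz(int k, int fft_size, int sample_rate) {
    assert(fft_size > 0);
    return static_cast<float>(k) * static_cast<float>(sample_rate) /
           static_cast<float>(fft_size);
}

// Maps a frequency to [0, 1] on a log axis spanning [min_hz, max_hz].
inline float log_axis_position(float hz, float min_hz, float max_hz) {
    assert(min_hz > 0.0f && max_hz > min_hz);
    return std::log10(hz / min_hz) / std::log10(max_hz / min_hz);
}
```

Since `spectrum_view` receives `n = fft_size / 2` magnitudes, the bin frequency would be computed with `fft_size = 2 * n`. Bin 0 (0 Hz) has no position on a log axis and needs to be skipped or clamped to `min_hz`.
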
### Phase 6: Gallery demo

- 6.1 `demos_audio.cpp` with `demo_audio_reactive()`: start capture, run the FFT every frame, show two views (waveform + spectrum), and render a 256×256 fullscreen shader alongside them that samples the `audio_texture` and shifts colors according to the middle bin.
- 6.2 Start/stop button. Show the device in use.

### Phase 7: Tests + docs

- 7.1 FFT test with a known sine.
- 7.2 Ring buffer test (producer writes N, consumer reads, no losses).
- 7.3 Smoke test: capture for 100 ms, assert `read > 0`.
- 7.4 `./fn index` + `./fn show` for all 5.

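## Usage example

```cpp
#include <vector>

auto* cap = fn::audio_capture_start({});
fn::FftResult prev{};
std::vector<float> window;  // rolling window holding the latest 1024 samples

fn::run_app("audio", [&]{
    float buf[2048];
    int n = fn::audio_capture_read(cap, buf, 2048);
    // At 48 kHz and ~60 fps a read returns only ~800 samples, so accumulate
    // across frames instead of requiring 1024 fresh samples per frame.
    window.insert(window.end(), buf, buf + n);
    if (window.size() > 1024) window.erase(window.begin(), window.end() - 1024);
    if (window.size() == 1024) {
        prev = fn::audio_fft_compute(window.data(), 1024, 48000, 0.7f, &prev);
        fn::waveform_view("##wav", window.data(), 1024);
        fn::spectrum_view("##spec", prev.magnitudes.data(),
                          static_cast<int>(prev.magnitudes.size()), 48000);
    }
});
fn::audio_capture_stop(cap);
```
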
## Design decisions

- **miniaudio** vs PortAudio: miniaudio is header-only with no extra build step; it wins on ease of integration.
- **Own minimal FFT or pffft**: start with an in-house one (~50 LOC); performance is sufficient for 1024 samples at 60 Hz. If more speed is needed, swap in pffft.
- **`spectrum_view` and `waveform_view` are pure**: they do not touch the audio, they only draw.

## Risks

- **Mic permissions on macOS**: document the `NSMicrophoneUsageDescription` key in Info.plist if the app is distributed.
- **Loopback is not standardized across OSes**: document the supported platforms and degrade gracefully when it is unavailable.
- **Latency**: samples can pile up in the ring buffer when the callback produces them faster than the consumer drains them. Document the buffer size and the drop policy.
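
One possible consumer-side answer to the latency risk is to keep only the newest samples and discard the rest, so the views never lag behind the input by more than one window. A sketch under that assumption; `LatestWindow` is a hypothetical helper, not part of the proposed API:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical consumer-side helper: retains only the newest `size` samples.
class LatestWindow {
public:
    explicit LatestWindow(std::size_t size) : data_(size, 0.0f) { assert(size > 0); }

    // Appends `n` samples and discards the oldest ones, so the window always
    // holds the most recent `size` samples (zero-padded until it first fills).
    void push(const float* in, std::size_t n) {
        const std::size_t size = data_.size();
        if (n == 0) return;
        if (n >= size) {
            // The input alone fills the window: keep only its tail.
            std::copy(in + (n - size), in + n, data_.begin());
            return;
        }
        // Shift the surviving samples left, then append the new ones.
        std::copy(data_.begin() + static_cast<std::ptrdiff_t>(n), data_.end(), data_.begin());
        std::copy(in, in + n, data_.end() - static_cast<std::ptrdiff_t>(n));
    }

    const float* data() const { return data_.data(); }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<float> data_;
};
```

This is a drop-oldest policy: it favors low latency over completeness, which suits visualization but not recording.
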