fix(crash): init data_table State.col_visible + col_order before render

Pressing Refresh Agents (or Test Connection — both trigger fetch + table
re-render) crashed the app with Windows exit code 5 (access violation).

Root cause: agents_tbl_state was default-constructed, so
State.col_visible (std::vector<bool>) and State.col_order
(std::vector<int>) were empty. render_grid_stage0 indexes them by column
index up to N_COLS=11 without bounds checking → undefined behaviour →
segfault on the first render after agents data populated.

Fix: on the first render of the agents panel, set col_visible to N_COLS
true values, fill col_order with [0..N_COLS), and ensure
stages.size() >= 1. This is the same pattern tql_apply.cpp uses
(col_visible.assign(eff_cols, true)).
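
A minimal sketch of that initialization, assuming a simplified State that
models only the two members named above. The struct layout and the helper
name ensure_column_state are hypothetical, not taken from the source, and
the stages handling is left out because its element type is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the data_table State; only the two members
// named in the commit message are modelled.
struct State {
    std::vector<bool> col_visible;  // one visibility flag per column
    std::vector<int>  col_order;    // display position -> column index
};

constexpr std::size_t N_COLS = 11;  // column count quoted in the message

// Illustrative helper (assumed name): size both vectors before the first
// render so that indexing by column index stays in bounds. Vectors that
// already have the right size are left untouched, so later user changes
// to visibility or ordering are not reset.
inline void ensure_column_state(State& st, std::size_t n_cols) {
    if (st.col_visible.size() != n_cols) {
        st.col_visible.assign(n_cols, true);  // every column visible
    }
    if (st.col_order.size() != n_cols) {
        st.col_order.resize(n_cols);
        std::iota(st.col_order.begin(), st.col_order.end(), 0);  // 0..n_cols-1
    }
    assert(st.col_visible.size() == n_cols);
    assert(st.col_order.size() == n_cols);
}
```

Calling ensure_column_state(agents_tbl_state, N_COLS) once before the grid
is rendered would be enough in this sketch, since repeated calls are no-ops.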

Diagnostic infra added (kept in place — minimal overhead):
- FN_DBG macro: fprintf(stderr, ...) + fflush. Survives crashes that
  fn_log's buffered file output doesn't.
- --auto-refresh CLI flag: triggers fetch_agents_async at frame 30,
  auto-exits at frame 180 (~3s @ 60Hz). Headless smoke test for CI.
- DBG breadcrumbs through main → load_apikey → fn::run_app → render →
  fetch_agents_async (thread enter/request/response/parse/exit) → render
  table (pre/post). Each step flushes stderr immediately.
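
The first two items can be sketched roughly as follows. This is an
assumption about their shape, not the code in this commit: the "[DBG] "
prefix, the AutoAction enum, and the helper auto_refresh_action are
hypothetical names; only the fprintf + fflush pairing and the frame
numbers 30 and 180 come from the description above.

```cpp
#include <cstdio>

// Assumed shape of FN_DBG: write to stderr and flush right away, so the
// line is already out of the process if the next statement crashes.
#define FN_DBG(...)                          \
    do {                                     \
        std::fprintf(stderr, "[DBG] ");      \
        std::fprintf(stderr, __VA_ARGS__);   \
        std::fputc('\n', stderr);            \
        std::fflush(stderr);                 \
    } while (0)

// Hypothetical per-frame decision for --auto-refresh.
enum class AutoAction { None, Refresh, Exit };

constexpr int kRefreshFrame = 30;   // frame at which the fetch is triggered
constexpr int kExitFrame    = 180;  // ~3s at 60Hz, then the app exits

inline AutoAction auto_refresh_action(bool enabled, int frame) {
    if (!enabled) return AutoAction::None;
    if (frame == kRefreshFrame) return AutoAction::Refresh;
    if (frame >= kExitFrame) return AutoAction::Exit;
    return AutoAction::None;
}
```

In a render loop, the Refresh result would call fetch_agents_async and the
Exit result would break out of the loop with exit code 0, with an FN_DBG
breadcrumb before and after each step.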

E2E regression guard: test_app_survives_auto_refresh_cycle. Runs the .exe
with --auto-refresh, asserts exit 0, asserts the breadcrumb chain reaches
both "fetch thread parsed" and "agents_panel POST-render" in stderr. 25
tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 23:31:25 +02:00
parent 18b5ffdfd9
commit 9cade2f2f8
2 changed files with 126 additions and 16 deletions
@@ -82,6 +82,41 @@ def test_connect_succeeds_with_valid_apikey():
    assert n > 0, f"expected at least 1 agent, got {n}"


def test_app_survives_auto_refresh_cycle():
    """Regression: app must NOT crash on Refresh Agents button click.

    Bug history: v0.2 migration to data_table_cpp_viz left State.col_visible
    and State.col_order empty (default-constructed), so render_grid_stage0
    indexed into an empty std::vector<bool>, causing an access violation
    (Windows exit code 5).

    The --auto-refresh CLI flag triggers fetch_agents_async + a full render
    cycle from a headless GLFW window, then exits at frame 180 (~3s @ 60Hz).
    Exit 0 means the agents panel rendered the live data without crashing.
    """
    pass_check = subprocess.run(["pass", "agentes/api-key"],
                                capture_output=True, text=True, timeout=5)
    if pass_check.returncode != 0 or not pass_check.stdout.strip():
        pytest.skip("pass agentes/api-key not readable (GPG locked?)")

    # WSL → Windows: launch the .exe and let it self-exit after 180 frames.
    r = subprocess.run(
        [str(_exe()), "--auto-refresh"],
        capture_output=True, text=True, timeout=30,
    )
    assert r.returncode == 0, (
        f"app crashed (exit={r.returncode}); last stderr:\n"
        + "\n".join(r.stderr.splitlines()[-20:])
    )
    # Sanity: stderr must show that fetch_agents reached the parse step.
    assert "fetch thread parsed" in r.stderr, (
        f"fetch never reached parse; stderr:\n{r.stderr[-1000:]}"
    )
    # Sanity: render must have completed at least once (POST-render logged).
    assert "agents_panel POST-render" in r.stderr, (
        f"render_grid_stage0 crashed before completing; stderr:\n{r.stderr[-1000:]}"
    )


def test_connect_falls_back_to_pass_when_env_empty():
    """When AGENTS_API_KEY env is empty, the .exe must fetch apikey via
    `wsl.exe pass agentes/api-key` (or `pass` on Linux). This is what makes