feat(0133-1+2): columnar snapshot + string interning in data_table
Change 1 — Columnar Snapshot Internal: - Add ColumnSnapshot struct (type + str_ids/i64/f64 per column) in data_table_internal.h - Add SnapshotCache struct with pointer-identity sentinel (last_cells_ptr) - Add SnapshotCache field to UiState singleton - In render(): rebuild snapshot after join materialization when cells ptr changes Uses same pointer-identity pattern as existing stats_last_cells in State Int/Float columns parsed once via parse_number; String/Auto interned Change 2 — String Interning: - Add StringPool struct (strings + unordered_map<string_view, uint32_t>) to data_table_types.h - StringPool is per-State (NOT global) for table isolation - intern(sv) inserts if absent, returns stable uint32_t index - Cleared + rebuilt on each snapshot rebuild for index coherence - Add string_pool field to State struct Documentation: - Extended header comment in data_table_internal.h describing design, StringPool API, invariants (pointer-identity, row→snapshot_row), and how stats_last_cells and snapshot coexist independently Build: fn_module_data_table + tables_qa pass, no new errors (only pre-existing -Wformat-truncation warnings unrelated to this change). Public API (data_table.h, TableInput, render() signature) unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -8,6 +8,8 @@
|
||||
#include "compute_column_stats.h"
|
||||
|
||||
#include <string>
|
||||
#include <string_view>
|
||||
#include <unordered_map>
|
||||
#include <utility>
|
||||
#include <vector>
|
||||
|
||||
@@ -353,6 +355,44 @@ struct VizPanel {
|
||||
mutable ViewMode last_non_table = ViewMode::Bar;
|
||||
};
|
||||
|
||||
// ----------------------------------------------------------------------------
|
||||
// StringPool — interning de strings para columnas de texto (issue 0133).
|
||||
// Una instancia por State (NOT global) para aislar tablas independientes.
|
||||
//
|
||||
// intern(sv) devuelve un indice uint32_t estable para la vida del rebuild.
|
||||
// El pool se limpia (clear()) al inicio de cada rebuild de snapshot columnar.
|
||||
//
|
||||
// Invariante de invalidacion de string_view:
|
||||
// - El vector `strings` se reserva con reserve() ANTES del primer intern()
|
||||
// para evitar reallocs que invalidarian los string_view del mapa.
|
||||
// Si la estimacion es insuficiente (columna con mas unicos de lo esperado),
|
||||
// el mapa se reconstruye post-push_back: intern() verifica cap antes de
|
||||
// insertar en el map para cubrir este caso.
|
||||
// ----------------------------------------------------------------------------
|
||||
struct StringPool {
|
||||
std::vector<std::string> strings; // strings unicos, por indice
|
||||
std::unordered_map<std::string_view, uint32_t> index; // sv→id (sv apunta a strings[i])
|
||||
|
||||
void clear() {
|
||||
strings.clear();
|
||||
index.clear();
|
||||
}
|
||||
|
||||
// intern: inserta si no existe. Devuelve indice estable.
|
||||
uint32_t intern(std::string_view sv) {
|
||||
auto it = index.find(sv);
|
||||
if (it != index.end()) return it->second;
|
||||
uint32_t id = (uint32_t)strings.size();
|
||||
strings.emplace_back(sv);
|
||||
// Re-apuntar el string_view al almacenamiento interno (strings[id]).
|
||||
index.emplace(std::string_view(strings[id]), id);
|
||||
return id;
|
||||
}
|
||||
|
||||
const std::string& at(uint32_t id) const { return strings[id]; }
|
||||
bool empty() const { return strings.empty(); }
|
||||
};
|
||||
|
||||
// ----------------------------------------------------------------------------
|
||||
// State: stage pipeline + viz globales.
|
||||
// ----------------------------------------------------------------------------
|
||||
@@ -419,6 +459,11 @@ struct State {
|
||||
std::vector<DrillStep> drill_back;
|
||||
std::vector<DrillStep> drill_forward;
|
||||
|
||||
// String interning pool (issue 0133, Change 2).
|
||||
// Limpiado y repoblado en cada rebuild del snapshot columnar.
|
||||
// NOT global — una instancia por State para aislar tablas independientes.
|
||||
StringPool string_pool;
|
||||
|
||||
// Helpers (definidos en compute_stage.cpp).
|
||||
Stage& raw();
|
||||
const Stage& raw() const;
|
||||
|
||||
Reference in New Issue
Block a user