feat: Python functions for datascience, finance, cybersecurity and pipelines
Datascience: aggregate_by_group, deduplicate_entities/relations, detect_drift, diff_entities/relations, extract_entities/relations_llm, hotness_score, melt, merge_graphs, pivot, build_entity/relation_schema_prompt.
Finance: avellaneda_stoikov_quotes, generate_gbm_prices, generate_taker_order, hawkes_intensity + finance.py module.
Cybersecurity: envelope_encrypt/decrypt + cybersecurity.py module.
Pipelines: extraction_pipeline, monte_carlo_market, run_market_sim.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,58 @@
---
name: diff_entities
kind: function
lang: py
domain: datascience
version: "1.0.0"
purity: pure
signature: "def diff_entities(before: list[dict], after: list[dict], key: str = 'id', ignore_fields: list[str] | None = None, compare_fields: list[str] | None = None) -> dict"
description: "Compares two snapshots of entities and returns field-by-field differences. Detects added, removed, modified and unchanged entities. Ignores created_at and updated_at by default."
tags: [diff, entities, snapshot, operations, comparison, datascience]
uses_functions: []
uses_types: []
returns: []
returns_optional: false
error_type: ""
imports: []
tested: true
tests:
  - "entity added"
  - "entity removed"
  - "entity modified with field-level detail"
  - "identical entities → unchanged"
  - "ignore_fields works"
  - "compare_fields filters correctly"
  - "empty list vs. populated list"
test_file_path: "python/functions/datascience/diff_entities_test.py"
file_path: "python/functions/datascience/diff_entities.py"
---

## Example

```python
before = [
    {"id": "1", "name": "Alice", "status": "active", "updated_at": "2024-01-01"},
    {"id": "2", "name": "Bob", "status": "active", "updated_at": "2024-01-01"},
]
after = [
    {"id": "1", "name": "Alice", "status": "inactive", "updated_at": "2024-01-02"},
    {"id": "3", "name": "Carol", "status": "active", "updated_at": "2024-01-02"},
]

result = diff_entities(before, after)
# result["added"] -> [{"id": "3", "name": "Carol", ...}]
# result["removed"] -> [{"id": "2", "name": "Bob", ...}]
# result["modified"] -> [{"key": "1", "changes": {"status": {"old": "active", "new": "inactive"}}}]
# result["unchanged"] -> 0
# result["summary"] -> "1 added, 1 removed, 1 modified, 0 unchanged"
```

## Notes

Pure function. It performs no I/O; it takes lists of dicts that are already loaded in memory.

The `key` field must be present in every entity; entities that lack it are silently skipped.

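The silent skipping of entities without `key` can be pictured with the sketch below. It is only an illustration under stated assumptions, not the code in `diff_entities.py`; the helper name `index_by_key` is hypothetical.

```python
from __future__ import annotations


def index_by_key(entities: list[dict], key: str = "id") -> dict:
    """Map each entity's key value to the entity itself (hypothetical helper)."""
    index = {}
    for entity in entities:
        if key not in entity:
            # Assumed behaviour: an entity lacking the key is dropped
            # without raising or logging anything.
            continue
        index[entity[key]] = entity
    return index


entities = [{"id": "1", "name": "Alice"}, {"name": "no id here"}]
indexed = index_by_key(entities)
# Only the entity that carries "id" ends up in the index.
```

Since skipped entities never reach the comparison, a caller that needs to know about them would probably have to validate the input before calling `diff_entities`.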
If `compare_fields` is given, it takes precedence over `ignore_fields`. This makes it possible to compare only a specific subset of fields without having to account for timestamp fields.

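One way to read this precedence rule is sketched below. The helper names `fields_to_compare` and `field_changes` are hypothetical, and the sketch assumes that an explicit `ignore_fields` replaces the default ignore list rather than extending it; the source does not say which of the two the real function does.

```python
from __future__ import annotations

# Defaults named in the description of diff_entities.
DEFAULT_IGNORED = ("created_at", "updated_at")


def fields_to_compare(old: dict, new: dict,
                      ignore_fields: list[str] | None = None,
                      compare_fields: list[str] | None = None) -> list[str]:
    """Pick the fields to diff for one pair of entities (hypothetical helper)."""
    if compare_fields is not None:
        # compare_fields wins: ignore_fields is not consulted at all.
        return list(compare_fields)
    # Assumption: an explicit ignore_fields replaces the default list.
    ignored = set(DEFAULT_IGNORED if ignore_fields is None else ignore_fields)
    return sorted((set(old) | set(new)) - ignored)


def field_changes(old: dict, new: dict, fields: list[str]) -> dict:
    """Return {"field": {"old": ..., "new": ...}} for every field that differs."""
    return {
        name: {"old": old.get(name), "new": new.get(name)}
        for name in fields
        if old.get(name) != new.get(name)
    }


old = {"id": "1", "status": "active", "updated_at": "2024-01-01"}
new = {"id": "1", "status": "inactive", "updated_at": "2024-01-02"}

default_changes = field_changes(old, new, fields_to_compare(old, new))
only_timestamp = fields_to_compare(old, new, compare_fields=["updated_at"])
```

With the defaults, only `status` is reported as changed; passing `compare_fields=["updated_at"]` brings the timestamp back into the comparison even though it is ignored by default.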
The order of `added` and `removed` is not guaranteed (it depends on set iteration order).
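Where a stable order matters, for instance in snapshot tests, the caller can sort the lists after the fact. The sketch below assumes that `added` and `removed` are lists of entity dicts, as in the example above, and that the key values are mutually comparable; `sort_by_key` is a hypothetical helper, not part of the module.

```python
from __future__ import annotations


def sort_by_key(entities: list[dict], key: str = "id") -> list[dict]:
    """Return the entities ordered by their key value (hypothetical helper)."""
    return sorted(entities, key=lambda entity: entity[key])


# A set difference yields the right members but in no particular order.
before_keys = {"1", "2", "4"}
after_keys = {"1", "3", "5"}
added = [{"id": k} for k in after_keys - before_keys]

# Sorting afterwards makes the result reproducible across runs.
stable_added = sort_by_key(added)
```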