feat: Python functions for datascience, finance, cybersecurity, and pipelines

Datascience: aggregate_by_group, deduplicate_entities/relations, detect_drift, diff_entities/relations, extract_entities/relations_llm, hotness_score, melt, merge_graphs, pivot, build_entity/relation_schema_prompt.
Finance: avellaneda_stoikov_quotes, generate_gbm_prices, generate_taker_order, hawkes_intensity + finance.py module.
Cybersecurity: envelope_encrypt/decrypt + cybersecurity.py module.
Pipelines: extraction_pipeline, monte_carlo_market, run_market_sim.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,52 @@
---
name: diff_relations
kind: function
lang: py
domain: datascience
version: "1.0.0"
purity: pure
signature: "def diff_relations(before: list[dict], after: list[dict], key: tuple[str, str, str] = ('source_id', 'target_id', 'relation_type'), ignore_fields: list[str] | None = None, compare_fields: list[str] | None = None) -> dict"
description: "Compares relations between two snapshots using a composite key (source_id, target_id, relation_type). Detects added, removed, and modified relations with field-by-field detail."
tags: [diff, relations, graph, snapshot, operations, comparison, datascience]
uses_functions: []
uses_types: []
returns: []
returns_optional: false
error_type: ""
imports: []
tested: true
tests:
  - "relation added"
  - "relation removed"
  - "relation with modified metadata (same source/target/type, different weight)"
  - "composite key works correctly"
test_file_path: "python/functions/datascience/diff_relations_test.py"
file_path: "python/functions/datascience/diff_relations.py"
---

## Example

```python
before = [
    {"source_id": "A", "target_id": "B", "relation_type": "knows", "weight": 1.0},
    {"source_id": "B", "target_id": "C", "relation_type": "owns", "weight": 0.5},
]
after = [
    {"source_id": "A", "target_id": "B", "relation_type": "knows", "weight": 2.0},
    {"source_id": "C", "target_id": "D", "relation_type": "knows", "weight": 1.0},
]

result = diff_relations(before, after)
# result["added"] -> [{"source_id": "C", "target_id": "D", ...}]
# result["removed"] -> [{"source_id": "B", "target_id": "C", ...}]
# result["modified"] -> [{"key": "A|B|knows", "changes": {"weight": {"old": 1.0, "new": 2.0}}}]
# result["unchanged"] -> 0
```

## Notes

The composite key is serialized as `source_id|target_id|relation_type`. If any of the key fields is missing from a relation, an empty string is used in its place.
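
The serialization rule above can be sketched as follows. This is a minimal illustration, not the code in `diff_relations.py`: the helper name `relation_key` is hypothetical, and casting each value with `str()` is an assumption.

```python
def relation_key(
    relation: dict,
    key: tuple[str, str, str] = ("source_id", "target_id", "relation_type"),
) -> str:
    """Join the key fields with '|', substituting '' for any missing field.

    Hypothetical helper; str() casting of the values is an assumption.
    """
    return "|".join(str(relation.get(field, "")) for field in key)


full = relation_key({"source_id": "A", "target_id": "B", "relation_type": "knows"})
partial = relation_key({"source_id": "A", "relation_type": "knows"})
print(full)     # A|B|knows
print(partial)  # A||knows
```

One consequence of this scheme is that a field value that itself contains `|` could make two different relations serialize to the same key.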

Same semantics as `diff_entities_py_datascience`, but adapted for relations, which have no single unique ID: identity is defined by the three fields of the key.

A natural complement to `diff_entities_py_datascience` for comparing complete graphs between pipeline runs.