126 lines
7.3 KiB
Markdown
126 lines
7.3 KiB
Markdown
# Graphiti Node Summary Pipelines
|
|
|
|
This document outlines how entity summaries are generated today and the pragmatic changes proposed to gate expensive LLM calls using novelty and recency heuristics.
|
|
|
|
## Current Execution Flow
|
|
|
|
```
|
|
+------------------+ +-------------------------------+
|
|
| Episode Ingest | ----> | retrieve_episodes(last=3) |
|
|
+------------------+ +-------------------------------+
|
|
| |
|
|
v v
|
|
+----------------------+ +---------------------------+
|
|
| extract_nodes | | extract_edges |
|
|
| (LLM + reflexion) | | (LLM fact extraction) |
|
|
+----------------------+ +---------------------------+
|
|
| |
|
|
| dedupe candidates |
|
|
v v
|
|
+--------------------------+ +---------------------------+
|
|
| resolve_extracted_nodes | | resolve_extracted_edges |
|
|
| (deterministic + LLM) | | (LLM dedupe, invalidates)|
|
|
+--------------------------+ +---------------------------+
|
|
| |
|
|
+----------- parallel ----+
|
|
|
|
|
v
|
|
+-----------------------------------------+
|
|
| extract_attributes_from_nodes |
|
|
| - LLM attribute fill (optional) |
|
|
| - LLM summary refresh (always runs) |
|
|
+-----------------------------------------+
|
|
```
|
|
|
|
### Current Data & Timing Characteristics
|
|
|
|
- **Previous context**: only the latest `EPISODE_WINDOW_LEN = 3` episodes are retrieved before any LLM call.
|
|
- **Fact availability**: raw `EntityEdge.fact` strings exist immediately after edge extraction; embeddings are produced inside `resolve_extracted_edges`, concurrently with summary generation.
|
|
- **Summary inputs**: the summary prompt only sees `node.summary`, node attributes, episode content, and the three-episode window. It does *not* observe resolved fact invalidations or final edge embeddings.
|
|
- **LLM usage**: every node that survives dedupe invokes the summary prompt, even when the episode is low-information or repeat content.
|
|
|
|
## Proposed Execution Flow
|
|
|
|
```
|
|
+------------------+ +-------------------------------+
|
|
| Episode Ingest | ----> | retrieve_episodes(last=3) |
|
|
+------------------+ +-------------------------------+
|
|
| |
|
|
v v
|
|
+----------------------+ +---------------------------+
|
|
| extract_nodes | | extract_edges |
|
|
| (LLM + reflexion) | | (LLM fact extraction) |
|
|
+----------------------+ +---------------------------+
|
|
| |
|
|
| dedupe candidates |
|
|
v v
|
|
+--------------------------+ +---------------------------+
|
|
| resolve_extracted_nodes | | build_node_deltas |
|
|
| (deterministic + LLM) | | - new facts per node |
|
|
+--------------------------+ | - embed facts upfront |
|
|
| | - track candidate flips |
|
|
| +-------------+-------------+
|
|
| |
|
|
| v
|
|
| +---------------------------+
|
|
| | resolve_extracted_edges |
|
|
| | (uses prebuilt embeddings|
|
|
| | updates NodeDelta state)|
|
|
| +-------------+-------------+
|
|
| |
|
|
| v
|
|
| +---------------------------+
|
|
| | summary_gate.should_refresh|
|
|
| | - fact hash drift |
|
|
| | - embedding drift |
|
|
| | - negation / invalidation |
|
|
| | - burst & staleness rules |
|
|
| +------+------+------------+
|
|
| | |
|
|
| | +------------------------------+
|
|
| | |
|
|
| v v
|
|
| +---------------------------+ +-----------------------------+
|
|
| | skip summary (log cause) | | extract_summary LLM call |
|
|
| | update metadata only | | update summary & metadata |
|
|
| +---------------------------+ +-----------------------------+
|
|
v
|
|
+-------------------------------+
|
|
| add_nodes_and_edges_bulk |
|
|
+-------------------------------+
|
|
```
|
|
|
|
### Key Proposed Changes
|
|
|
|
1. **NodeDelta staging**
|
|
- Generate fact embeddings immediately after edge extraction (`create_entity_edge_embeddings`) and group fact deltas by target node.
|
|
- Record potential invalidations and episode timestamps within the delta so the summary gate has full context.
|
|
|
|
2. **Deterministic novelty checks**
|
|
- Maintain a stable hash of active facts (new facts minus invalidated). Skip summarisation when the hash matches the stored value.
|
|
- Compare pooled fact embeddings to the persisted summary embedding; trigger refresh only when cosine drift exceeds a tuned threshold.
|
|
- Force refresh whenever the delta indicates polarity changes (contradiction/negation cues).
|
|
|
|
3. **Recency & burst handling**
|
|
- Track recent episode timestamps per node in metadata. If multiple episodes arrive within a short window, accumulate deltas and defer the summary until the burst ends or a hard cap is reached.
|
|
- Enforce a staleness SLA (`last_summary_ts`) so long-lived nodes eventually refresh even if novelty remains low.
|
|
|
|
4. **Metadata persistence**
|
|
- Persist gate state on each entity (`_graphiti_meta`: `fact_hash`, `summary_embedding`, `last_summary_ts`, `_recent_episode_times`, `_burst_active`).
|
|
- Update metadata whether the summary runs or is skipped to keep the gating logic deterministic across ingests.
|
|
|
|
5. **Observability**
|
|
- Emit counters for `summary_skipped`, `summary_refreshed`, and reasons (unchanged hash, low drift, burst deferral, staleness). Sample a small percentage of skipped cases to validate heuristics.
|
|
|
|
## Implementation Snapshot
|
|
|
|
| Area | Current | Proposed |
|
|
| ---- | ------- | -------- |
|
|
| Summary trigger | Always per deduped node | Controlled by `summary_gate` using fact hash, embedding drift, negation, burst, staleness |
|
|
| Fact embeddings | Produced during edge resolution (parallel) | Produced immediately after extraction and reused downstream |
|
|
| Fact availability in summaries | Not available | Encapsulated in `NodeDelta` passed into summary gate |
|
|
| Metadata on node | Summary text + organic attributes | Summary text + `_graphiti_meta` (hash, embedding, timestamps, burst state) |
|
|
| Recency handling | None | Deque of recent episode timestamps + burst deferral |
|
|
| Negation detection | LLM-only inside edge resolution | Propagated into gate to force summary refresh |
|
|
|
|
These adjustments retain the existing inline execution model—no scheduled jobs—while reducing unnecessary LLM calls and improving determinism by grounding summaries in the same fact set that backs the graph.
|