Graphiti Node Summary Pipelines

This document outlines how entity summaries are generated today and the pragmatic changes proposed to gate expensive LLM calls using novelty and recency heuristics.

Current Execution Flow

+------------------+       +-------------------------------+
| Episode Ingest   | ----> | retrieve_episodes(last=3)     |
+------------------+       +-------------------------------+
            |                         |
            v                         v
+----------------------+       +---------------------------+
| extract_nodes        |       | extract_edges             |
|  (LLM + reflexion)   |       |  (LLM fact extraction)    |
+----------------------+       +---------------------------+
            |                         |
            | dedupe candidates       |
            v                         v
+--------------------------+   +---------------------------+
| resolve_extracted_nodes  |   | resolve_extracted_edges   |
|  (deterministic + LLM)   |   |  (LLM dedupe, invalidates)|
+--------------------------+   +---------------------------+
            |                         |
            +----------- parallel ----+
                        |
                        v
        +-----------------------------------------+
        | extract_attributes_from_nodes           |
        |  - LLM attribute fill (optional)        |
        |  - LLM summary refresh (always runs)    |
        +-----------------------------------------+

Current Data & Timing Characteristics

Previous context: only the latest EPISODE_WINDOW_LEN = 3 episodes are retrieved before any LLM call.
Fact availability: raw EntityEdge.fact strings exist immediately after edge extraction; embeddings are produced inside resolve_extracted_edges, concurrently with summary generation.
Summary inputs: the summary prompt only sees node.summary, node attributes, episode content, and the three-episode window. It does not observe resolved fact invalidations or final edge embeddings.
LLM usage: every node that survives dedupe invokes the summary prompt, even when the episode is low-information or repeat content.

Proposed Execution Flow

+------------------+       +-------------------------------+
| Episode Ingest   | ----> | retrieve_episodes(last=3)     |
+------------------+       +-------------------------------+
            |                         |
            v                         v
+----------------------+       +---------------------------+
| extract_nodes        |       | extract_edges             |
|  (LLM + reflexion)   |       |  (LLM fact extraction)    |
+----------------------+       +---------------------------+
            |                         |
            | dedupe candidates       |
            v                         v
+--------------------------+   +---------------------------+
| resolve_extracted_nodes  |   | build_node_deltas         |
|  (deterministic + LLM)   |   |  - new facts per node     |
+--------------------------+   |  - embed facts upfront    |
            |                 |  - track candidate flips   |
            |                 +-------------+-------------+
            |                               |
            |                               v
            |                 +---------------------------+
            |                 | resolve_extracted_edges   |
            |                 |  (uses prebuilt embeddings|
            |                 |   updates NodeDelta state)|
            |                 +-------------+-------------+
            |                               |
            |                               v
            |                 +---------------------------+
            |                 | summary_gate.should_refresh|
            |                 |  - fact hash drift         |
            |                 |  - embedding drift         |
            |                 |  - negation / invalidation |
            |                 |  - burst & staleness rules |
            |                 +------+------+------------+
            |                        |     |
            |                        |     +------------------------------+
            |                        |                                    |
            |                        v                                    v
            |         +---------------------------+      +-----------------------------+
            |         | skip summary (log cause)  |      | extract_summary LLM call    |
            |         | update metadata only      |      | update summary & metadata   |
            |         +---------------------------+      +-----------------------------+
            v
+-------------------------------+
| add_nodes_and_edges_bulk      |
+-------------------------------+

Key Proposed Changes

NodeDelta staging
- Generate fact embeddings immediately after edge extraction (create_entity_edge_embeddings) and group fact deltas by target node.
- Record potential invalidations and episode timestamps within the delta so the summary gate has full context.
Deterministic novelty checks
- Maintain a stable hash of active facts (new facts minus invalidated). Skip summarisation when the hash matches the stored value.
- Compare pooled fact embeddings to the persisted summary embedding; trigger refresh only when cosine drift exceeds a tuned threshold.
- Force refresh whenever the delta indicates polarity changes (contradiction/negation cues).
Recency & burst handling
- Track recent episode timestamps per node in metadata. If multiple episodes arrive within a short window, accumulate deltas and defer the summary until the burst ends or a hard cap is reached.
- Enforce a staleness SLA (last_summary_ts) so long-lived nodes eventually refresh even if novelty remains low.
Metadata persistence
- Persist gate state on each entity (_graphiti_meta: fact_hash, summary_embedding, last_summary_ts, _recent_episode_times, _burst_active).
- Update metadata whether the summary runs or is skipped to keep the gating logic deterministic across ingests.
Observability
- Emit counters for summary_skipped, summary_refreshed, and reasons (unchanged hash, low drift, burst deferral, staleness). Sample a small percentage of skipped cases to validate heuristics.

Implementation Snapshot

Area	Current	Proposed
Summary trigger	Always per deduped node	Controlled by `summary_gate` using fact hash, embedding drift, negation, burst, staleness
Fact embeddings	Produced during edge resolution (parallel)	Produced immediately after extraction and reused downstream
Fact availability in summaries	Not available	Encapsulated in `NodeDelta` passed into summary gate
Metadata on node	Summary text + organic attributes	Summary text + `_graphiti_meta` (hash, embedding, timestamps, burst state)
Recency handling	None	Deque of recent episode timestamps + burst deferral
Negation detection	LLM-only inside edge resolution	Propagated into gate to force summary refresh

These adjustments retain the existing inline execution model—no scheduled jobs—while reducing unnecessary LLM calls and improving determinism by grounding summaries in the same fact set that backs the graph.

7.3 KiB Raw Blame History