graphiti/SUMMARY_PIPELINE_STATE.md

# Graphiti Node Summary Pipelines

This document outlines how entity summaries are generated today and the pragmatic changes proposed to gate expensive LLM calls using novelty and recency heuristics.

## Current Execution Flow

```
+------------------+       +-------------------------------+
| Episode Ingest   | ----> | retrieve_episodes(last=3)     |
+------------------+       +-------------------------------+
            |                         |
            v                         v
+----------------------+       +---------------------------+
| extract_nodes        |       | extract_edges             |
|  (LLM + reflexion)   |       |  (LLM fact extraction)    |
+----------------------+       +---------------------------+
            |                         |
            | dedupe candidates       |
            v                         v
+--------------------------+   +---------------------------+
| resolve_extracted_nodes  |   | resolve_extracted_edges   |
|  (deterministic + LLM)   |   |  (LLM dedupe, invalidates)|
+--------------------------+   +---------------------------+
            |                         |
            +----------- parallel ----+
                        |
                        v
        +-----------------------------------------+
        | extract_attributes_from_nodes           |
        |  - LLM attribute fill (optional)        |
        |  - LLM summary refresh (always runs)    |
        +-----------------------------------------+
```

### Current Data & Timing Characteristics

- **Previous context**: only the latest `EPISODE_WINDOW_LEN = 3` episodes are retrieved before any LLM call.
- **Fact availability**: raw `EntityEdge.fact` strings exist immediately after edge extraction; embeddings are produced inside `resolve_extracted_edges`, concurrently with summary generation.
- **Summary inputs**: the summary prompt only sees `node.summary`, node attributes, episode content, and the three-episode window. It does *not* observe resolved fact invalidations or final edge embeddings.
- **LLM usage**: every node that survives dedupe invokes the summary prompt, even when the episode is low-information or repeat content.

## Proposed Execution Flow

```
+------------------+       +-------------------------------+
| Episode Ingest   | ----> | retrieve_episodes(last=3)     |
+------------------+       +-------------------------------+
            |                         |
            v                         v
+----------------------+       +---------------------------+
| extract_nodes        |       | extract_edges             |
|  (LLM + reflexion)   |       |  (LLM fact extraction)    |
+----------------------+       +---------------------------+
            |                         |
            | dedupe candidates       |
            v                         v
+--------------------------+   +---------------------------+
| resolve_extracted_nodes  |   | build_node_deltas         |
|  (deterministic + LLM)   |   |  - new facts per node     |
+--------------------------+   |  - embed facts upfront    |
            |                 |  - track candidate flips   |
            |                 +-------------+-------------+
            |                               |
            |                               v
            |                 +---------------------------+
            |                 | resolve_extracted_edges   |
            |                 |  (uses prebuilt embeddings|
            |                 |   updates NodeDelta state)|
            |                 +-------------+-------------+
            |                               |
            |                               v
            |                 +---------------------------+
            |                 | summary_gate.should_refresh|
            |                 |  - fact hash drift         |
            |                 |  - embedding drift         |
            |                 |  - negation / invalidation |
            |                 |  - burst & staleness rules |
            |                 +------+------+------------+
            |                        |     |
            |                        |     +------------------------------+
            |                        |                                    |
            |                        v                                    v
            |         +---------------------------+      +-----------------------------+
            |         | skip summary (log cause)  |      | extract_summary LLM call    |
            |         | update metadata only      |      | update summary & metadata   |
            |         +---------------------------+      +-----------------------------+
            v
+-------------------------------+
| add_nodes_and_edges_bulk      |
+-------------------------------+
```

### Key Proposed Changes

1. **NodeDelta staging**
   - Generate fact embeddings immediately after edge extraction (`create_entity_edge_embeddings`) and group fact deltas by target node.
   - Record potential invalidations and episode timestamps within the delta so the summary gate has full context.

2. **Deterministic novelty checks**
   - Maintain a stable hash of active facts (new facts minus invalidated). Skip summarisation when the hash matches the stored value.
   - Compare pooled fact embeddings to the persisted summary embedding; trigger refresh only when cosine drift exceeds a tuned threshold.
   - Force refresh whenever the delta indicates polarity changes (contradiction/negation cues).

3. **Recency & burst handling**
   - Track recent episode timestamps per node in metadata. If multiple episodes arrive within a short window, accumulate deltas and defer the summary until the burst ends or a hard cap is reached.
   - Enforce a staleness SLA (`last_summary_ts`) so long-lived nodes eventually refresh even if novelty remains low.

4. **Metadata persistence**
   - Persist gate state on each entity (`_graphiti_meta`: `fact_hash`, `summary_embedding`, `last_summary_ts`, `_recent_episode_times`, `_burst_active`).
   - Update metadata whether the summary runs or is skipped to keep the gating logic deterministic across ingests.

5. **Observability**
   - Emit counters for `summary_skipped`, `summary_refreshed`, and reasons (unchanged hash, low drift, burst deferral, staleness). Sample a small percentage of skipped cases to validate heuristics.

## Implementation Snapshot

| Area | Current | Proposed |
| ---- | ------- | -------- |
| Summary trigger | Always per deduped node | Controlled by `summary_gate` using fact hash, embedding drift, negation, burst, staleness |
| Fact embeddings | Produced during edge resolution (parallel) | Produced immediately after extraction and reused downstream |
| Fact availability in summaries | Not available | Encapsulated in `NodeDelta` passed into summary gate |
| Metadata on node | Summary text + organic attributes | Summary text + `_graphiti_meta` (hash, embedding, timestamps, burst state) |
| Recency handling | None | Deque of recent episode timestamps + burst deferral |
| Negation detection | LLM-only inside edge resolution | Propagated into gate to force summary refresh |

These adjustments retain the existing inline execution model—no scheduled jobs—while reducing unnecessary LLM calls and improving determinism by grounding summaries in the same fact set that backs the graph.