* Fix datetime comparison errors by normalizing to UTC
Applied ensure_utc() to all datetime comparisons in edge_operations.py to prevent TypeError when comparing timezone-naive and timezone-aware datetimes. Removed redundant tzinfo checks since ensure_utc() handles both None and naive datetimes.
Fixed comparisons at:
- Lines 419, 423: resolve_edge_contradictions function
- Line 430: resolve_edge_contradictions function
- Line 627: resolve_extracted_edge function (removed redundant tzinfo checks)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update uv.lock
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix sorting with mixed timezone-aware/naive datetimes
Normalize datetime to UTC in sort key to prevent TypeError when comparing mixed timezone-aware and timezone-naive datetimes during sorting.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
Changes to `to_prompt_json()` helper to default to minified JSON (no indentation) instead of 2-space indentation. This reduces token consumption in LLM prompts while maintaining all necessary information.
- Changed default `indent` parameter from `2` to `None` in `prompt_helpers.py`
- Updated all prompt modules to remove explicit `indent=2` arguments
- Minor code formatting fixes in LLM clients
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
* Add OpenTelemetry distributed tracing support
- Add tracer abstraction with no-op and OpenTelemetry implementations
- Instrument add_episode and add_episode_bulk with tracing spans
- Instrument LLM client with cache-aware tracing
- Add configurable span name prefix support
- Refactor add_episode methods to improve code quality
- Add OTEL_TRACING.md documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix linting errors in tracing implementation
- Remove unused episodes_by_uuid variable
- Fix tracer type annotations for context manager support
- Replace isinstance tuple with union syntax
- Use contextlib.suppress for exception handling
- Fix import ordering and use AbstractContextManager
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Address PR review feedback on tracing implementation
Critical fixes:
- Remove flawed error span creation in graphiti.py that created orphaned spans
- Restructure LLM client tracing to create span once at start, eliminating code duplication
- Initialize LLM client tracer to NoOpTracer by default to fix type checking
Enhancements:
- Add comprehensive span attributes to add_episode: reference_time, entity/edge type counts, previous episodes count, invalidated edge count, community count
- Optimize isinstance check for better performance
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add prompt name tracking to OpenTelemetry tracing spans
Add prompt_name parameter to all LLM client generate_response() methods
and set it as a span attribute in the llm.generate span. This enables
better observability by identifying which prompt template was used for
each LLM call.
Changes:
- Add prompt_name parameter to LLMClient.generate_response() base method
- Add prompt_name parameter and tracing to OpenAIBaseClient,
AnthropicClient, GeminiClient, and OpenAIGenericClient
- Update all 14 LLM call sites across maintenance operations to include
prompt_name:
- edge_operations.py: 4 calls
- node_operations.py: 6 calls (note: 7 listed but only 6 unique)
- temporal_operations.py: 2 calls
- community_operations.py: 2 calls
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix exception handling in add_episode to record errors in OpenTelemetry span
Moved try-except block inside the OpenTelemetry span context and added
proper error recording with span.set_status() and span.record_exception().
This ensures exceptions are captured in the distributed trace, matching
the pattern used in add_episode_bulk.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Refactor summary prompts to use character limit and prevent meta-commentary
- Changed summary length constraint from "8 sentences" to "250 characters" for more predictable output
- Created reusable summary_instructions snippet in snippets.py with clear BAD/GOOD examples
- Added explicit instruction to output only factual content without meta-commentary
- Applied consistent formatting across extract_nodes.py and summarize_nodes.py
- Bumped version to 0.22.0pre2
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add copyright header to snippets.py
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Enforce shorter summaries with 8 sentence limit
Replace 250-word limit with 8 sentence limit for node summaries to improve conciseness. Also update prompt system message for summarize_context to better reflect its dual purpose of generating summaries and attributes.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update graphiti_core/prompts/summarize_nodes.py
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
* Bump version to 0.22.0pre1
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update graphiti_core/prompts/summarize_nodes.py
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
* Refactor node extraction for better maintainability
- Extract helper functions from extract_attributes_from_node to improve code organization
- Add _extract_entity_attributes, _extract_entity_summary, and _build_episode_context helpers
- Apply consistent formatting (double quotes per ruff configuration)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Apply consistent single quote style throughout node_operations
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* cleanup
* cleanup
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Bump version to 0.22.0pre0
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Add group_id parameter to get_extraction_language_instruction
Enable consumers to provide group-specific language extraction
instructions by passing group_id through the call chain.
Changes:
- Add optional group_id parameter to get_extraction_language_instruction()
- Add group_id parameter to all LLMClient.generate_response() methods
- Pass group_id through to language instruction function
- Maintain backward compatibility with default None value
Users can now customize extraction per group:
```python
def custom_instruction(group_id: str | None = None) -> str:
if group_id == 'spanish-users':
return '\n\nExtract in Spanish.'
return '\n\nExtract in original language.'
client.get_extraction_language_instruction = custom_instruction
```
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Pass group_id to generate_response in extraction operations
Thread group_id parameter through all extraction-related generate_response()
calls where it's naturally available (via episode.group_id or node.group_id).
This enables consumers to override get_extraction_language_instruction() with
group-specific language preferences.
Changes:
- edge_operations.py: Pass group_id in extract_edges()
- node_operations.py: Pass episode.group_id in extract_nodes() and
node.group_id in extract_attributes_from_node()
- node_operations.py: Add group_id parameter to extract_nodes_reflexion()
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix type inconsistency in extract_nodes_reflexion parameter
Change group_id parameter from str = '' to str | None = None to match
the pattern used throughout the codebase and align with the optional
nature of group_id in generate_response().
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove ensure_ascii parameter and uv.lock file
* Reset uv.lock to main branch version
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix: Improve edge extraction entity ID validation
Fixes invalid entity ID references in edge extraction that caused warnings like:
"WARNING: source or target node not filled WILL_FIND. source_node_uuid: 23 and target_node_uuid: 3"
Changes:
- Format ENTITIES list as proper JSON in prompt for better LLM parsing
- Clarify field descriptions to reference entity id from ENTITIES list
- Add explicit entity ID validation as #1 extraction rule with examples
- Improve error logging (removed PII, added entity count and valid range)
These changes follow patterns from extract_nodes.py and dedupe_nodes.py where
entity referencing works reliably.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* wip
* fix: Align fact field naming and add description
- Change extraction rule to reference 'fact' instead of 'fact_text'
- Add descriptive text for fact field in Edge model
* fix: Remove ensure_ascii parameter from to_prompt_json call
Align with other to_prompt_json calls that don't use ensure_ascii
* fix: Use validated target_node_idx variable consistently
Line 190 was using raw edge_data.target_entity_id instead of the
validated target_node_idx variable, creating inconsistency with line 189
* fix: Improve edge extraction validation checks
- Add explicit check for empty nodes list
- Use more explicit 0 <= idx comparison instead of -1 < idx
- Prevents nonsensical error message when no entities provided
* chore: Restore uv.lock from main branch
Previously deleted in commit 7e4464b, now restored to match main branch state
* Update uv.lock
---------
Co-authored-by: Claude <noreply@anthropic.com>
* chore: Update edge extraction prompt to paraphrase instead of quote
- Changed instruction 5 to request paraphrasing rather than verbatim quoting
- Updated string quotes to use double quotes for consistency
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* chore: Format edge_operations.py and update lock file
- Minor formatting fix in edge_operations.py list comprehension
- Update uv.lock with version bump to 0.21.0rc8
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix: Prevent duplicate edge facts within same episode
This fixes three related bugs that allowed verbatim duplicate edge facts:
1. Fixed LLM deduplication: Changed related_edges_context to use integer
indices instead of UUIDs, matching the EdgeDuplicate model expectations.
2. Fixed batch deduplication: Removed episode skip in dedupe_edges_bulk
that prevented comparing edges from the same episode. Added self-comparison
guard to prevent edge from comparing against itself.
3. Added fast-path deduplication: Added exact string matching before parallel
processing in resolve_extracted_edges to catch within-episode duplicates
early, preventing race conditions where concurrent edges can't see each other.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* test: Add tests for edge deduplication fixes
Added three tests to verify the edge deduplication fixes:
1. test_dedupe_edges_bulk_deduplicates_within_episode: Verifies that
dedupe_edges_bulk now compares edges from the same episode after
removing the `if i == j: continue` check.
2. test_resolve_extracted_edge_uses_integer_indices_for_duplicates:
Validates that the LLM receives integer indices for duplicate
detection and correctly processes returned duplicate_facts.
3. test_resolve_extracted_edges_fast_path_deduplication: Confirms that
the fast-path exact string matching deduplicates identical edges
before parallel processing, preventing race conditions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Remove unused variables flagged by ruff
- Remove unused loop variable 'j' in bulk_utils.py
- Remove unused return value 'edges_by_episode' in test
- Replace unused 'edge_uuid' with '_' in test loop
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix: Add edge type validation based on node labels
- Add DEFAULT_EDGE_NAME constant for 'RELATES_TO'
- Implement pre-resolution validation to reset invalid edge names
- Add post-resolution validation for LLM-returned fact types
- Rename parameter from edge_types to edge_type_candidates for clarity
- Add comprehensive tests for validation scenarios
This ensures edges conform to edge_type_map constraints and prevents
misclassification when edge types don't match node label pairs.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* chore: Bump version to 0.30.0pre4
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* gpt-5-mini and gpt-5-nano default
* bump version
* remove unused imports
* linter
* update
* disable neptune errors while we get a fixture in place
* update pyright
* revert non-structured completions
* fix typo
* fix: remove global DEFAULT_DATABASE usage in favor of driver-specific
config
Fixes bugs introduced in PR #607. This removes reliance on the global
DEFAULT_DATABASE environment variable. It specifies the database within
each driver. PR #607 introduced a Neo4j compatability, as the database
names are different when attempting to support FalkorDB.
This refactor improves compatability across database types and ensures
future reliance by isolating the configuraiton to the driver level.
* fix: make falkordb support optional
This ensures that the the optional dependency and subsequent import is compliant with the graphiti-core project dependencies.
* chore: fmt code
* chore: undo changes to uv.lock
* fix: undo potentially breaking changes to drive interface
* fix: ensure a default database of "None" is provided - falling back to internal default
* chore: ensure default value exists for session and delete_all_indexes
* chore: fix typos and grammar
* chore: update package versions and dependencies in uv.lock and bulk_utils.py
* docs: update database configuration instructions for Neo4j and FalkorDB
Clarified default database names and how to override them in driver constructors. Updated testing requirements to include specific commands for running integration and unit tests.
* fix: ensure params defaults to an empty dictionary in Neo4jDriver
Updated the execute_query method to initialize params as an empty dictionary if not provided, ensuring compatibility with the database configuration.
---------
Co-authored-by: Urmzd <urmzd@dal.ca>