* Remove integration markers from database tests
Removed @pytest.mark.integration from database tests to allow them to run
while excluding API integration tests that call external services.
Database tests (now run):
- tests/test_edge_int.py
- tests/test_graphiti_int.py
- tests/test_node_int.py
- tests/test_entity_exclusion_int.py
- tests/cross_encoder/test_bge_reranker_client_int.py
- tests/driver/test_falkordb_driver.py
API integration tests (excluded):
- tests/llm_client/test_anthropic_client_int.py
- tests/utils/maintenance/test_temporal_operations_int.py
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Apply ruff formatting to falkordb driver and node queries
- Quote style fixes in falkordb_driver.py
- Trailing whitespace cleanup in node_db_queries.py
- Update uv.lock
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove api-integration-tests job from CI workflow
The api-integration-tests job has been removed since API integration tests
are now excluded via @pytest.mark.integration marker.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix database-integration-tests to run all database tests
Previously only ran test_graphiti_mock.py, now runs all database tests:
- tests/test_graphiti_mock.py
- tests/test_graphiti_int.py
- tests/test_node_int.py
- tests/test_edge_int.py
- tests/test_entity_exclusion_int.py
- tests/cross_encoder/test_bge_reranker_client_int.py
- tests/driver/test_falkordb_driver.py
The -m "not integration" filter excludes API integration tests that call
external services (Anthropic, OpenAI, etc).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Restore integration markers for tests that call LLM APIs
test_graphiti_int.py and test_entity_exclusion_int.py call graphiti.add_episode()
and graphiti.search_() which require LLM API calls, so they are API integration
tests, not pure database tests.
Final categorization:
Pure unit tests (no external dependencies):
- tests/llm_client/test_*.py (except test_anthropic_client_int.py)
- tests/embedder/test_*.py
- tests/utils/maintenance/test_*.py (except test_temporal_operations_int.py)
- tests/utils/search/search_utils_test.py
- tests/test_text_utils.py
Database tests (require Neo4j/FalkorDB, no API calls):
- tests/test_graphiti_mock.py
- tests/test_node_int.py
- tests/test_edge_int.py
- tests/cross_encoder/test_bge_reranker_client_int.py
- tests/driver/test_falkordb_driver.py
API integration tests (excluded via @pytest.mark.integration):
- tests/test_graphiti_int.py
- tests/test_entity_exclusion_int.py
- tests/llm_client/test_anthropic_client_int.py
- tests/utils/maintenance/test_temporal_operations_int.py
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Fix datetime comparison errors by normalizing to UTC
Applied ensure_utc() to all datetime comparisons in edge_operations.py to prevent TypeError when comparing timezone-naive and timezone-aware datetimes. Removed redundant tzinfo checks since ensure_utc() handles both None and naive datetimes.
Fixed comparisons at:
- Lines 419, 423: resolve_edge_contradictions function
- Line 430: resolve_edge_contradictions function
- Line 627: resolve_extracted_edge function (removed redundant tzinfo checks)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update uv.lock
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix sorting with mixed timezone-aware/naive datetimes
Normalize datetime to UTC in sort key to prevent TypeError when comparing mixed timezone-aware and timezone-naive datetimes during sorting.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
Changes to `to_prompt_json()` helper to default to minified JSON (no indentation) instead of 2-space indentation. This reduces token consumption in LLM prompts while maintaining all necessary information.
- Changed default `indent` parameter from `2` to `None` in `prompt_helpers.py`
- Updated all prompt modules to remove explicit `indent=2` arguments
- Minor code formatting fixes in LLM clients
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
* Add OpenTelemetry distributed tracing support
- Add tracer abstraction with no-op and OpenTelemetry implementations
- Instrument add_episode and add_episode_bulk with tracing spans
- Instrument LLM client with cache-aware tracing
- Add configurable span name prefix support
- Refactor add_episode methods to improve code quality
- Add OTEL_TRACING.md documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix linting errors in tracing implementation
- Remove unused episodes_by_uuid variable
- Fix tracer type annotations for context manager support
- Replace isinstance tuple with union syntax
- Use contextlib.suppress for exception handling
- Fix import ordering and use AbstractContextManager
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Address PR review feedback on tracing implementation
Critical fixes:
- Remove flawed error span creation in graphiti.py that created orphaned spans
- Restructure LLM client tracing to create span once at start, eliminating code duplication
- Initialize LLM client tracer to NoOpTracer by default to fix type checking
Enhancements:
- Add comprehensive span attributes to add_episode: reference_time, entity/edge type counts, previous episodes count, invalidated edge count, community count
- Optimize isinstance check for better performance
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add prompt name tracking to OpenTelemetry tracing spans
Add prompt_name parameter to all LLM client generate_response() methods
and set it as a span attribute in the llm.generate span. This enables
better observability by identifying which prompt template was used for
each LLM call.
Changes:
- Add prompt_name parameter to LLMClient.generate_response() base method
- Add prompt_name parameter and tracing to OpenAIBaseClient,
AnthropicClient, GeminiClient, and OpenAIGenericClient
- Update all 14 LLM call sites across maintenance operations to include
prompt_name:
- edge_operations.py: 4 calls
- node_operations.py: 6 calls (note: 7 listed but only 6 unique)
- temporal_operations.py: 2 calls
- community_operations.py: 2 calls
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix exception handling in add_episode to record errors in OpenTelemetry span
Moved try-except block inside the OpenTelemetry span context and added
proper error recording with span.set_status() and span.record_exception().
This ensures exceptions are captured in the distributed trace, matching
the pattern used in add_episode_bulk.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Refactor summary prompts to use character limit and prevent meta-commentary
- Changed summary length constraint from "8 sentences" to "250 characters" for more predictable output
- Created reusable summary_instructions snippet in snippets.py with clear BAD/GOOD examples
- Added explicit instruction to output only factual content without meta-commentary
- Applied consistent formatting across extract_nodes.py and summarize_nodes.py
- Bumped version to 0.22.0pre2
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add copyright header to snippets.py
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Enforce shorter summaries with 8 sentence limit
Replace 250-word limit with 8 sentence limit for node summaries to improve conciseness. Also update prompt system message for summarize_context to better reflect its dual purpose of generating summaries and attributes.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update graphiti_core/prompts/summarize_nodes.py
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
* Bump version to 0.22.0pre1
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update graphiti_core/prompts/summarize_nodes.py
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
* Refactor node extraction for better maintainability
- Extract helper functions from extract_attributes_from_node to improve code organization
- Add _extract_entity_attributes, _extract_entity_summary, and _build_episode_context helpers
- Apply consistent formatting (double quotes per ruff configuration)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Apply consistent single quote style throughout node_operations
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* cleanup
* cleanup
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Bump version to 0.22.0pre0
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Add group_id parameter to get_extraction_language_instruction
Enable consumers to provide group-specific language extraction
instructions by passing group_id through the call chain.
Changes:
- Add optional group_id parameter to get_extraction_language_instruction()
- Add group_id parameter to all LLMClient.generate_response() methods
- Pass group_id through to language instruction function
- Maintain backward compatibility with default None value
Users can now customize extraction per group:
```python
def custom_instruction(group_id: str | None = None) -> str:
if group_id == 'spanish-users':
return '\n\nExtract in Spanish.'
return '\n\nExtract in original language.'
client.get_extraction_language_instruction = custom_instruction
```
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Pass group_id to generate_response in extraction operations
Thread group_id parameter through all extraction-related generate_response()
calls where it's naturally available (via episode.group_id or node.group_id).
This enables consumers to override get_extraction_language_instruction() with
group-specific language preferences.
Changes:
- edge_operations.py: Pass group_id in extract_edges()
- node_operations.py: Pass episode.group_id in extract_nodes() and
node.group_id in extract_attributes_from_node()
- node_operations.py: Add group_id parameter to extract_nodes_reflexion()
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix type inconsistency in extract_nodes_reflexion parameter
Change group_id parameter from str = '' to str | None = None to match
the pattern used throughout the codebase and align with the optional
nature of group_id in generate_response().
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove ensure_ascii parameter and uv.lock file
* Reset uv.lock to main branch version
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix: Improve edge extraction entity ID validation
Fixes invalid entity ID references in edge extraction that caused warnings like:
"WARNING: source or target node not filled WILL_FIND. source_node_uuid: 23 and target_node_uuid: 3"
Changes:
- Format ENTITIES list as proper JSON in prompt for better LLM parsing
- Clarify field descriptions to reference entity id from ENTITIES list
- Add explicit entity ID validation as #1 extraction rule with examples
- Improve error logging (removed PII, added entity count and valid range)
These changes follow patterns from extract_nodes.py and dedupe_nodes.py where
entity referencing works reliably.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* wip
* fix: Align fact field naming and add description
- Change extraction rule to reference 'fact' instead of 'fact_text'
- Add descriptive text for fact field in Edge model
* fix: Remove ensure_ascii parameter from to_prompt_json call
Align with other to_prompt_json calls that don't use ensure_ascii
* fix: Use validated target_node_idx variable consistently
Line 190 was using raw edge_data.target_entity_id instead of the
validated target_node_idx variable, creating inconsistency with line 189
* fix: Improve edge extraction validation checks
- Add explicit check for empty nodes list
- Use more explicit 0 <= idx comparison instead of -1 < idx
- Prevents nonsensical error message when no entities provided
* chore: Restore uv.lock from main branch
Previously deleted in commit 7e4464b, now restored to match main branch state
* Update uv.lock
---------
Co-authored-by: Claude <noreply@anthropic.com>
* chore: Update edge extraction prompt to paraphrase instead of quote
- Changed instruction 5 to request paraphrasing rather than verbatim quoting
- Updated string quotes to use double quotes for consistency
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* chore: Format edge_operations.py and update lock file
- Minor formatting fix in edge_operations.py list comprehension
- Update uv.lock with version bump to 0.21.0rc8
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix: Prevent duplicate edge facts within same episode
This fixes three related bugs that allowed verbatim duplicate edge facts:
1. Fixed LLM deduplication: Changed related_edges_context to use integer
indices instead of UUIDs, matching the EdgeDuplicate model expectations.
2. Fixed batch deduplication: Removed episode skip in dedupe_edges_bulk
that prevented comparing edges from the same episode. Added self-comparison
guard to prevent edge from comparing against itself.
3. Added fast-path deduplication: Added exact string matching before parallel
processing in resolve_extracted_edges to catch within-episode duplicates
early, preventing race conditions where concurrent edges can't see each other.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* test: Add tests for edge deduplication fixes
Added three tests to verify the edge deduplication fixes:
1. test_dedupe_edges_bulk_deduplicates_within_episode: Verifies that
dedupe_edges_bulk now compares edges from the same episode after
removing the `if i == j: continue` check.
2. test_resolve_extracted_edge_uses_integer_indices_for_duplicates:
Validates that the LLM receives integer indices for duplicate
detection and correctly processes returned duplicate_facts.
3. test_resolve_extracted_edges_fast_path_deduplication: Confirms that
the fast-path exact string matching deduplicates identical edges
before parallel processing, preventing race conditions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Remove unused variables flagged by ruff
- Remove unused loop variable 'j' in bulk_utils.py
- Remove unused return value 'edges_by_episode' in test
- Replace unused 'edge_uuid' with '_' in test loop
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix: Add edge type validation based on node labels
- Add DEFAULT_EDGE_NAME constant for 'RELATES_TO'
- Implement pre-resolution validation to reset invalid edge names
- Add post-resolution validation for LLM-returned fact types
- Rename parameter from edge_types to edge_type_candidates for clarity
- Add comprehensive tests for validation scenarios
This ensures edges conform to edge_type_map constraints and prevents
misclassification when edge types don't match node label pairs.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* chore: Bump version to 0.30.0pre4
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* gpt-5-mini and gpt-5-nano default
* bump version
* remove unused imports
* linter
* update
* disable neptune errors while we get a fixture in place
* update pyright
* revert non-structured completions
* fix typo