* Add OpenTelemetry distributed tracing support
- Add tracer abstraction with no-op and OpenTelemetry implementations
- Instrument add_episode and add_episode_bulk with tracing spans
- Instrument LLM client with cache-aware tracing
- Add configurable span name prefix support
- Refactor add_episode methods to improve code quality
- Add OTEL_TRACING.md documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix linting errors in tracing implementation
- Remove unused episodes_by_uuid variable
- Fix tracer type annotations for context manager support
- Replace isinstance tuple with union syntax
- Use contextlib.suppress for exception handling
- Fix import ordering and use AbstractContextManager
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Address PR review feedback on tracing implementation
Critical fixes:
- Remove flawed error span creation in graphiti.py that created orphaned spans
- Restructure LLM client tracing to create span once at start, eliminating code duplication
- Initialize LLM client tracer to NoOpTracer by default to fix type checking
Enhancements:
- Add comprehensive span attributes to add_episode: reference_time, entity/edge type counts, previous episodes count, invalidated edge count, community count
- Optimize isinstance check for better performance
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add prompt name tracking to OpenTelemetry tracing spans
Add prompt_name parameter to all LLM client generate_response() methods
and set it as a span attribute in the llm.generate span. This enables
better observability by identifying which prompt template was used for
each LLM call.
Changes:
- Add prompt_name parameter to LLMClient.generate_response() base method
- Add prompt_name parameter and tracing to OpenAIBaseClient,
AnthropicClient, GeminiClient, and OpenAIGenericClient
- Update all 14 LLM call sites across maintenance operations to include
prompt_name:
- edge_operations.py: 4 calls
- node_operations.py: 6 calls (note: 7 listed but only 6 unique)
- temporal_operations.py: 2 calls
- community_operations.py: 2 calls
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix exception handling in add_episode to record errors in OpenTelemetry span
Moved try-except block inside the OpenTelemetry span context and added
proper error recording with span.set_status() and span.record_exception().
This ensures exceptions are captured in the distributed trace, matching
the pattern used in add_episode_bulk.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Refactor prompt structure: move MESSAGES after instructions
Reordered prompt structure in extract_nodes.py to place MESSAGES section
after instructions/guidelines in both extract_attributes and extract_summary
functions for improved prompt clarity.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add sentence-aware text truncator for entity summaries
- Created truncate_at_sentence() utility function that truncates text at
sentence boundaries while respecting max character limits
- Added MAX_SUMMARY_CHARS constant (250 chars) for entity summaries
- Applied truncator to entity summaries in prompts (extract_nodes.py)
- Applied truncator to LLM-generated summaries (node_operations.py)
- Added comprehensive test suite for truncation logic
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Clean up formatting in extract_attributes prompt
- Remove extra blank lines
- Fix indentation of MESSAGES tag
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Bump version to 0.22.0pre3
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Refactor node extraction for better maintainability
- Extract helper functions from extract_attributes_from_node to improve code organization
- Add _extract_entity_attributes, _extract_entity_summary, and _build_episode_context helpers
- Apply consistent formatting (double quotes per ruff configuration)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Apply consistent single quote style throughout node_operations
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* cleanup
* cleanup
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Bump version to 0.22.0pre0
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Add group_id parameter to get_extraction_language_instruction
Enable consumers to provide group-specific language extraction
instructions by passing group_id through the call chain.
Changes:
- Add optional group_id parameter to get_extraction_language_instruction()
- Add group_id parameter to all LLMClient.generate_response() methods
- Pass group_id through to language instruction function
- Maintain backward compatibility with default None value
Users can now customize extraction per group:
```python
def custom_instruction(group_id: str | None = None) -> str:
if group_id == 'spanish-users':
return '\n\nExtract in Spanish.'
return '\n\nExtract in original language.'
client.get_extraction_language_instruction = custom_instruction
```
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Pass group_id to generate_response in extraction operations
Thread group_id parameter through all extraction-related generate_response()
calls where it's naturally available (via episode.group_id or node.group_id).
This enables consumers to override get_extraction_language_instruction() with
group-specific language preferences.
Changes:
- edge_operations.py: Pass group_id in extract_edges()
- node_operations.py: Pass episode.group_id in extract_nodes() and
node.group_id in extract_attributes_from_node()
- node_operations.py: Add group_id parameter to extract_nodes_reflexion()
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix type inconsistency in extract_nodes_reflexion parameter
Change group_id parameter from str = '' to str | None = None to match
the pattern used throughout the codebase and align with the optional
nature of group_id in generate_response().
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove ensure_ascii parameter and uv.lock file
* Reset uv.lock to main branch version
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Remove ensure_ascii configuration parameter
- Changed to_prompt_json default from ensure_ascii=True to False
- Removed ensure_ascii parameter from Graphiti.__init__ and GraphitiClients
- Removed ensure_ascii from all function signatures and context dictionaries
- Removed ensure_ascii from all test files
- All JSON serialization now preserves Unicode characters by default
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* format
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix: Improve deduplication ID validation and logging
- Add comprehensive logging to verify IDs sent to LLM (sent vs received)
- Enhance prompt with explicit ID bounds (0 through N-1)
- Add validation warnings for missing and extra IDs from LLM responses
- Improve error message clarity for invalid dedupe IDs
- Log actual IDs sent to LLM to confirm no index leakage
This helps diagnose cases where the LLM returns IDs outside the valid
range (e.g., ID 19 when only 0-18 were sent).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Remove redundant logging parameter
Address reviewer comment about redundant third parameter in debug log statement.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Address reviewer comments on list slicing and prompt clarity
- Fix list slicing bug: change <= to < to avoid gap when exactly 20 elements
(previously would skip element 10 when showing 21 elements)
- Consolidate redundant prompt phrasing while maintaining clarity
(reduced from 3 sentences to 2, keeping essential constraints)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Remove redundant prompt text to reduce token usage
Consolidate 'using these exact IDs (0 through N-1)' with following sentence
to eliminate repetition. Changes:
- 'using these exact IDs (0 through {N-1}). Do not skip IDs or use IDs outside this range'
- 'with IDs 0 through {N-1}. Do not skip or add IDs'
Saves ~15 tokens per deduplication call.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
Add NodeSummaryFilter callback parameter to extract_attributes_from_nodes
and extract_attributes_from_node functions, allowing consumers to
selectively skip summary regeneration for specific nodes.
This enables downstream applications to implement custom logic for
throttling or filtering which nodes should have summaries regenerated,
reducing unnecessary LLM calls and token costs.
Key changes:
- Add NodeSummaryFilter type alias: Callable[[EntityNode], Awaitable[bool]]
- Update extract_attributes_from_nodes with optional should_summarize_node parameter
- Update extract_attributes_from_node with conditional summary generation logic
- Add 5 comprehensive test cases covering callback functionality
- Maintain full backwards compatibility (default None = all summaries generated)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
* Refactor deduplication logic to enhance node resolution and track duplicate pairs (#929)
* Simplify deduplication process in bulk_utils by reusing canonical nodes.
* Update dedup_helpers to store duplicate pairs during resolution.
* Modify node_operations to append duplicate pairs when resolving nodes.
* Add tests to verify deduplication behavior and ensure correct state updates.
* reveret to concurrent dedup with fanout and then reconcilation
* add performance note for deduplication loop in bulk_utils
* enhance deduplication logic in bulk_utils to handle missing canonical nodes gracefully
* Update graphiti_core/utils/bulk_utils.py
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
* refactor deduplication logic in bulk_utils to use directed union-find for canonical UUID resolution
* implement _build_directed_uuid_map for efficient UUID resolution in bulk_utils
* document directed union-find lookup in bulk_utils for clarity
---------
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
* add repository guidelines and project structure documentation
* update neo4j image version and modify test command to disable specific databases
* implement deduplication helpers and integrate with node operations
* refactor string formatting to use single quotes in node operations
* enhance deduplication helpers with UUID indexing and update resolution logic
* implement exact fact matching (#931)
* Add support for non-ASCII characters in LLM prompts
- Add ensure_ascii parameter to Graphiti class (default: True)
- Create to_prompt_json helper function for consistent JSON serialization
- Update all prompt files to use new helper function
- Preserve Korean/Japanese/Chinese characters when ensure_ascii=False
- Maintain backward compatibility with existing behavior
Fixes issue where non-ASCII characters were escaped as unicode sequences
in prompts, making them unreadable in LLM logs and potentially affecting
model understanding.
* Remove unused json imports after replacing with to_prompt_json helper
- Fix ruff lint errors (F401) for unused json imports
- All prompt files now use to_prompt_json helper instead of json.dumps
- Maintains clean code style and passes lint checks
* Fix ensure_ascii propagation to all LLM calls
- Add ensure_ascii parameter to maintenance operation functions that were missing it
- Update function signatures in node_operations, community_operations, temporal_operations, and edge_operations
- Ensure all llm_client.generate_response calls receive proper ensure_ascii context
- Fix hardcoded ensure_ascii: True values that prevented non-ASCII character preservation
- Maintain backward compatibility with default ensure_ascii=True
- Complete the fix for issue #804 ensuring Korean/Japanese/Chinese characters are properly handled in LLM prompts
* Bump version from 0.9.0 to 0.9.1 in pyproject.toml and update google-genai dependency to >=0.1.0
* Bump version from 0.9.1 to 0.9.2 in pyproject.toml
* Update google-genai dependency version to >=0.8.0 in pyproject.toml
* loc file
* Update pyproject.toml to version 0.9.3, restructure dependencies, and modify author format. Remove outdated Google API key note from README.md.
* upgrade poetry and ruff
* first cut
* Update dependencies and enhance README for optional LLM providers
- Bump aiohttp version from 3.11.14 to 3.11.16
- Update yarl version from 1.18.3 to 1.19.0
- Modify pyproject.toml to include optional extras for Anthropic, Groq, and Google Gemini
- Revise README.md to reflect new optional LLM provider installation instructions and clarify API key requirements
* Remove deprecated packages from poetry.lock and update content hash
- Removed cachetools, google-auth, google-genai, pyasn1, pyasn1-modules, rsa, and websockets from the lock file.
- Added new extras for anthropic, google-genai, and groq.
- Updated content hash to reflect changes.
* Refactor import paths for GeminiClient in README and __init__.py
- Updated import statement in README.md to reflect the new module structure for GeminiClient.
- Removed GeminiClient from the __all__ list in __init__.py as it is no longer directly imported.
* Refactor import paths for GeminiEmbedder in README and __init__.py
- Updated import statement in README.md to reflect the new module structure for GeminiEmbedder.
- Removed GeminiEmbedder and GeminiEmbedderConfig from the __all__ list in __init__.py as they are no longer directly imported.