* Remove ensure_ascii configuration parameter
- Changed to_prompt_json default from ensure_ascii=True to False
- Removed ensure_ascii parameter from Graphiti.__init__ and GraphitiClients
- Removed ensure_ascii from all function signatures and context dictionaries
- Removed ensure_ascii from all test files
- All JSON serialization now preserves Unicode characters by default
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* format
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix: Prevent duplicate edge facts within same episode
This fixes three related bugs that allowed verbatim duplicate edge facts:
1. Fixed LLM deduplication: Changed related_edges_context to use integer
indices instead of UUIDs, matching the EdgeDuplicate model expectations.
2. Fixed batch deduplication: Removed episode skip in dedupe_edges_bulk
that prevented comparing edges from the same episode. Added self-comparison
guard to prevent edge from comparing against itself.
3. Added fast-path deduplication: Added exact string matching before parallel
processing in resolve_extracted_edges to catch within-episode duplicates
early, preventing race conditions where concurrent edges can't see each other.
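The fast path in item 3 can be sketched roughly as follows. This is a minimal illustration, not the code in resolve_extracted_edges: the `Edge` dataclass, its field names, and `dedupe_exact` are hypothetical, and the whitespace/case normalization is an added assumption on top of the exact string match described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    # Hypothetical stand-in for an extracted edge; field names are assumptions.
    source_uuid: str
    target_uuid: str
    fact: str


def _normalize(text: str) -> str:
    # Assumption: collapse whitespace and ignore case before comparing.
    return " ".join(text.split()).lower()


def dedupe_exact(edges: list[Edge]) -> list[Edge]:
    """Keep the first edge seen for each (source, target, normalized fact) key."""
    seen: set[tuple[str, str, str]] = set()
    unique: list[Edge] = []
    for edge in edges:
        key = (edge.source_uuid, edge.target_uuid, _normalize(edge.fact))
        if key in seen:
            continue  # verbatim duplicate within the same batch
        seen.add(key)
        unique.append(edge)
    return unique
```

Running a pass like this sequentially, before any concurrent resolution starts, means identical edges never reach the parallel stage where they could not see each other.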
Co-Authored-By: Claude <noreply@anthropic.com>
* test: Add tests for edge deduplication fixes
Added three tests to verify the edge deduplication fixes:
1. test_dedupe_edges_bulk_deduplicates_within_episode: Verifies that
dedupe_edges_bulk now compares edges from the same episode after
removing the `if i == j: continue` check.
2. test_resolve_extracted_edge_uses_integer_indices_for_duplicates:
Validates that the LLM receives integer indices for duplicate
detection and correctly processes returned duplicate_facts.
3. test_resolve_extracted_edges_fast_path_deduplication: Confirms that
the fast-path exact string matching deduplicates identical edges
before parallel processing, preventing race conditions.
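The integer-index scheme checked by the second test might look something like the sketch below. The function names and the `idx`/`fact` keys are hypothetical; only `duplicate_facts` is taken from the description above.

```python
def build_related_edges_context(facts: list[str]) -> list[dict]:
    """Label each candidate fact with its list position instead of a UUID."""
    return [{"idx": idx, "fact": fact} for idx, fact in enumerate(facts)]


def resolve_duplicate_uuids(uuids: list[str], duplicate_facts: list[int]) -> list[str]:
    """Translate indices returned by the model back into edge UUIDs.

    Out-of-range indices are dropped, since a model may return values
    that were never offered to it.
    """
    return [uuids[idx] for idx in duplicate_facts if 0 <= idx < len(uuids)]
```

Keeping the index-to-UUID mapping on the caller's side means the model only ever has to echo back small integers.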
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Remove unused variables flagged by ruff
- Remove unused loop variable 'j' in bulk_utils.py
- Remove unused return value 'edges_by_episode' in test
- Replace unused 'edge_uuid' with '_' in test loop
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* chore: Update dependencies and enhance edge resolution logic
- Add new dependencies: boto3, opensearch-py, and langchain-aws to pyproject.toml.
- Modify Graphiti class to handle additional parameters in edge resolution.
- Improve edge type handling in deduplication logic by introducing custom edge type names.
- Enhance tests for edge resolution to cover new scenarios and ensure correct behavior.
This update improves the flexibility and functionality of edge operations while ensuring compatibility with new libraries.
* refactor: Clean up test_edge_operations.py and format response returns
- Remove unnecessary stubs for opensearchpy module.
- Format return values in llm_client.generate_response for consistency.
- Enhance readability by ensuring proper indentation and structure in test cases.
This refactor improves the clarity and maintainability of the test suite for edge operations.
* bump version to 0.30.0pre5 and enhance docstring for resolve_extracted_edge function
- Update version in pyproject.toml to 0.30.0pre5.
- Add detailed docstring to resolve_extracted_edge function in edge_operations.py, clarifying parameters and return values.
This update improves documentation clarity for the edge resolution process.
* Refactor deduplication logic to enhance node resolution and track duplicate pairs (#929)
* Simplify deduplication process in bulk_utils by reusing canonical nodes.
* Update dedup_helpers to store duplicate pairs during resolution.
* Modify node_operations to append duplicate pairs when resolving nodes.
* Add tests to verify deduplication behavior and ensure correct state updates.
* revert to concurrent dedup with fanout and then reconciliation
* add performance note for deduplication loop in bulk_utils
* enhance deduplication logic in bulk_utils to handle missing canonical nodes gracefully
* Update graphiti_core/utils/bulk_utils.py
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
* refactor deduplication logic in bulk_utils to use directed union-find for canonical UUID resolution
* implement _build_directed_uuid_map for efficient UUID resolution in bulk_utils
* document directed union-find lookup in bulk_utils for clarity
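For readers unfamiliar with the technique, a directed union-find over duplicate pairs can be sketched as below. This is an assumption-laden illustration rather than the actual `_build_directed_uuid_map` in bulk_utils: the function name, the `(duplicate_uuid, canonical_uuid)` pair shape, and the path compression are all choices made for this example.

```python
def build_directed_uuid_map(pairs: list[tuple[str, str]]) -> dict[str, str]:
    """Map every UUID seen in `pairs` to its canonical UUID.

    Each pair is (duplicate_uuid, canonical_uuid). The union is directed:
    the root of the duplicate side is always attached under the root of
    the canonical side, never the other way around.
    """
    parent: dict[str, str] = {}

    def find(uuid: str) -> str:
        parent.setdefault(uuid, uuid)
        root = uuid
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the walk straight at the root.
        while parent[uuid] != root:
            parent[uuid], uuid = root, parent[uuid]
        return root

    for duplicate_uuid, canonical_uuid in pairs:
        duplicate_root = find(duplicate_uuid)
        canonical_root = find(canonical_uuid)
        if duplicate_root != canonical_root:
            parent[duplicate_root] = canonical_root

    return {uuid: find(uuid) for uuid in list(parent)}
```

Chains such as c -> b -> a collapse to a single canonical UUID, and a pair that would close a cycle is ignored because both sides already share a root.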
---------
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
* Add support for non-ASCII characters in LLM prompts
- Add ensure_ascii parameter to Graphiti class (default: True)
- Create to_prompt_json helper function for consistent JSON serialization
- Update all prompt files to use new helper function
- Preserve Korean/Japanese/Chinese characters when ensure_ascii=False
- Maintain backward compatibility with existing behavior
Fixes issue where non-ASCII characters were escaped as unicode sequences
in prompts, making them unreadable in LLM logs and potentially affecting
model understanding.
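The escaping behavior described here comes from Python's `json.dumps`, which escapes non-ASCII characters unless `ensure_ascii=False` is passed. The helper below is only a sketch of what a function like to_prompt_json could look like; its signature and the `indent` parameter are assumptions.

```python
import json
from typing import Any


def to_prompt_json(data: Any, ensure_ascii: bool = True, indent: int = 2) -> str:
    # Hypothetical signature; the default mirrors the backward-compatible
    # ensure_ascii=True behavior described in this commit.
    return json.dumps(data, ensure_ascii=ensure_ascii, indent=indent)


payload = {"name": "서울"}
escaped = to_prompt_json(payload)                       # non-ASCII becomes \uXXXX escapes
readable = to_prompt_json(payload, ensure_ascii=False)  # characters are kept as written
```

Both strings decode to the same object with `json.loads`; the difference is only in how readable the serialized prompt is.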
* Remove unused json imports after replacing with to_prompt_json helper
- Fix ruff lint errors (F401) for unused json imports
- All prompt files now use to_prompt_json helper instead of json.dumps
- Maintains clean code style and passes lint checks
* Fix ensure_ascii propagation to all LLM calls
- Add ensure_ascii parameter to maintenance operation functions that were missing it
- Update function signatures in node_operations, community_operations, temporal_operations, and edge_operations
- Ensure all llm_client.generate_response calls receive proper ensure_ascii context
- Fix hardcoded ensure_ascii: True values that prevented non-ASCII character preservation
- Maintain backward compatibility with default ensure_ascii=True
- Complete the fix for issue #804 ensuring Korean/Japanese/Chinese characters are properly handled in LLM prompts
* Prepare code
* Fix tests
* As -> AS, remove trailing spaces
* Enable more tests for FalkorDB
* Fix more cypher queries
* Return all created nodes and edges
* Add Neo4j service to unit tests workflow
- Introduced Neo4j as a service in the GitHub Actions workflow for unit tests.
- Configured Neo4j with appropriate ports, authentication, and health checks.
- Updated test steps to include waiting for Neo4j and running integration tests against it.
- Set environment variables for Neo4j connection in both non-integration and integration test steps.
* Update Neo4j authentication in unit tests workflow
- Changed Neo4j authentication password from 'test' to 'testpass' in the GitHub Actions workflow.
- Updated health check command to reflect the new password.
- Ensured consistency across all test steps that utilize Neo4j credentials.
* fix health check
* Fix Neo4j integration tests in CI workflow
Remove reference to non-existent test_neo4j_driver.py file from test command.
Integration tests now run via parametrized tests using the drivers list.
Co-Authored-By: Claude <noreply@anthropic.com>
* Add OPENAI_API_KEY to Neo4j integration tests
Neo4j integration tests require OpenAI API access for LLM functionality.
Add the secret environment variable to enable these tests to run properly.
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix Neo4j Cypher syntax error in BFS search queries
Replace parameter substitution in relationship pattern ranges (*1..$depth)
with direct string interpolation (*1..{bfs_max_depth}). Neo4j doesn't allow
parameters in MATCH pattern ranges; the bounds must be literal values.
Fixed in both node_bfs_search and edge_bfs_search functions.
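A rough sketch of the interpolation approach follows. The query text, the `Entity` and `RELATES_TO` names, and `build_bfs_query` are illustrative assumptions, not the project's actual queries. Because the depth is interpolated rather than bound as a parameter, the sketch validates it as a positive integer first.

```python
def build_bfs_query(bfs_max_depth: int) -> str:
    """Build a variable-length match with a literal upper bound."""
    # Interpolated values bypass parameter binding, so accept only real ints >= 1.
    if isinstance(bfs_max_depth, bool) or not isinstance(bfs_max_depth, int):
        raise ValueError("bfs_max_depth must be an integer")
    if bfs_max_depth < 1:
        raise ValueError("bfs_max_depth must be at least 1")
    return (
        "MATCH (origin:Entity {uuid: $origin_uuid})"
        f"-[:RELATES_TO*1..{bfs_max_depth}]->(n:Entity) "
        "RETURN DISTINCT n.uuid AS uuid"
    )
```

Other values, such as `$origin_uuid` here, can still be passed as ordinary query parameters.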
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix variable name mismatch in edge_bfs_search query
Change relationship variable from 'r' to 'e' to match ENTITY_EDGE_RETURN
constant expectations. The ENTITY_EDGE_RETURN constant references variable
'e' for relationships, but the query was using 'r', causing "Variable e
not defined" errors.
Co-Authored-By: Claude <noreply@anthropic.com>
* Isolate database tests in CI workflow
- FalkorDB tests: Add DISABLE_NEO4J=1 and remove Neo4j env vars
- Neo4j tests: Keep current setup without DISABLE_NEO4J flag
This ensures proper test isolation where each test suite only runs
against its intended database backend.
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Siddhartha Sahu <sid@kuzudb.com>
Co-authored-by: Claude <noreply@anthropic.com>
* fix: remove global DEFAULT_DATABASE usage in favor of driver-specific config
Fixes bugs introduced in PR #607. This removes reliance on the global
DEFAULT_DATABASE environment variable and instead specifies the database
within each driver. PR #607 introduced a Neo4j compatibility issue, because
the database names differ when attempting to support FalkorDB.
This refactor improves compatibility across database types and ensures
future reliability by isolating the configuration to the driver level.
* fix: make falkordb support optional
This ensures that the optional dependency and its subsequent import are compliant with the graphiti-core project dependencies.
* chore: fmt code
* chore: undo changes to uv.lock
* fix: undo potentially breaking changes to driver interface
* fix: ensure a default database of "None" is provided - falling back to internal default
* chore: ensure default value exists for session and delete_all_indexes
* chore: fix typos and grammar
* chore: update package versions and dependencies in uv.lock and bulk_utils.py
* docs: update database configuration instructions for Neo4j and FalkorDB
Clarified default database names and how to override them in driver constructors. Updated testing requirements to include specific commands for running integration and unit tests.
* fix: ensure params defaults to an empty dictionary in Neo4jDriver
Updated the execute_query method to initialize params as an empty dictionary if not provided, ensuring compatibility with the database configuration.
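The two defaults discussed in this series (a driver-level database that falls back to an internal default, and `params` defaulting to an empty dictionary) can be sketched together. `SketchDriver` is a hypothetical class, not the real Neo4jDriver, and its `execute_query` only returns what it would send.

```python
from typing import Any, Optional


class SketchDriver:
    # Assumption for illustration: "neo4j" as the internal default database name.
    _INTERNAL_DEFAULT = "neo4j"

    def __init__(self, database: Optional[str] = None) -> None:
        # None means "not configured": fall back to the internal default.
        self._database = database if database is not None else self._INTERNAL_DEFAULT

    def execute_query(
        self, query: str, params: Optional[dict[str, Any]] = None
    ) -> tuple[str, dict[str, Any]]:
        # Copy so the caller's dict is never mutated; treat None as empty.
        merged: dict[str, Any] = dict(params) if params is not None else {}
        merged.setdefault("database", self._database)
        return query, merged
```

Using `None` rather than `{}` as the default argument avoids sharing one mutable dictionary across calls.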
---------
Co-authored-by: Urmzd <urmzd@dal.ca>
* set and retrieve group ids
* update add episode with group id support
* add episode and search functional
* update bulk
* mypy updates
* remove unused imports
* update unit tests
* unit tests
* add optional uuid field
* format
* mypy
* ellipsis
* parallelize edge deduping more
* parallelize node insertion more
* improve bulk behavior performance
* dedupe nodes actually works
* add a reranker to search
* bulk dedupe episodes only across the same nodes
* add temporal extraction bulk function
* cleaned up bulk
* default to 4o
* format
* mypy
* mypy
* mypy ignore
* feat: Update project name and description
The project name and description in the `pyproject.toml` file have been updated to reflect the changes made to the project.
* chore: Update pyproject.toml to include core package
The `pyproject.toml` file has been updated to include the `core` package in the list of packages. This change ensures that the `core` package is included when building the project.
* fix imports
* fix imports