graphiti

Author	SHA1	Message	Date
Daniel Chalef	e72f81092e	Separate unit, database, and API integration tests (#997) * Separate unit and integration tests to allow external contributors This change addresses the issue where external contributor PRs fail unit tests because GitHub secrets (API keys) are unavailable to external PRs for security reasons. Changes: - Split GitHub Actions workflow into two jobs: - unit-tests: Runs without API keys or database connections (all PRs) - integration-tests: Runs only for internal contributors with API keys - Renamed test_bge_reranker_client.py to test_bge_reranker_client_int.py to follow naming convention for integration tests - Unit tests now skip all tests requiring databases or API keys - Integration tests properly separated into: - Database integration tests (no API keys) - API integration tests (requires OPENAI_API_KEY, etc.) The unit-tests job now: - Runs for all PRs (internal and external) - Requires no GitHub secrets - Disables all database drivers - Excludes all integration test files - Passes 93 tests successfully The integration-tests job: - Only runs for internal contributors (same repo PRs or pushes to main) - Has access to GitHub secrets - Tests database operations and API integrations - Uses conditional: github.event.pull_request.head.repo.full_name == github.repository 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Separate database tests from API integration tests Restructured the workflow into three distinct jobs: 1. unit-tests: Runs on all PRs, no external dependencies (93 tests) - No API keys required - No database connections required - Fast execution 2. database-integration-tests: Runs on all PRs with databases (NEW) - Requires Neo4j and FalkorDB services - No API keys required - Tests database operations without external API calls - Includes: test_graphiti_mock.py, test_falkordb_driver.py, and utils/maintenance tests 3. api-integration-tests: Runs only for internal contributors - Requires API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) - Conditional execution for same-repo PRs only - Tests that make actual API calls to LLM providers This ensures external contributor PRs can run both unit tests and database integration tests successfully, while API integration tests requiring secrets only run for internal contributors. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Disable Kuzu in CI database integration tests Kuzu requires downloading extensions from external URLs which fails in CI environment due to network restrictions. Disable Kuzu for database and API integration tests. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Use pytest -k filter to skip Kuzu tests instead of DISABLE_KUZU The original workflow used -k "neo4j" to filter tests. Kuzu requires downloading FTS extensions from external URLs which fails in CI. Use -k "neo4j or falkordb" to run tests against available databases while skipping Kuzu parametrized tests. This maintains the same test coverage as the original workflow while properly separating unit, database, and API integration tests. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Upgrade Kuzu to v0.11.3+ to fix FTS extension download issue Kuzu v0.11.3+ has FTS extension pre-installed, eliminating the need to download it from external URLs. This fixes the "Could not establish connection" error when trying to download libfts.kuzu_extension in CI. Changes: - Upgrade kuzu dependency from >=0.11.2 to >=0.11.3 - Remove pytest -k filters to run all database tests (Neo4j, FalkorDB, Kuzu) - FTS extension is now available immediately without network calls 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Move pure unit tests from database integration to unit test job The reviewer correctly identified that test_bulk_utils.py, test_edge_operations.py, and test_node_operations.py are pure unit tests using only mocks - they don't require database connections. Changes: - Removed tests/utils/maintenance/ from ignore list (too broad) - Added specific ignore for test_temporal_operations_int.py (true integration test) - Moved test_bulk_utils.py, test_edge_operations.py, test_node_operations.py to unit tests - Kept test_graphiti_mock.py in database integration (uses real graph_driver fixture) This reduces database integration test time and properly categorizes tests. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Skip flaky LLM-based tests in test_temporal_operations_int.py - test_get_edge_contradictions_multiple_existing - test_invalidate_edges_partial_update These tests rely on OpenAI LLM responses for edge contradiction detection and produce non-deterministic results. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Use pytest -k filter for API integration tests Replace explicit file listing with `pytest tests/ -k "_int"` to automatically discover all integration tests in any subdirectory. This improves maintainability by eliminating the need to manually update the workflow when adding new integration test files. Excludes: - tests/driver/ (runs separately in database-integration-tests) - tests/test_graphiti_mock.py (runs separately in database-integration-tests) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Rename workflow from "Unit Tests" to "Tests" The workflow now runs multiple test types (unit, database integration, and API integration), so "Tests" is a more accurate name. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-12 09:07:24 -07:00
Daniel Chalef	b28bd92c16	Remove ensure_ascii configuration parameter (#969) * Remove ensure_ascii configuration parameter - Changed to_prompt_json default from ensure_ascii=True to False - Removed ensure_ascii parameter from Graphiti.__init__ and GraphitiClients - Removed ensure_ascii from all function signatures and context dictionaries - Removed ensure_ascii from all test files - All JSON serialization now preserves Unicode characters by default 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * format --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 15:10:57 -07:00
Daniel Chalef	644aa2b967	feat: Add optional callback to control node summary generation (#959) Add NodeSummaryFilter callback parameter to extract_attributes_from_nodes and extract_attributes_from_node functions, allowing consumers to selectively skip summary regeneration for specific nodes. This enables downstream applications to implement custom logic for throttling or filtering which nodes should have summaries regenerated, reducing unnecessary LLM calls and token costs. Key changes: - Add NodeSummaryFilter type alias: Callable[[EntityNode], Awaitable[bool]] - Update extract_attributes_from_nodes with optional should_summarize_node parameter - Update extract_attributes_from_node with conditional summary generation logic - Add 5 comprehensive test cases covering callback functionality - Maintain full backwards compatibility (default None = all summaries generated) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 16:17:48 -07:00
Daniel Chalef	420676faf2	fix: Prevent duplicate edge facts within same episode (#955) * fix: Prevent duplicate edge facts within same episode This fixes three related bugs that allowed verbatim duplicate edge facts: 1. Fixed LLM deduplication: Changed related_edges_context to use integer indices instead of UUIDs, matching the EdgeDuplicate model expectations. 2. Fixed batch deduplication: Removed episode skip in dedupe_edges_bulk that prevented comparing edges from the same episode. Added self-comparison guard to prevent edge from comparing against itself. 3. Added fast-path deduplication: Added exact string matching before parallel processing in resolve_extracted_edges to catch within-episode duplicates early, preventing race conditions where concurrent edges can't see each other. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * test: Add tests for edge deduplication fixes Added three tests to verify the edge deduplication fixes: 1. test_dedupe_edges_bulk_deduplicates_within_episode: Verifies that dedupe_edges_bulk now compares edges from the same episode after removing the `if i == j: continue` check. 2. test_resolve_extracted_edge_uses_integer_indices_for_duplicates: Validates that the LLM receives integer indices for duplicate detection and correctly processes returned duplicate_facts. 3. test_resolve_extracted_edges_fast_path_deduplication: Confirms that the fast-path exact string matching deduplicates identical edges before parallel processing, preventing race conditions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Remove unused variables flagged by ruff - Remove unused loop variable 'j' in bulk_utils.py - Remove unused return value 'edges_by_episode' in test - Replace unused 'edge_uuid' with '_' in test loop 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 07:30:30 -07:00
Daniel Chalef	f2c4c97362	Allow Edge extraction to keep discovered edge labels (#950) * chore: Update dependencies and enhance edge resolution logic - Add new dependencies: boto3, opensearch-py, and langchain-aws to pyproject.toml. - Modify Graphiti class to handle additional parameters in edge resolution. - Improve edge type handling in deduplication logic by introducing custom edge type names. - Enhance tests for edge resolution to cover new scenarios and ensure correct behavior. This update improves the flexibility and functionality of edge operations while ensuring compatibility with new libraries. * refactor: Clean up test_edge_operations.py and format response returns - Remove unnecessary stubs for opensearchpy module. - Format return values in llm_client.generate_response for consistency. - Enhance readability by ensuring proper indentation and structure in test cases. This refactor improves the clarity and maintainability of the test suite for edge operations. * bump version to 0.30.0pre5 and enhance docstring for resolve_extracted_edge function - Update version in pyproject.toml to 0.30.0pre5. - Add detailed docstring to resolve_extracted_edge function in edge_operations.py, clarifying parameters and return values. This update improves documentation clarity for the edge resolution process.	2025-09-29 21:32:47 -07:00
Daniel Chalef	3fcd587276	fix: Add edge type validation based on node labels (#948) * fix: Add edge type validation based on node labels - Add DEFAULT_EDGE_NAME constant for 'RELATES_TO' - Implement pre-resolution validation to reset invalid edge names - Add post-resolution validation for LLM-returned fact types - Rename parameter from edge_types to edge_type_candidates for clarity - Add comprehensive tests for validation scenarios This ensures edges conform to edge_type_map constraints and prevents misclassification when edge types don't match node label pairs. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: Bump version to 0.30.0pre4 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-09-29 16:35:00 -07:00
Daniel Chalef	d7828d48d8	Fix index out of range errors in LLM deduplication responses (#939) * add tests for llm dedupe guardrails * document llm dedupe guardrails	2025-09-26 14:57:48 -07:00
Daniel Chalef	9aee3174bd	Refactor batch deduplication logic to enhance node resolution and track duplicate pairs (#929) (#936) * Refactor deduplication logic to enhance node resolution and track duplicate pairs (#929) * Simplify deduplication process in bulk_utils by reusing canonical nodes. * Update dedup_helpers to store duplicate pairs during resolution. * Modify node_operations to append duplicate pairs when resolving nodes. * Add tests to verify deduplication behavior and ensure correct state updates. * revert to concurrent dedup with fanout and then reconciliation * add performance note for deduplication loop in bulk_utils * enhance deduplication logic in bulk_utils to handle missing canonical nodes gracefully * Update graphiti_core/utils/bulk_utils.py Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com> * refactor deduplication logic in bulk_utils to use directed union-find for canonical UUID resolution * implement _build_directed_uuid_map for efficient UUID resolution in bulk_utils * document directed union-find lookup in bulk_utils for clarity --------- Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>	2025-09-26 08:40:18 -07:00
Daniel Chalef	7c469e8e2b	Improve node deduplication w/ deterministic matching, LLM fallbacks (#929) * add repository guidelines and project structure documentation * update neo4j image version and modify test command to disable specific databases * implement deduplication helpers and integrate with node operations * refactor string formatting to use single quotes in node operations * enhance deduplication helpers with UUID indexing and update resolution logic * implement exact fact matching (#931)	2025-09-25 07:13:19 -07:00
Preston Rasmussen	9422b6f5fb	Node dedupe efficiency (#490) * update resolve extracted edge * updated edge resolution * dedupe nodes update * single pass node resolution * updates * mypy updates * Update graphiti_core/prompts/dedupe_nodes.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * remove unused imports * mypy --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-05-15 13:56:33 -04:00
Preston Rasmussen	fd9969b5a1	Update dedupe prompt (#457) * improve dedupe logic * cut summary length * update unit tests	2025-05-07 23:23:31 -04:00
Preston Rasmussen	1193b25fa3	`add_episode()` refactor (#421) * temporal updates * update resolve nodes * dedupe edge updates * edge dedupe * extract attributes * update dynamic pydantic model * first pass of extract node attributes * no errors * bug fixes * bug fixes * prompt updates * prompt updates * updates * updates * remove unused imports * update tests based on changes * remove unused import	2025-04-30 12:08:52 -04:00
Daniel Chalef	0f6ac57dab	chore: update version to 0.9.3 and restructure dependencies (#338) * Bump version from 0.9.0 to 0.9.1 in pyproject.toml and update google-genai dependency to >=0.1.0 * Bump version from 0.9.1 to 0.9.2 in pyproject.toml * Update google-genai dependency version to >=0.8.0 in pyproject.toml * loc file * Update pyproject.toml to version 0.9.3, restructure dependencies, and modify author format. Remove outdated Google API key note from README.md. * upgrade poetry and ruff	2025-04-08 20:47:38 -07:00
Preston Rasmussen	9efa6762d7	entity typo (#274)	2025-02-24 12:44:17 -05:00
Preston Rasmussen	088029a80c	node label filters (#265) * node label filters * update * add search filters * updates * bump versions * update tests * test update	2025-02-21 12:38:01 -05:00
Daniel Chalef	445dccc021	refactor: use `utc_now()` for consistent UTC datetime handling (#234) * ensure utc timezones * fix: dep cycle --------- Co-authored-by: paulpaliychuk <pavlo.paliychuk.ca@gmail.com>	2024-12-09 10:36:04 -08:00
Preston Rasmussen	3199e893ed	add_fact endpoint (#207) * add_fact endpoint * bump version * add edge invalidation * update	2024-11-06 09:12:21 -05:00
Preston Rasmussen	e15c872900	Fix edge invalidation (#174) * update edge operations * add new tests	2024-10-07 11:45:31 -04:00
Preston Rasmussen	d7c20c1f59	Search refactor + Community search (#111) * WIP * WIP * WIP * community search * WIP * WIP * integration tested * tests * tests * mypy * mypy * format	2024-09-16 14:03:05 -04:00
Preston Rasmussen	42fb590606	Add group ids (#89) * set and retrieve group ids * update add episode with group id support * add episode and search functional * update bulk * mypy updates * remove unused imports * update unit tests * unit tests * add optional uuid field * format * mypy * ellipsis	2024-09-06 12:33:42 -04:00
Preston Rasmussen	06d8d9359f	Add Missing Node and edge CRUD (#51) * add CRUD operations and fix search limit bugs * format * update tests * å * update tests to double limit call * add default field * format * import correct field	2024-08-27 16:18:01 -04:00
Daniel Chalef	2d0705fc1b	Add get_nodes_by_query method to Graphiti class (#49) * Add get_nodes_by_query method to Graphiti class Add a method to the Graphiti class that wraps `get_relevant_nodes` and returns a list of nodes given a query. * Add `get_nodes_by_query` method to the `Graphiti` class in `graphiti_core/graphiti.py`. * Import `generate_embedding` from `graphiti_core/llm_client/utils.py`. * Use `generate_embedding` to generate an embedding for the query. * Call `get_relevant_nodes` with the generated embedding and return the relevant nodes. Add an embedding function to `llm_client/utils.py`. * Add `generate_embedding` function to `graphiti_core/llm_client/utils.py`. * Accept an embedder and model_id as parameters. * Generate an embedding for the given text and return it. --- For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/getzep/graphiti?shareId=XXXX-XXXX-XXXX-XXXX). * address comments left by @danielchalef on #49 (Add get_nodes_by_query method to Graphiti class); * fix ellipsis name in cla config * feat: Add get_nodes_by_query method to Graphiti class * chore: Cleanup unused files, add hybrid node search, add tests --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> Co-authored-by: paulpaliychuk <pavlo.paliychuk.ca@gmail.com>	2024-08-26 20:00:28 -07:00
Pavlo Paliychuk	6e8c964aef	chore: Add comments to graphiti methods (#40) * chore: Add comments to graphiti methods * chore: Update int test name + add header to test files * chore: Add comments to episode type	2024-08-26 13:11:50 -04:00
Pavlo Paliychuk	0ed7739bc0	Controlled example (#37) * chore: Add romeo runner * fix: Linter * dedupe fixes * wip * wip dump * allbirds * chore: Update romeo parser * chore: Anthropic model fix * allbirds runner * format * wip * mypy updates * update * remove r * update tests * format * wip * wip * wip * chore: Strategically update the message * chore: Add romeo runner * fix: Linter * wip * wip dump * chore: Update romeo parser * chore: Anthropic model fix * wip * allbirds * allbirds runner * format * wip * wip * mypy updates * update * remove r * update tests * format * wip * chore: Strategically update the message * rebase and fix import issues * Update package imports for graphiti_core in examples and utils * nits * chore: Update OpenAI GPT-4o model to gpt-4o-2024-08-06 * implement groq * improvements & linting * cleanup and nits * Refactor package imports for graphiti_core in examples and utils * Refactor package imports for graphiti_core in examples and utils * chore: Nuke unused examples * chore: Nuke unused examples * chore: Only run type check on graphiti_core * fix unit tests * reformat * unit test * fix: Unit tests * test: Add coverage for extract_date_strings_from_edge * lint * remove commented code --------- Co-authored-by: prestonrasmussen <prasmuss15@gmail.com> Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2024-08-26 10:30:22 -04:00
Daniel Chalef	c5e52153c4	chore: Fix packaging (#38) * feat: Update project name and description The project name and description in the `pyproject.toml` file have been updated to reflect the changes made to the project. * chore: Update pyproject.toml to include core package The `pyproject.toml` file has been updated to include the `core` package in the list of packages. This change ensures that the `core` package is included when building the project. * fix imports * fix imports	2024-08-25 10:07:50 -07:00
Pavlo Paliychuk	605219f8c7	feat: Add real world dates extraction (#26) * feat: Add real world dates extraction * fix: Linter * fix: 💄 mypy errors * chore: handle invalid dates returned by the llm * chore: Polish prompt * reformat * style: 💄 reformat	2024-08-23 14:18:45 -04:00
Pavlo Paliychuk	8a55f48f5e	Fix temporal invalidation unit tests (#23) * wip * wip * wip * fix: Linter errors * fix formatting * chore: fix ruff * fix: Duplication * chore: Fix unit tests for temporal invalidation * attempt to fix unit tests * fix: format --------- Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2024-08-22 19:02:20 -04:00
Daniel Chalef	73ec0146ff	ruff action (#17) * ruff action * chore: Update Python version to 3.10 in lint.yml workflow * fix lint and formatting * cleanup	2024-08-22 13:06:42 -07:00
Daniel Chalef	50da9d0f31	format and linting (#18) * Makefile and format * fix podcast stuff * refactor: update import statement for transcript_parser in podcast_runner.py * format and linting * chore: Update import statements and remove unused code in maintenance module	2024-08-22 12:26:13 -07:00
Pavlo Paliychuk	a6fd0ddb75	feat: Initial version of temporal invalidation + tests (#8) * feat: Initial version of temporal invalidation + tests * fix: dont run int tests on CI * fix: dont run int tests on CI * fix: dont run int tests on CI * fix: time of day issue * fix: running non int tests in ci * fix: running non int tests in ci * fix: running non int tests in ci * fix: running non int tests in ci * fix: running non int tests in ci * fix: running non int tests in ci * fix: running non int tests in ci * revert: Tests structural changes * chore: Remove idea file * chore: Get rid of NodesWithEdges class and define a triplet type instead	2024-08-20 16:29:19 -04:00

30 commits