graphiti

Author	SHA1	Message	Date
Naseem Ali	d0a3cd97ae	Integrate MCP for FalkorDB (#910 ) * Integrate MCP for FalkorDB * fix lint errors	2025-10-21 12:06:57 -04:00
Naseem Ali	7c38ce7830	Add FalkorDB support for docker compose (#911 )	2025-10-21 11:47:38 -04:00
Jack Ryan	beae5a94c4	Add Zep vs Graphiti comparison table to README (#1014 ) Changes auto-committed by Conductor	2025-10-20 17:49:31 -05:00
Daniel Chalef	6b62c75f03	@0fism has signed the CLA in getzep/graphiti#1005	2025-10-15 10:00:37 -07:00
Daniel Chalef	038a72b6aa	@ngaiyuc has signed the CLA in getzep/graphiti#1005	2025-10-15 09:45:22 -07:00
Daniel Chalef	12ac194714	v0.22.0 bump (#1003 )	2025-10-13 09:52:01 -07:00
Daniel Chalef	37a9ea65a2	Remove integration markers from database tests (#1000 ) * Remove integration markers from database tests Removed @pytest.mark.integration from database tests to allow them to run while excluding API integration tests that call external services. Database tests (now run): - tests/test_edge_int.py - tests/test_graphiti_int.py - tests/test_node_int.py - tests/test_entity_exclusion_int.py - tests/cross_encoder/test_bge_reranker_client_int.py - tests/driver/test_falkordb_driver.py API integration tests (excluded): - tests/llm_client/test_anthropic_client_int.py - tests/utils/maintenance/test_temporal_operations_int.py 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Apply ruff formatting to falkordb driver and node queries - Quote style fixes in falkordb_driver.py - Trailing whitespace cleanup in node_db_queries.py - Update uv.lock 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Remove api-integration-tests job from CI workflow The api-integration-tests job has been removed since API integration tests are now excluded via @pytest.mark.integration marker. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix database-integration-tests to run all database tests Previously only ran test_graphiti_mock.py, now runs all database tests: - tests/test_graphiti_mock.py - tests/test_graphiti_int.py - tests/test_node_int.py - tests/test_edge_int.py - tests/test_entity_exclusion_int.py - tests/cross_encoder/test_bge_reranker_client_int.py - tests/driver/test_falkordb_driver.py The -m "not integration" filter excludes API integration tests that call external services (Anthropic, OpenAI, etc). 
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Restore integration markers for tests that call LLM APIs test_graphiti_int.py and test_entity_exclusion_int.py call graphiti.add_episode() and graphiti.search_() which require LLM API calls, so they are API integration tests, not pure database tests. Final categorization: Pure unit tests (no external dependencies): - tests/llm_client/test_.py (except test_anthropic_client_int.py) - tests/embedder/test_.py - tests/utils/maintenance/test_*.py (except test_temporal_operations_int.py) - tests/utils/search/search_utils_test.py - tests/test_text_utils.py Database tests (require Neo4j/FalkorDB, no API calls): - tests/test_graphiti_mock.py - tests/test_node_int.py - tests/test_edge_int.py - tests/cross_encoder/test_bge_reranker_client_int.py - tests/driver/test_falkordb_driver.py API integration tests (excluded via @pytest.mark.integration): - tests/test_graphiti_int.py - tests/test_entity_exclusion_int.py - tests/llm_client/test_anthropic_client_int.py - tests/utils/maintenance/test_temporal_operations_int.py 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-12 10:16:34 -07:00
Naseem Ali	a8ec45b1bd	fix: wrap embeddings with vecf32() in FalkorDB single save paths (#991 ) Fixes #972. Entity and edge single save operations now properly convert embeddings to vecf32 type, matching bulk save behavior and preventing type mismatch errors during vector similarity searches.	2025-10-12 09:42:52 -07:00
Daniel Chalef	b7358e52eb	Secure Claude PR reviews with two-workflow approach (#999 ) Fixes permission errors for fork PRs while maintaining security. Changes: - Split into automatic (internal) and manual (fork) workflows - Add fork detection to prevent auto-review of external PRs - Add security-hardened prompts preventing secret disclosure - Create manual workflow for maintainer-triggered fork reviews - Add friendly notification for external contributors Security model: - Internal PRs: Auto-reviewed (trusted contributors) - Fork PRs: Human gate-keeping required before optional Claude review - Prevents prompt injection attacks via untrusted PR content 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-12 09:41:13 -07:00
Daniel Chalef	a5f26b6764	Fix FalkorDB index deletion implementation (#998 ) * Update node_db_queries.py * Update node_db_queries.py * fix-delete-idx * improve-delete * fix uint tests --------- Co-authored-by: Naseem Ali <34807727+Naseem77@users.noreply.github.com> Co-authored-by: Gal Shubeli <galshubeli93@gmail.com>	2025-10-12 09:36:22 -07:00
Daniel Chalef	e72f81092e	Separate unit, database, and API integration tests (#997 ) * Separate unit and integration tests to allow external contributors This change addresses the issue where external contributor PRs fail unit tests because GitHub secrets (API keys) are unavailable to external PRs for security reasons. Changes: - Split GitHub Actions workflow into two jobs: - unit-tests: Runs without API keys or database connections (all PRs) - integration-tests: Runs only for internal contributors with API keys - Renamed test_bge_reranker_client.py to test_bge_reranker_client_int.py to follow naming convention for integration tests - Unit tests now skip all tests requiring databases or API keys - Integration tests properly separated into: - Database integration tests (no API keys) - API integration tests (requires OPENAI_API_KEY, etc.) The unit-tests job now: - Runs for all PRs (internal and external) - Requires no GitHub secrets - Disables all database drivers - Excludes all integration test files - Passes 93 tests successfully The integration-tests job: - Only runs for internal contributors (same repo PRs or pushes to main) - Has access to GitHub secrets - Tests database operations and API integrations - Uses conditional: github.event.pull_request.head.repo.full_name == github.repository 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Separate database tests from API integration tests Restructured the workflow into three distinct jobs: 1. unit-tests: Runs on all PRs, no external dependencies (93 tests) - No API keys required - No database connections required - Fast execution 2. database-integration-tests: Runs on all PRs with databases (NEW) - Requires Neo4j and FalkorDB services - No API keys required - Tests database operations without external API calls - Includes: test_graphiti_mock.py, test_falkordb_driver.py, and utils/maintenance tests 3. 
api-integration-tests: Runs only for internal contributors - Requires API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) - Conditional execution for same-repo PRs only - Tests that make actual API calls to LLM providers This ensures external contributor PRs can run both unit tests and database integration tests successfully, while API integration tests requiring secrets only run for internal contributors. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Disable Kuzu in CI database integration tests Kuzu requires downloading extensions from external URLs which fails in CI environment due to network restrictions. Disable Kuzu for database and API integration tests. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Use pytest -k filter to skip Kuzu tests instead of DISABLE_KUZU The original workflow used -k "neo4j" to filter tests. Kuzu requires downloading FTS extensions from external URLs which fails in CI. Use -k "neo4j or falkordb" to run tests against available databases while skipping Kuzu parametrized tests. This maintains the same test coverage as the original workflow while properly separating unit, database, and API integration tests. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Upgrade Kuzu to v0.11.3+ to fix FTS extension download issue Kuzu v0.11.3+ has FTS extension pre-installed, eliminating the need to download it from external URLs. This fixes the "Could not establish connection" error when trying to download libfts.kuzu_extension in CI. 
Changes: - Upgrade kuzu dependency from >=0.11.2 to >=0.11.3 - Remove pytest -k filters to run all database tests (Neo4j, FalkorDB, Kuzu) - FTS extension is now available immediately without network calls 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Move pure unit tests from database integration to unit test job The reviewer correctly identified that test_bulk_utils.py, test_edge_operations.py, and test_node_operations.py are pure unit tests using only mocks - they don't require database connections. Changes: - Removed tests/utils/maintenance/ from ignore list (too broad) - Added specific ignore for test_temporal_operations_int.py (true integration test) - Moved test_bulk_utils.py, test_edge_operations.py, test_node_operations.py to unit tests - Kept test_graphiti_mock.py in database integration (uses real graph_driver fixture) This reduces database integration test time and properly categorizes tests. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Skip flaky LLM-based tests in test_temporal_operations_int.py - test_get_edge_contradictions_multiple_existing - test_invalidate_edges_partial_update These tests rely on OpenAI LLM responses for edge contradiction detection and produce non-deterministic results. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Use pytest -k filter for API integration tests Replace explicit file listing with `pytest tests/ -k "_int"` to automatically discover all integration tests in any subdirectory. This improves maintainability by eliminating the need to manually update the workflow when adding new integration test files. 
Excludes: - tests/driver/ (runs separately in database-integration-tests) - tests/test_graphiti_mock.py (runs separately in database-integration-tests) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Rename workflow from "Unit Tests" to "Tests" The workflow now runs multiple test types (unit, database integration, and API integration), so "Tests" is a more accurate name. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-12 09:07:24 -07:00
Guy Korland	0e2760d1ce	Update README.md fix wrong link (#768 ) Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2025-10-11 16:07:24 -07:00
Preston Rasmussen	1e35306126	fix deprecated cypher pattern (#993 )	2025-10-09 16:12:55 -04:00
Preston Rasmussen	604e3199a3	add search and graph operations interfaces (#984 ) * add search and graph operations interfaces * update * update * update * update * update * update	2025-10-07 13:34:37 -04:00
Daniel Chalef	73015e980e	Fix datetime comparison errors by normalizing to UTC (#988 ) * Fix datetime comparison errors by normalizing to UTC Applied ensure_utc() to all datetime comparisons in edge_operations.py to prevent TypeError when comparing timezone-naive and timezone-aware datetimes. Removed redundant tzinfo checks since ensure_utc() handles both None and naive datetimes. Fixed comparisons at: - Lines 419, 423: resolve_edge_contradictions function - Line 430: resolve_edge_contradictions function - Line 627: resolve_extracted_edge function (removed redundant tzinfo checks) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Update uv.lock 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix sorting with mixed timezone-aware/naive datetimes Normalize datetime to UTC in sort key to prevent TypeError when comparing mixed timezone-aware and timezone-naive datetimes during sorting. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-07 08:28:56 -07:00
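The failure mode described in 73015e980e can be reproduced and avoided in a few lines. This is a minimal sketch, not the project's `ensure_utc()`: the helper name `to_utc_sketch` is hypothetical, and treating naive datetimes as already being in UTC is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def to_utc_sketch(dt: Optional[datetime]) -> Optional[datetime]:
    """Hypothetical stand-in for a UTC normalizer (not graphiti's code)."""
    if dt is None:
        return None
    if dt.tzinfo is None:
        # Assumption: a naive datetime is interpreted as UTC.
        return dt.replace(tzinfo=timezone.utc)
    # Aware datetimes are converted to the UTC offset.
    return dt.astimezone(timezone.utc)


naive = datetime(2025, 10, 7, 12, 0)
aware = datetime(2025, 10, 7, 8, 0, tzinfo=timezone(timedelta(hours=-7)))

# `naive < aware` raises TypeError; normalizing in the sort key avoids that.
ordered = sorted([aware, naive], key=to_utc_sketch)
```

Normalizing inside the sort key leaves the stored values untouched, so only the comparison sees UTC values.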
Daniel Chalef	27d4f1097b	Add OpenTelemetry stdout example with Kuzu (#987 ) - Created examples/opentelemetry/ with working stdout tracing example - Uses Kuzu in-memory database for zero-setup requirement - Demonstrates ingestion and search with distributed tracing - Updated OTEL_TRACING.md with simplified documentation and Kuzu example - Uses local editable graphiti-core install for development 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-07 07:29:35 -07:00
Daniel Chalef	65c6c338c2	bump 0.22.0pre5 (#986 )	2025-10-06 16:11:12 -07:00
Daniel Chalef	196eb2f077	Remove JSON indentation from prompts to reduce token usage (#985 ) Changes to `to_prompt_json()` helper to default to minified JSON (no indentation) instead of 2-space indentation. This reduces token consumption in LLM prompts while maintaining all necessary information. - Changed default `indent` parameter from `2` to `None` in `prompt_helpers.py` - Updated all prompt modules to remove explicit `indent=2` arguments - Minor code formatting fixes in LLM clients 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-06 16:08:43 -07:00
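The effect of dropping the indentation in 196eb2f077 can be seen with the standard `json` module alone. The wrapper below is a hypothetical sketch and not the real `to_prompt_json()`; only the `indent` default of `None` comes from the commit message, and the sample payload is invented for illustration.

```python
import json


def to_prompt_json_sketch(data, indent=None):
    """Hypothetical helper: serialize prompt context, unindented by default."""
    # ensure_ascii=False keeps Unicode characters readable in the prompt.
    return json.dumps(data, ensure_ascii=False, indent=indent)


entity = {'name': 'Zoë', 'labels': ['Person'], 'summary': 'Works on graph search.'}

compact = to_prompt_json_sketch(entity)           # single line, no indentation
pretty = to_prompt_json_sketch(entity, indent=2)  # the previous 2-space layout
```

Both strings decode to the same object; the compact form simply omits the newlines and leading spaces. Character count is only a rough proxy for token count, so any savings should be measured with the tokenizer actually in use.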
Daniel Chalef	24deb4d58d	Bump pre-release version to 0.22.0pre4 (#983 ) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-05 12:33:50 -07:00
Daniel Chalef	6ad695186a	Add OpenTelemetry distributed tracing support (#982 ) * Add OpenTelemetry distributed tracing support - Add tracer abstraction with no-op and OpenTelemetry implementations - Instrument add_episode and add_episode_bulk with tracing spans - Instrument LLM client with cache-aware tracing - Add configurable span name prefix support - Refactor add_episode methods to improve code quality - Add OTEL_TRACING.md documentation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix linting errors in tracing implementation - Remove unused episodes_by_uuid variable - Fix tracer type annotations for context manager support - Replace isinstance tuple with union syntax - Use contextlib.suppress for exception handling - Fix import ordering and use AbstractContextManager 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Address PR review feedback on tracing implementation Critical fixes: - Remove flawed error span creation in graphiti.py that created orphaned spans - Restructure LLM client tracing to create span once at start, eliminating code duplication - Initialize LLM client tracer to NoOpTracer by default to fix type checking Enhancements: - Add comprehensive span attributes to add_episode: reference_time, entity/edge type counts, previous episodes count, invalidated edge count, community count - Optimize isinstance check for better performance 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Add prompt name tracking to OpenTelemetry tracing spans Add prompt_name parameter to all LLM client generate_response() methods and set it as a span attribute in the llm.generate span. This enables better observability by identifying which prompt template was used for each LLM call. 
Changes: - Add prompt_name parameter to LLMClient.generate_response() base method - Add prompt_name parameter and tracing to OpenAIBaseClient, AnthropicClient, GeminiClient, and OpenAIGenericClient - Update all 14 LLM call sites across maintenance operations to include prompt_name: - edge_operations.py: 4 calls - node_operations.py: 6 calls (note: 7 listed but only 6 unique) - temporal_operations.py: 2 calls - community_operations.py: 2 calls 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix exception handling in add_episode to record errors in OpenTelemetry span Moved try-except block inside the OpenTelemetry span context and added proper error recording with span.set_status() and span.record_exception(). This ensures exceptions are captured in the distributed trace, matching the pattern used in add_episode_bulk. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-05 12:26:14 -07:00
Daniel Chalef	6789767738	@clsferguson has signed the CLA in getzep/graphiti#981	2025-10-04 20:30:22 -07:00
Daniel Chalef	8770012745	Refactor prompt structure: move MESSAGES after instructions (#980 ) * Refactor prompt structure: move MESSAGES after instructions Reordered prompt structure in extract_nodes.py to place MESSAGES section after instructions/guidelines in both extract_attributes and extract_summary functions for improved prompt clarity. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Add sentence-aware text truncator for entity summaries - Created truncate_at_sentence() utility function that truncates text at sentence boundaries while respecting max character limits - Added MAX_SUMMARY_CHARS constant (250 chars) for entity summaries - Applied truncator to entity summaries in prompts (extract_nodes.py) - Applied truncator to LLM-generated summaries (node_operations.py) - Added comprehensive test suite for truncation logic 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Clean up formatting in extract_attributes prompt - Remove extra blank lines - Fix indentation of MESSAGES tag 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Bump version to 0.22.0pre3 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-04 19:06:32 -07:00
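Sentence-aware truncation of the kind described in 8770012745 might look like the following. This is an independent sketch under stated assumptions, not the project's `truncate_at_sentence()`: the function name, the regular expression, and the fallback to a hard cut are all hypothetical. Only the 250-character limit is taken from the commit message.

```python
import re

MAX_SUMMARY_CHARS = 250  # limit stated in the commit message


def truncate_at_sentence_sketch(text: str, max_chars: int = MAX_SUMMARY_CHARS) -> str:
    """Cut text at the last sentence boundary that fits within max_chars."""
    if len(text) <= max_chars:
        return text
    window = text[:max_chars]
    # Sentence end: '.', '!' or '?' followed by whitespace or the end of the window.
    boundaries = list(re.finditer(r'[.!?](?=\s|$)', window))
    if boundaries:
        return window[:boundaries[-1].end()].rstrip()
    # Assumption: with no boundary in range, fall back to a plain character cut.
    return window.rstrip()


sample = 'First sentence. Second sentence is longer. Third.'
shortened = truncate_at_sentence_sketch(sample, max_chars=30)
```

A simple punctuation rule like this one will misjudge abbreviations such as "e.g."; a production version would likely need more careful boundary detection.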
Daniel Chalef	896cb4e990	Refactor summary prompts to use character limit and prevent meta-commentary (#979 ) * Refactor summary prompts to use character limit and prevent meta-commentary - Changed summary length constraint from "8 sentences" to "250 characters" for more predictable output - Created reusable summary_instructions snippet in snippets.py with clear BAD/GOOD examples - Added explicit instruction to output only factual content without meta-commentary - Applied consistent formatting across extract_nodes.py and summarize_nodes.py - Bumped version to 0.22.0pre2 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Add copyright header to snippets.py 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-04 15:44:00 -07:00
Daniel Chalef	8a78633e2f	Enforce shorter summaries with 8 sentence limit (#978 ) * Enforce shorter summaries with 8 sentence limit Replace 250-word limit with 8 sentence limit for node summaries to improve conciseness. Also update prompt system message for summarize_context to better reflect its dual purpose of generating summaries and attributes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Update graphiti_core/prompts/summarize_nodes.py Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com> * Bump version to 0.22.0pre1 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Update graphiti_core/prompts/summarize_nodes.py 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>	2025-10-04 14:37:16 -07:00
Daniel Chalef	2864786dd9	Refactor node extraction; remove summary from attribute extraction (#977 ) * Refactor node extraction for better maintainability - Extract helper functions from extract_attributes_from_node to improve code organization - Add _extract_entity_attributes, _extract_entity_summary, and _build_episode_context helpers - Apply consistent formatting (double quotes per ruff configuration) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Apply consistent single quote style throughout node_operations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * cleanup * cleanup 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Bump version to 0.22.0pre0 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-04 13:37:39 -07:00
Preston Rasmussen	5a67e660dc	remove generic aoss_client interactions for release build (#975 ) * remove generic aoss_client interactions for release build * remove unused imports * update * revert changes to Neptune driver * Update graphiti_core/driver/neptune_driver.py Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com> * default to sync OpenSearch client * update * aoss_client now Any type * update stubs --------- Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>	2025-10-03 13:41:15 -04:00
Daniel Chalef	35857fa211	Update issue triage workflow to allow non-write users for duplicate checks (#974 )	2025-10-03 09:20:28 -07:00
Daniel Chalef	189e45617f	Add group_id parameter to language extraction function (#952 ) * Add group_id parameter to get_extraction_language_instruction Enable consumers to provide group-specific language extraction instructions by passing group_id through the call chain. Changes: - Add optional group_id parameter to get_extraction_language_instruction() - Add group_id parameter to all LLMClient.generate_response() methods - Pass group_id through to language instruction function - Maintain backward compatibility with default None value Users can now customize extraction per group:
```python
def custom_instruction(group_id: str | None = None) -> str:
    if group_id == 'spanish-users':
        return '\n\nExtract in Spanish.'
    return '\n\nExtract in original language.'

client.get_extraction_language_instruction = custom_instruction
```
🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Pass group_id to generate_response in extraction operations Thread group_id parameter through all extraction-related generate_response() calls where it's naturally available (via episode.group_id or node.group_id). This enables consumers to override get_extraction_language_instruction() with group-specific language preferences. Changes: - edge_operations.py: Pass group_id in extract_edges() - node_operations.py: Pass episode.group_id in extract_nodes() and node.group_id in extract_attributes_from_node() - node_operations.py: Add group_id parameter to extract_nodes_reflexion() 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix type inconsistency in extract_nodes_reflexion parameter Change group_id parameter from str = '' to str | None = None to match the pattern used throughout the codebase and align with the optional nature of group_id in generate_response(). 
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Remove ensure_ascii parameter and uv.lock file * Reset uv.lock to main branch version --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-03 09:05:45 -07:00
Preston Rasmussen	ff260f010e	validate nodes and edges aren't falsey (#973 ) * validate nodes and edges aren't falsey * update * update	2025-10-03 11:10:07 -04:00
Daniel Chalef	a44df4c290	Bump version to 0.21.0pre12 (#967 ) Bump version to 0.21.0pre11 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 22:58:10 -07:00
Daniel Chalef	590282524a	fix: Improve edge extraction entity ID validation (#968 ) * fix: Improve edge extraction entity ID validation Fixes invalid entity ID references in edge extraction that caused warnings like: "WARNING: source or target node not filled WILL_FIND. source_node_uuid: 23 and target_node_uuid: 3" Changes: - Format ENTITIES list as proper JSON in prompt for better LLM parsing - Clarify field descriptions to reference entity id from ENTITIES list - Add explicit entity ID validation as #1 extraction rule with examples - Improve error logging (removed PII, added entity count and valid range) These changes follow patterns from extract_nodes.py and dedupe_nodes.py where entity referencing works reliably. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * wip * fix: Align fact field naming and add description - Change extraction rule to reference 'fact' instead of 'fact_text' - Add descriptive text for fact field in Edge model * fix: Remove ensure_ascii parameter from to_prompt_json call Align with other to_prompt_json calls that don't use ensure_ascii * fix: Use validated target_node_idx variable consistently Line 190 was using raw edge_data.target_entity_id instead of the validated target_node_idx variable, creating inconsistency with line 189 * fix: Improve edge extraction validation checks - Add explicit check for empty nodes list - Use more explicit 0 <= idx comparison instead of -1 < idx - Prevents nonsensical error message when no entities provided * chore: Restore uv.lock from main branch Previously deleted in commit `7e4464b`, now restored to match main branch state * Update uv.lock --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 22:45:11 -07:00
Daniel Chalef	4a307dbf10	Optimize edge deduplication prompt for caching and clarity (#970 ) * Optimize edge deduplication prompt for caching and clarity - Restructure prompt to place invariant instructions at top and dynamic context at bottom for better LLM caching - Change 'id' to 'idx' in edge context lists to avoid confusion with other identifiers - Remove 'fact_type_id' from edge types context as LLM only needs fact_type_name - Remove dynamic range values from prompt instructions (e.g., "range 0-N") - Add debug logging before LLM call to track input sizes - Add validation logging after LLM response to catch invalid idx values - Clarify that duplicate_facts uses EXISTING FACTS idx and contradicted_facts uses INVALIDATION CANDIDATES idx 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Address terminology consistency and edge case logging - Update Pydantic field descriptions to use 'idx' instead of 'ids' for consistency - Fix debug logging to handle empty list edge case (avoid 'idx 0--1' display) Note on review feedback: - Validation is intentionally non-redundant: warnings provide visibility, list comprehensions ensure robustness - WARNING level is appropriate for LLM output issues (not system errors) - Existing test coverage is sufficient for this defensive logging addition 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 17:07:43 -07:00
Daniel Chalef	b28bd92c16	Remove ensure_ascii configuration parameter (#969 ) * Remove ensure_ascii configuration parameter - Changed to_prompt_json default from ensure_ascii=True to False - Removed ensure_ascii parameter from Graphiti.__init__ and GraphitiClients - Removed ensure_ascii from all function signatures and context dictionaries - Removed ensure_ascii from all test files - All JSON serialization now preserves Unicode characters by default 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * format --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 15:10:57 -07:00
Preston Rasmussen	bec3f02036	filter out falsey values before creating embeddings (#966 ) * filter out falsey values * update * early return	2025-10-02 15:26:51 -04:00
Daniel Chalef	5ca8b9565c	fix: Improve deduplication ID validation and logging (#965 ) * fix: Improve deduplication ID validation and logging - Add comprehensive logging to verify IDs sent to LLM (sent vs received) - Enhance prompt with explicit ID bounds (0 through N-1) - Add validation warnings for missing and extra IDs from LLM responses - Improve error message clarity for invalid dedupe IDs - Log actual IDs sent to LLM to confirm no index leakage This helps diagnose cases where the LLM returns IDs outside the valid range (e.g., ID 19 when only 0-18 were sent). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Remove redundant logging parameter Address reviewer comment about redundant third parameter in debug log statement. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Address reviewer comments on list slicing and prompt clarity - Fix list slicing bug: change <= to < to avoid gap when exactly 20 elements (previously would skip element 10 when showing 21 elements) - Consolidate redundant prompt phrasing while maintaining clarity (reduced from 3 sentences to 2, keeping essential constraints) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Remove redundant prompt text to reduce token usage Consolidate 'using these exact IDs (0 through N-1)' with following sentence to eliminate repetition. Changes: - 'using these exact IDs (0 through {N-1}). Do not skip IDs or use IDs outside this range' - 'with IDs 0 through {N-1}. Do not skip or add IDs' Saves ~15 tokens per deduplication call. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 12:22:07 -07:00
Daniel Chalef	443f972f45	Refactor issue workflows for improved automation (#964 ) - Consolidate issue-triage.yml and issue-deduplication.yml into single workflow with sequential jobs - Create daily_issue_maintenance.yml with three jobs: - find-legacy-duplicates: Manual job to scan all open issues for duplicates - check-stale-issues: Daily job to request confirmation on issues >60 days old - close-unconfirmed-issues: Daily job to close issues without confirmation after 14 days - Update triage to use gh CLI tools with database-specific labels (neo4j, falkordb, neptune) - Separate deduplication into dedicated job using MCP GitHub tools - Add "duplicate" label to both real-time and batch deduplication workflows - Update claude-code-review.yml to use latest Sonnet model 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 11:37:19 -07:00
Daniel Chalef	a24ada94bb	Bump version to 0.21.0pre10 (#962 ) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 16:40:33 -07:00
Daniel Chalef	644aa2b967	feat: Add optional callback to control node summary generation (#959 ) Add NodeSummaryFilter callback parameter to extract_attributes_from_nodes and extract_attributes_from_node functions, allowing consumers to selectively skip summary regeneration for specific nodes. This enables downstream applications to implement custom logic for throttling or filtering which nodes should have summaries regenerated, reducing unnecessary LLM calls and token costs. Key changes: - Add NodeSummaryFilter type alias: Callable[[EntityNode], Awaitable[bool]] - Update extract_attributes_from_nodes with optional should_summarize_node parameter - Update extract_attributes_from_node with conditional summary generation logic - Add 5 comprehensive test cases covering callback functionality - Maintain full backwards compatibility (default None = all summaries generated) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 16:17:48 -07:00
Daniel Chalef	4a9bcd5b10	Update Claude review prompt to focus on critical feedback (#960 ) chore: Update Claude review prompt to focus on critical feedback only Added instruction to eliminate positive feedback from code reviews, reducing noise and focusing on actionable improvements. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 13:31:05 -07:00
Jack Ryan	59fcc9545f	fix: Fix typo in JSON entity extraction prompt (#953 ) * fix: Fix typo in JSON entity extraction prompt Change "an entities" to "any entities" in guideline 1 of the extract_json prompt. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Update graphiti_core/prompts/extract_nodes.py Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2025-10-01 11:23:39 -05:00
Daniel Chalef	f466d5971b	Bump version to 0.21.0pre9 (#958 ) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 09:09:49 -07:00
Daniel Chalef	7bd8f8a2f2	chore: Update edge extraction prompt to paraphrase instead of quote (#957 ) * chore: Update edge extraction prompt to paraphrase instead of quote - Changed instruction 5 to request paraphrasing rather than verbatim quoting - Updated string quotes to use double quotes for consistency 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: Format edge_operations.py and update lock file - Minor formatting fix in edge_operations.py list comprehension - Update uv.lock with version bump to 0.21.0rc8 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 09:05:04 -07:00
Daniel Chalef	1ebcda19c6	bump pre8 (#956 )	2025-10-01 07:40:17 -07:00
Daniel Chalef	420676faf2	fix: Prevent duplicate edge facts within same episode (#955 ) * fix: Prevent duplicate edge facts within same episode This fixes three related bugs that allowed verbatim duplicate edge facts: 1. Fixed LLM deduplication: Changed related_edges_context to use integer indices instead of UUIDs, matching the EdgeDuplicate model expectations. 2. Fixed batch deduplication: Removed episode skip in dedupe_edges_bulk that prevented comparing edges from the same episode. Added self-comparison guard to prevent edge from comparing against itself. 3. Added fast-path deduplication: Added exact string matching before parallel processing in resolve_extracted_edges to catch within-episode duplicates early, preventing race conditions where concurrent edges can't see each other. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * test: Add tests for edge deduplication fixes Added three tests to verify the edge deduplication fixes: 1. test_dedupe_edges_bulk_deduplicates_within_episode: Verifies that dedupe_edges_bulk now compares edges from the same episode after removing the `if i == j: continue` check. 2. test_resolve_extracted_edge_uses_integer_indices_for_duplicates: Validates that the LLM receives integer indices for duplicate detection and correctly processes returned duplicate_facts. 3. test_resolve_extracted_edges_fast_path_deduplication: Confirms that the fast-path exact string matching deduplicates identical edges before parallel processing, preventing race conditions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Remove unused variables flagged by ruff - Remove unused loop variable 'j' in bulk_utils.py - Remove unused return value 'edges_by_episode' in test - Replace unused 'edge_uuid' with '_' in test loop 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 07:30:30 -07:00
Preston Rasmussen	4d54493064	21 pre 7 (#954 )	2025-09-30 14:51:17 -04:00
Daniel Chalef	b2ff050e57	Make natural language extraction configurable (#943 ) Replace MULTILINGUAL_EXTRACTION_RESPONSES constant with configurable get_extraction_language_instruction() function to improve determinism and allow customization. Changes: - Replace constant with function in client.py - Update all LLM client implementations to use new function - Maintain backward compatibility with same default behavior - Enable users to override function for custom language requirements Users can now customize extraction behavior by monkey-patching: ```python import graphiti_core.llm_client.client as client client.get_extraction_language_instruction = lambda: "Custom instruction" ``` 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-09-30 11:09:03 -04:00
Jack Ryan	f632a8ae9e	Improve JSON entity extraction prompt (#949 ) Add guideline to extract entities from all JSON properties, not just primary fields like name/user. This ensures comprehensive entity extraction while maintaining the existing exclusion of date properties. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-09-30 11:00:14 -04:00
Daniel Chalef	f2c4c97362	Allow Edge extraction to keep discovered edge labels (#950 ) * chore: Update dependencies and enhance edge resolution logic - Add new dependencies: boto3, opensearch-py, and langchain-aws to pyproject.toml. - Modify Graphiti class to handle additional parameters in edge resolution. - Improve edge type handling in deduplication logic by introducing custom edge type names. - Enhance tests for edge resolution to cover new scenarios and ensure correct behavior. This update improves the flexibility and functionality of edge operations while ensuring compatibility with new libraries. * refactor: Clean up test_edge_operations.py and format response returns - Remove unnecessary stubs for opensearchpy module. - Format return values in llm_client.generate_response for consistency. - Enhance readability by ensuring proper indentation and structure in test cases. This refactor improves the clarity and maintainability of the test suite for edge operations. * bump version to 0.30.0pre5 and enhance docstring for resolve_extracted_edge function - Update version in pyproject.toml to 0.30.0pre5. - Add detailed docstring to resolve_extracted_edge function in edge_operations.py, clarifying parameters and return values. This update improves documentation clarity for the edge resolution process.	2025-09-29 21:32:47 -07:00
Daniel Chalef	3fcd587276	fix: Add edge type validation based on node labels (#948 ) * fix: Add edge type validation based on node labels - Add DEFAULT_EDGE_NAME constant for 'RELATES_TO' - Implement pre-resolution validation to reset invalid edge names - Add post-resolution validation for LLM-returned fact types - Rename parameter from edge_types to edge_type_candidates for clarity - Add comprehensive tests for validation scenarios This ensures edges conform to edge_type_map constraints and prevents misclassification when edge types don't match node label pairs. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: Bump version to 0.30.0pre4 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-09-29 16:35:00 -07:00
Daniel Chalef	ded2bad3f2	bump 0.30.0pre3 (#946 )	2025-09-28 19:57:15 -07:00
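
Commit 5ca8b9565c (#965) describes checking the IDs an LLM returns against the range that was sent (0 through N-1) and warning about missing or extra ones. A minimal sketch of that idea follows; the function name `validate_dedupe_ids` and its signature are hypothetical and are not taken from the graphiti codebase.

```python
import logging

logger = logging.getLogger(__name__)


def validate_dedupe_ids(sent_count: int, received_ids: list[int]) -> tuple[list[int], list[int]]:
    """Compare IDs returned by the LLM with the expected range 0..sent_count-1.

    Hypothetical helper for illustration only; returns (missing, extra).
    """
    expected = set(range(sent_count))
    received = set(received_ids)

    # IDs that were sent but never came back.
    missing = sorted(expected - received)
    # IDs that came back but were never sent (outside 0..sent_count-1).
    extra = sorted(received - expected)

    if missing:
        logger.warning("LLM response is missing IDs: %s", missing)
    if extra:
        logger.warning("LLM returned IDs outside 0-%d: %s", sent_count - 1, extra)
    return missing, extra
```

With 19 items sent (IDs 0-18), a response that contains ID 19 but omits ID 18 would yield `([18], [19])`, which mirrors the out-of-range case the commit message mentions.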


660 commits
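
Commit 644aa2b967 (#959) introduces a `NodeSummaryFilter` callback typed as `Callable[[EntityNode], Awaitable[bool]]`, where a default of `None` means every summary is generated. The sketch below shows how such a callback might be applied. The `EntityNode` dataclass is a simplified stand-in and `select_nodes_to_summarize` is a hypothetical helper; neither is the library's actual implementation.

```python
import asyncio
from collections.abc import Awaitable, Callable
from dataclasses import dataclass
from typing import Optional


@dataclass
class EntityNode:
    """Simplified stand-in for the real EntityNode (assumption)."""
    name: str
    summary: str = ""


# Type alias as given in the commit message.
NodeSummaryFilter = Callable[[EntityNode], Awaitable[bool]]


async def only_unsummarized(node: EntityNode) -> bool:
    """Example filter: regenerate a summary only when none exists yet."""
    return not node.summary


async def select_nodes_to_summarize(
    nodes: list[EntityNode],
    should_summarize_node: Optional[NodeSummaryFilter] = None,
) -> list[EntityNode]:
    """Hypothetical helper: return the nodes whose summaries should be regenerated."""
    if should_summarize_node is None:
        # No callback supplied: keep the default of summarizing every node.
        return list(nodes)
    decisions = await asyncio.gather(*(should_summarize_node(n) for n in nodes))
    return [node for node, keep in zip(nodes, decisions) if keep]
```

A caller could pass `only_unsummarized` to skip nodes that already carry a summary, which is one way to cut the LLM calls the commit message refers to.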