graphiti

Author	SHA1	Message	Date
Daniel Chalef	a44df4c290	Bump version to 0.21.0pre12 (#967 ) Bump version to 0.21.0pre11 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 22:58:10 -07:00
Daniel Chalef	590282524a	fix: Improve edge extraction entity ID validation (#968 ) * fix: Improve edge extraction entity ID validation Fixes invalid entity ID references in edge extraction that caused warnings like: "WARNING: source or target node not filled WILL_FIND. source_node_uuid: 23 and target_node_uuid: 3" Changes: - Format ENTITIES list as proper JSON in prompt for better LLM parsing - Clarify field descriptions to reference entity id from ENTITIES list - Add explicit entity ID validation as #1 extraction rule with examples - Improve error logging (removed PII, added entity count and valid range) These changes follow patterns from extract_nodes.py and dedupe_nodes.py where entity referencing works reliably. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * wip * fix: Align fact field naming and add description - Change extraction rule to reference 'fact' instead of 'fact_text' - Add descriptive text for fact field in Edge model * fix: Remove ensure_ascii parameter from to_prompt_json call Align with other to_prompt_json calls that don't use ensure_ascii * fix: Use validated target_node_idx variable consistently Line 190 was using raw edge_data.target_entity_id instead of the validated target_node_idx variable, creating inconsistency with line 189 * fix: Improve edge extraction validation checks - Add explicit check for empty nodes list - Use more explicit 0 <= idx comparison instead of -1 < idx - Prevents nonsensical error message when no entities provided * chore: Restore uv.lock from main branch Previously deleted in commit `7e4464b`, now restored to match main branch state * Update uv.lock --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 22:45:11 -07:00
Daniel Chalef	4a307dbf10	Optimize edge deduplication prompt for caching and clarity (#970 ) * Optimize edge deduplication prompt for caching and clarity - Restructure prompt to place invariant instructions at top and dynamic context at bottom for better LLM caching - Change 'id' to 'idx' in edge context lists to avoid confusion with other identifiers - Remove 'fact_type_id' from edge types context as LLM only needs fact_type_name - Remove dynamic range values from prompt instructions (e.g., "range 0-N") - Add debug logging before LLM call to track input sizes - Add validation logging after LLM response to catch invalid idx values - Clarify that duplicate_facts uses EXISTING FACTS idx and contradicted_facts uses INVALIDATION CANDIDATES idx 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Address terminology consistency and edge case logging - Update Pydantic field descriptions to use 'idx' instead of 'ids' for consistency - Fix debug logging to handle empty list edge case (avoid 'idx 0--1' display) Note on review feedback: - Validation is intentionally non-redundant: warnings provide visibility, list comprehensions ensure robustness - WARNING level is appropriate for LLM output issues (not system errors) - Existing test coverage is sufficient for this defensive logging addition 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 17:07:43 -07:00
Daniel Chalef	b28bd92c16	Remove ensure_ascii configuration parameter (#969 ) * Remove ensure_ascii configuration parameter - Changed to_prompt_json default from ensure_ascii=True to False - Removed ensure_ascii parameter from Graphiti.__init__ and GraphitiClients - Removed ensure_ascii from all function signatures and context dictionaries - Removed ensure_ascii from all test files - All JSON serialization now preserves Unicode characters by default 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * format --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 15:10:57 -07:00
Preston Rasmussen	bec3f02036	filter out falsey values before creating embeddings (#966 ) * filter out falsey values * update * early return	2025-10-02 15:26:51 -04:00
Daniel Chalef	5ca8b9565c	fix: Improve deduplication ID validation and logging (#965 ) * fix: Improve deduplication ID validation and logging - Add comprehensive logging to verify IDs sent to LLM (sent vs received) - Enhance prompt with explicit ID bounds (0 through N-1) - Add validation warnings for missing and extra IDs from LLM responses - Improve error message clarity for invalid dedupe IDs - Log actual IDs sent to LLM to confirm no index leakage This helps diagnose cases where the LLM returns IDs outside the valid range (e.g., ID 19 when only 0-18 were sent). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Remove redundant logging parameter Address reviewer comment about redundant third parameter in debug log statement. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Address reviewer comments on list slicing and prompt clarity - Fix list slicing bug: change <= to < to avoid gap when exactly 20 elements (previously would skip element 10 when showing 21 elements) - Consolidate redundant prompt phrasing while maintaining clarity (reduced from 3 sentences to 2, keeping essential constraints) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Remove redundant prompt text to reduce token usage Consolidate 'using these exact IDs (0 through N-1)' with following sentence to eliminate repetition. Changes: - 'using these exact IDs (0 through {N-1}). Do not skip IDs or use IDs outside this range' - 'with IDs 0 through {N-1}. Do not skip or add IDs' Saves ~15 tokens per deduplication call. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 12:22:07 -07:00
Daniel Chalef	443f972f45	Refactor issue workflows for improved automation (#964 ) - Consolidate issue-triage.yml and issue-deduplication.yml into single workflow with sequential jobs - Create daily_issue_maintenance.yml with three jobs: - find-legacy-duplicates: Manual job to scan all open issues for duplicates - check-stale-issues: Daily job to request confirmation on issues >60 days old - close-unconfirmed-issues: Daily job to close issues without confirmation after 14 days - Update triage to use gh CLI tools with database-specific labels (neo4j, falkordb, neptune) - Separate deduplication into dedicated job using MCP GitHub tools - Add "duplicate" label to both real-time and batch deduplication workflows - Update claude-code-review.yml to use latest Sonnet model 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-02 11:37:19 -07:00
Daniel Chalef	a24ada94bb	Bump version to 0.21.0pre10 (#962 ) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 16:40:33 -07:00
Daniel Chalef	644aa2b967	feat: Add optional callback to control node summary generation (#959 ) Add NodeSummaryFilter callback parameter to extract_attributes_from_nodes and extract_attributes_from_node functions, allowing consumers to selectively skip summary regeneration for specific nodes. This enables downstream applications to implement custom logic for throttling or filtering which nodes should have summaries regenerated, reducing unnecessary LLM calls and token costs. Key changes: - Add NodeSummaryFilter type alias: Callable[[EntityNode], Awaitable[bool]] - Update extract_attributes_from_nodes with optional should_summarize_node parameter - Update extract_attributes_from_node with conditional summary generation logic - Add 5 comprehensive test cases covering callback functionality - Maintain full backwards compatibility (default None = all summaries generated) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 16:17:48 -07:00
Daniel Chalef	4a9bcd5b10	Update Claude review prompt to focus on critical feedback (#960 ) chore: Update Claude review prompt to focus on critical feedback only Added instruction to eliminate positive feedback from code reviews, reducing noise and focusing on actionable improvements. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 13:31:05 -07:00
Jack Ryan	59fcc9545f	fix: Fix typo in JSON entity extraction prompt (#953 ) * fix: Fix typo in JSON entity extraction prompt Change "an entities" to "any entities" in guideline 1 of the extract_json prompt. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Update graphiti_core/prompts/extract_nodes.py Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2025-10-01 11:23:39 -05:00
Daniel Chalef	f466d5971b	Bump version to 0.21.0pre9 (#958 ) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 09:09:49 -07:00
Daniel Chalef	7bd8f8a2f2	chore: Update edge extraction prompt to paraphrase instead of quote (#957 ) * chore: Update edge extraction prompt to paraphrase instead of quote - Changed instruction 5 to request paraphrasing rather than verbatim quoting - Updated string quotes to use double quotes for consistency 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: Format edge_operations.py and update lock file - Minor formatting fix in edge_operations.py list comprehension - Update uv.lock with version bump to 0.21.0rc8 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 09:05:04 -07:00
Daniel Chalef	1ebcda19c6	bump pre8 (#956 )	2025-10-01 07:40:17 -07:00
Daniel Chalef	420676faf2	fix: Prevent duplicate edge facts within same episode (#955 ) * fix: Prevent duplicate edge facts within same episode This fixes three related bugs that allowed verbatim duplicate edge facts: 1. Fixed LLM deduplication: Changed related_edges_context to use integer indices instead of UUIDs, matching the EdgeDuplicate model expectations. 2. Fixed batch deduplication: Removed episode skip in dedupe_edges_bulk that prevented comparing edges from the same episode. Added self-comparison guard to prevent edge from comparing against itself. 3. Added fast-path deduplication: Added exact string matching before parallel processing in resolve_extracted_edges to catch within-episode duplicates early, preventing race conditions where concurrent edges can't see each other. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * test: Add tests for edge deduplication fixes Added three tests to verify the edge deduplication fixes: 1. test_dedupe_edges_bulk_deduplicates_within_episode: Verifies that dedupe_edges_bulk now compares edges from the same episode after removing the `if i == j: continue` check. 2. test_resolve_extracted_edge_uses_integer_indices_for_duplicates: Validates that the LLM receives integer indices for duplicate detection and correctly processes returned duplicate_facts. 3. test_resolve_extracted_edges_fast_path_deduplication: Confirms that the fast-path exact string matching deduplicates identical edges before parallel processing, preventing race conditions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Remove unused variables flagged by ruff - Remove unused loop variable 'j' in bulk_utils.py - Remove unused return value 'edges_by_episode' in test - Replace unused 'edge_uuid' with '_' in test loop 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-01 07:30:30 -07:00
Preston Rasmussen	4d54493064	21 pre 7 (#954 )	2025-09-30 14:51:17 -04:00
Daniel Chalef	b2ff050e57	Make natural language extraction configurable (#943 ) Replace MULTILINGUAL_EXTRACTION_RESPONSES constant with configurable get_extraction_language_instruction() function to improve determinism and allow customization. Changes: - Replace constant with function in client.py - Update all LLM client implementations to use new function - Maintain backward compatibility with same default behavior - Enable users to override function for custom language requirements Users can now customize extraction behavior by monkey-patching: ```python import graphiti_core.llm_client.client as client client.get_extraction_language_instruction = lambda: "Custom instruction" ``` 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-09-30 11:09:03 -04:00
Jack Ryan	f632a8ae9e	Improve JSON entity extraction prompt (#949 ) Add guideline to extract entities from all JSON properties, not just primary fields like name/user. This ensures comprehensive entity extraction while maintaining the existing exclusion of date properties. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-09-30 11:00:14 -04:00
Daniel Chalef	f2c4c97362	Allow Edge extraction to keep discovered edge labels (#950 ) * chore: Update dependencies and enhance edge resolution logic - Add new dependencies: boto3, opensearch-py, and langchain-aws to pyproject.toml. - Modify Graphiti class to handle additional parameters in edge resolution. - Improve edge type handling in deduplication logic by introducing custom edge type names. - Enhance tests for edge resolution to cover new scenarios and ensure correct behavior. This update improves the flexibility and functionality of edge operations while ensuring compatibility with new libraries. * refactor: Clean up test_edge_operations.py and format response returns - Remove unnecessary stubs for opensearchpy module. - Format return values in llm_client.generate_response for consistency. - Enhance readability by ensuring proper indentation and structure in test cases. This refactor improves the clarity and maintainability of the test suite for edge operations. * bump version to 0.30.0pre5 and enhance docstring for resolve_extracted_edge function - Update version in pyproject.toml to 0.30.0pre5. - Add detailed docstring to resolve_extracted_edge function in edge_operations.py, clarifying parameters and return values. This update improves documentation clarity for the edge resolution process.	2025-09-29 21:32:47 -07:00
Daniel Chalef	3fcd587276	fix: Add edge type validation based on node labels (#948 ) * fix: Add edge type validation based on node labels - Add DEFAULT_EDGE_NAME constant for 'RELATES_TO' - Implement pre-resolution validation to reset invalid edge names - Add post-resolution validation for LLM-returned fact types - Rename parameter from edge_types to edge_type_candidates for clarity - Add comprehensive tests for validation scenarios This ensures edges conform to edge_type_map constraints and prevents misclassification when edge types don't match node label pairs. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: Bump version to 0.30.0pre4 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-09-29 16:35:00 -07:00
Daniel Chalef	ded2bad3f2	bump 0.30.0pre3 (#946 )	2025-09-28 19:57:15 -07:00
Daniel Chalef	02efeea8e9	Improve node dedup prompts (#942 ) * remove poetry.lock * Improve dedup prompts * normalize string formatting in dedupe_nodes.py to use single quotes	2025-09-28 12:18:06 -07:00
Pavlo Paliychuk	f5d27cb9d3	chore: Bump version (#940 )	2025-09-26 18:59:04 -04:00
Daniel Chalef	d7828d48d8	Fix index out of range errors in LLM deduplication responses (#939 ) * add tests for llm dedupe guardrails * document llm dedupe guardrails	2025-09-26 14:57:48 -07:00
Daniel Chalef	27b8dd34a5	Update pyproject.toml to 0.30.0pre1 (#938 )	2025-09-26 08:42:20 -07:00
Daniel Chalef	9aee3174bd	Refactor batch deduplication logic to enhance node resolution and track duplicate pairs (#929 ) (#936 ) * Refactor deduplication logic to enhance node resolution and track duplicate pairs (#929) * Simplify deduplication process in bulk_utils by reusing canonical nodes. * Update dedup_helpers to store duplicate pairs during resolution. * Modify node_operations to append duplicate pairs when resolving nodes. * Add tests to verify deduplication behavior and ensure correct state updates. * revert to concurrent dedup with fanout and then reconciliation * add performance note for deduplication loop in bulk_utils * enhance deduplication logic in bulk_utils to handle missing canonical nodes gracefully * Update graphiti_core/utils/bulk_utils.py Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com> * refactor deduplication logic in bulk_utils to use directed union-find for canonical UUID resolution * implement _build_directed_uuid_map for efficient UUID resolution in bulk_utils * document directed union-find lookup in bulk_utils for clarity --------- Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>	2025-09-26 08:40:18 -07:00
Daniel Chalef	1e56019027	Bump v0.30.0pre0 (#932 ) * Update pyproject.toml bump v0.30.0pre0 * Update pyproject.toml	2025-09-25 07:22:45 -07:00
Daniel Chalef	7c469e8e2b	Improve node deduplication w/ deterministic matching, LLM fallbacks (#929 ) * add repository guidelines and project structure documentation * update neo4j image version and modify test command to disable specific databases * implement deduplication helpers and integrate with node operations * refactor string formatting to use single quotes in node operations * enhance deduplication helpers with UUID indexing and update resolution logic * implement exact fact matching (#931)	2025-09-25 07:13:19 -07:00
Preston Rasmussen	d6d4bbdeb7	don't save duplicate edges (#927 ) * don't save duplicate edges * remove build duplicate edges	2025-09-24 17:24:57 -04:00
Preston Rasmussen	c794f8881b	pre5 (#926 )	2025-09-24 16:38:20 -04:00
Daniel Chalef	7cf5ee6288	Skip entity attribute extraction when no fields defined (#924 )	2025-09-24 13:23:37 -04:00
Preston Rasmussen	36056ad141	Graph quality updates (#922 ) duplicate_of updates	2025-09-23 17:53:39 -04:00
Gal Shubeli	d725fcdf8e	fix-fulltext-syntax-error (#914 ) * fix-fulltext-syntax-error * update-abs-method	2025-09-23 10:52:44 -04:00
Preston Rasmussen	da71d118db	Embedding fix (#917 ) * embedding fix * pre3 * fixed make format	2025-09-20 09:00:04 -04:00
Daniel Chalef	3ea6f9f9a8	@Brandtweary has signed the CLA in getzep/graphiti#916	2025-09-19 16:38:02 -07:00
Preston Rasmussen	3efe085a92	OpenSearch updates (#906 ) * updates * add uuid filter functionality * update * updates * bump-version * update * fix typo * use async function * update unit tests * update delete * update deletion * async update * update * update * update * update	2025-09-14 01:43:37 -04:00
Daniel Chalef	4dab259217	@luan122 has signed the CLA in getzep/graphiti#908	2025-09-12 16:14:33 -07:00
Preston Rasmussen	0884cc00e5	OpenSearch Integration for Neo4j (#896 ) * move aoss to driver * add indexes * don't save vectors to neo4j with aoss * load embeddings from aoss * add group_id routing * add search filters and similarity search * neptune regression update * update neptune for regression purposes * update index creation with aliasing * regression tested * update version * edits * claude suggestions * cleanup * updates * add embedding dim env var * use cosine sim * updates * updates * remove unused imports * update	2025-09-09 10:51:46 -04:00
Daniel Chalef	a3479758d5	@gsw945 has signed the CLA in getzep/graphiti#901	2025-09-09 05:06:30 -07:00
Daniel Chalef	b558d96a79	@DavIvek has signed the CLA in getzep/graphiti#900	2025-09-09 02:59:57 -07:00
Preston Rasmussen	ce1ae30569	Add return to add_triplet (#898 ) * update * add triplet results * Update graphiti_core/graphiti.py Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com> --------- Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>	2025-09-08 15:39:05 -04:00
Preston Rasmussen	7e6d93fa32	add episode bulk search results (#897 ) * add episode bulk search results * update * docstring * update	2025-09-08 14:34:32 -04:00
Daniel Chalef	792bcc52bd	@Bit-urd has signed the CLA in getzep/graphiti#895	2025-09-07 13:01:21 -07:00
Preston Rasmussen	1f5a1b890c	cleanup (#894 ) * cleanup * update * remove unused imports	2025-09-05 11:30:46 -04:00
Daniel Chalef	c0fcc82ebe	@jeanlucthumm has signed the CLA in getzep/graphiti#892	2025-09-04 11:50:10 -07:00
Preston Rasmussen	eeb0d877de	update (#891 )	2025-09-03 18:42:58 -04:00
Preston Rasmussen	81d110f944	bump version (#889 ) * bump version * remove unused imports	2025-09-03 14:08:35 -04:00
prestonrasmussen	29ba336189	remove parallel runtime and build dynamic indexes sequentially	2025-09-03 13:53:12 -04:00
Preston Rasmussen	1460172568	don't return index labels (#887 ) * don't return index labels * update tests	2025-09-02 12:02:33 -04:00
Daniel Chalef	51e880fd57	@maskshell has signed the CLA in getzep/graphiti#886	2025-09-02 00:48:19 -07:00
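
Several of the commits above (590282524a, 5ca8b9565c, d7828d48d8) guard against LLM responses that reference indices outside the list that was sent. A minimal sketch of that kind of bounds check follows. The function name `filter_valid_indices` and the log wording are hypothetical and are not taken from the graphiti codebase; only the `0 <= idx` comparison style and the explicit empty-list check come from the commit messages.

```python
import logging

logger = logging.getLogger(__name__)


def filter_valid_indices(returned: list[int], item_count: int) -> list[int]:
    """Keep only indices that point into a list of `item_count` items.

    Hypothetical helper: illustrates the validation pattern described in
    the commit messages, not graphiti's actual implementation.
    """
    if item_count == 0:
        # Explicit empty-list check, so the warning never has to print a
        # nonsensical range such as "0 to -1".
        if returned:
            logger.warning('LLM returned %d indices but no items were sent', len(returned))
        return []

    # Explicit lower and upper bound check on every returned index.
    valid = [idx for idx in returned if 0 <= idx < item_count]
    invalid_count = len(returned) - len(valid)
    if invalid_count:
        # Log counts and the valid range only, never the item contents.
        logger.warning(
            'Dropped %d out-of-range indices (valid range: 0 to %d)',
            invalid_count,
            item_count - 1,
        )
    return valid
```

With 19 items sent (indices 0 to 18), a response containing 19 or -1 is dropped with a warning instead of raising an index error.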


681 commits
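
Commit 644aa2b967 describes a NodeSummaryFilter callback typed as Callable[[EntityNode], Awaitable[bool]]. The sketch below shows one way such a callback could be applied. The `EntityNode` dataclass is a simplified stand-in, and `skip_if_summarized` and `nodes_to_summarize` are hypothetical helpers written for illustration; none of them is graphiti's actual API.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class EntityNode:
    """Simplified stand-in for the real EntityNode (assumption)."""

    name: str
    summary: str = ''


# Type alias as given in the commit message.
NodeSummaryFilter = Callable[[EntityNode], Awaitable[bool]]


async def skip_if_summarized(node: EntityNode) -> bool:
    """Hypothetical filter: only summarize nodes that have no summary yet."""
    return not node.summary


async def nodes_to_summarize(
    nodes: list[EntityNode],
    should_summarize_node: Optional[NodeSummaryFilter] = None,
) -> list[EntityNode]:
    """Return the nodes whose summaries should be regenerated."""
    if should_summarize_node is None:
        # Default described in the commit: no callback means every node
        # gets a summary.
        return list(nodes)
    # Ask the callback about each node concurrently.
    decisions = await asyncio.gather(*(should_summarize_node(n) for n in nodes))
    return [node for node, keep in zip(nodes, decisions) if keep]
```

A caller could pass `skip_if_summarized` to avoid LLM calls for nodes that already carry a summary, while omitting the callback keeps the previous behavior.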