Commit graph

141 commits

Author SHA1 Message Date
Gal Shubeli
c144ff5995
[Improvement] Add GraphID isolation support for FalkorDB multi-tenant architecture (#835)
* Update node_db_queries.py

* Update node_db_queries.py

* graph-per-graphid

* fix-groupid-usage

* ruff-fix

* rev-driver-changes

* rm-un-changes

* fix lint

---------

Co-authored-by: Naseem Ali <34807727+Naseem77@users.noreply.github.com>
2025-11-03 10:56:53 -05:00
Preston Rasmussen
71f1f66d11
Search client update (#1026)
* update bulk interface handling

* bump version

* format
2025-10-26 22:07:36 -04:00
Daniel Chalef
a5f26b6764
Fix FalkorDB index deletion implementation (#998)
* Update node_db_queries.py

* Update node_db_queries.py

* fix-delete-idx

* improve-delete

* fix unit tests

---------

Co-authored-by: Naseem Ali <34807727+Naseem77@users.noreply.github.com>
Co-authored-by: Gal Shubeli <galshubeli93@gmail.com>
2025-10-12 09:36:22 -07:00
Preston Rasmussen
604e3199a3
add search and graph operations interfaces (#984)
* add search and graph operations interfaces

* update

* update

* update

* update

* update

* update
2025-10-07 13:34:37 -04:00
Daniel Chalef
73015e980e
Fix datetime comparison errors by normalizing to UTC (#988)
* Fix datetime comparison errors by normalizing to UTC

Applied ensure_utc() to all datetime comparisons in edge_operations.py to prevent TypeError when comparing timezone-naive and timezone-aware datetimes. Removed redundant tzinfo checks since ensure_utc() handles both None and naive datetimes.

Fixed comparisons at:
- Lines 419, 423: resolve_edge_contradictions function
- Line 430: resolve_edge_contradictions function
- Line 627: resolve_extracted_edge function (removed redundant tzinfo checks)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update uv.lock

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix sorting with mixed timezone-aware/naive datetimes

Normalize datetime to UTC in sort key to prevent TypeError when comparing mixed timezone-aware and timezone-naive datetimes during sorting.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-07 08:28:56 -07:00
Daniel Chalef
196eb2f077
Remove JSON indentation from prompts to reduce token usage (#985)
Changes to `to_prompt_json()` helper to default to minified JSON (no indentation) instead of 2-space indentation. This reduces token consumption in LLM prompts while maintaining all necessary information.

- Changed default `indent` parameter from `2` to `None` in `prompt_helpers.py`
- Updated all prompt modules to remove explicit `indent=2` arguments
- Minor code formatting fixes in LLM clients

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-06 16:08:43 -07:00
Daniel Chalef
6ad695186a
Add OpenTelemetry distributed tracing support (#982)
* Add OpenTelemetry distributed tracing support

- Add tracer abstraction with no-op and OpenTelemetry implementations
- Instrument add_episode and add_episode_bulk with tracing spans
- Instrument LLM client with cache-aware tracing
- Add configurable span name prefix support
- Refactor add_episode methods to improve code quality
- Add OTEL_TRACING.md documentation
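
One way to picture the tracer abstraction listed above is shown below. All class and method names here (`Tracer`, `Span`, `NoOpTracer`, `RecordingTracer`, `start_span`, `add_attributes`) are assumptions for illustration. `RecordingTracer` is an in-memory stand-in, not the OpenTelemetry-backed implementation.

```python
from abc import ABC, abstractmethod
from contextlib import AbstractContextManager, contextmanager
from typing import Any, Iterator


class Span(ABC):
    @abstractmethod
    def add_attributes(self, attributes: dict[str, Any]) -> None: ...


class Tracer(ABC):
    @abstractmethod
    def start_span(self, name: str) -> AbstractContextManager[Span]: ...


class NoOpSpan(Span):
    def add_attributes(self, attributes: dict[str, Any]) -> None:
        # Deliberately does nothing, so call sites never check whether tracing is on.
        pass


class NoOpTracer(Tracer):
    @contextmanager
    def start_span(self, name: str) -> Iterator[Span]:
        yield NoOpSpan()


class RecordingTracer(Tracer):
    """Stand-in for a real backend: records prefixed span names in memory."""

    def __init__(self, span_prefix: str) -> None:
        self.span_prefix = span_prefix
        self.started: list[str] = []

    @contextmanager
    def start_span(self, name: str) -> Iterator[Span]:
        self.started.append(f'{self.span_prefix}.{name}')
        yield NoOpSpan()


def add_episode(tracer: Tracer, body: str) -> int:
    # Instrumented code depends only on the abstract Tracer interface.
    with tracer.start_span('add_episode') as span:
        span.add_attributes({'episode.length': len(body)})
        return len(body)
```

With this shape, the no-op tracer is a safe default, and a configurable prefix changes span names without touching the instrumented code.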

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix linting errors in tracing implementation

- Remove unused episodes_by_uuid variable
- Fix tracer type annotations for context manager support
- Replace isinstance tuple with union syntax
- Use contextlib.suppress for exception handling
- Fix import ordering and use AbstractContextManager

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Address PR review feedback on tracing implementation

Critical fixes:
- Remove flawed error span creation in graphiti.py that created orphaned spans
- Restructure LLM client tracing to create span once at start, eliminating code duplication
- Initialize LLM client tracer to NoOpTracer by default to fix type checking

Enhancements:
- Add comprehensive span attributes to add_episode: reference_time, entity/edge type counts, previous episodes count, invalidated edge count, community count
- Optimize isinstance check for better performance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add prompt name tracking to OpenTelemetry tracing spans

Add prompt_name parameter to all LLM client generate_response() methods
and set it as a span attribute in the llm.generate span. This enables
better observability by identifying which prompt template was used for
each LLM call.

Changes:
- Add prompt_name parameter to LLMClient.generate_response() base method
- Add prompt_name parameter and tracing to OpenAIBaseClient,
  AnthropicClient, GeminiClient, and OpenAIGenericClient
- Update all 14 LLM call sites across maintenance operations to include
  prompt_name:
  - edge_operations.py: 4 calls
  - node_operations.py: 6 calls (note: 7 listed but only 6 unique)
  - temporal_operations.py: 2 calls
  - community_operations.py: 2 calls

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix exception handling in add_episode to record errors in OpenTelemetry span

Moved try-except block inside the OpenTelemetry span context and added
proper error recording with span.set_status() and span.record_exception().
This ensures exceptions are captured in the distributed trace, matching
the pattern used in add_episode_bulk.
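
The pattern described above can be sketched as follows. `FakeSpan` and `start_span` are hypothetical in-memory stand-ins. Only the method names `set_status()` and `record_exception()` are taken from the commit message, and the string statuses are a simplification.

```python
from contextlib import contextmanager
from typing import Iterator


class FakeSpan:
    """In-memory stand-in for a tracing span (not the OpenTelemetry class)."""

    def __init__(self) -> None:
        self.status = 'unset'
        self.exceptions: list[BaseException] = []

    def set_status(self, status: str) -> None:
        self.status = status

    def record_exception(self, exc: BaseException) -> None:
        self.exceptions.append(exc)


@contextmanager
def start_span(span: FakeSpan) -> Iterator[FakeSpan]:
    yield span


def add_episode(span: FakeSpan, body: str) -> str:
    with start_span(span) as active:
        # The try/except sits inside the span, so the failure is recorded
        # on the span before the exception propagates to the caller.
        try:
            if not body:
                raise ValueError('episode body is empty')
            active.set_status('ok')
            return body.upper()
        except Exception as exc:
            active.set_status('error')
            active.record_exception(exc)
            raise
```

Re-raising after recording keeps the caller's error handling unchanged while the trace still shows the failure.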

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-05 12:26:14 -07:00
Daniel Chalef
8770012745
Refactor prompt structure: move MESSAGES after instructions (#980)
* Refactor prompt structure: move MESSAGES after instructions

Reordered prompt structure in extract_nodes.py to place MESSAGES section
after instructions/guidelines in both extract_attributes and extract_summary
functions for improved prompt clarity.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add sentence-aware text truncator for entity summaries

- Created truncate_at_sentence() utility function that truncates text at
  sentence boundaries while respecting max character limits
- Added MAX_SUMMARY_CHARS constant (250 chars) for entity summaries
- Applied truncator to entity summaries in prompts (extract_nodes.py)
- Applied truncator to LLM-generated summaries (node_operations.py)
- Added comprehensive test suite for truncation logic
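
A possible shape for such a truncator is sketched below. The names `truncate_at_sentence` and `MAX_SUMMARY_CHARS` and the 250-character limit come from the list above. The boundary rule (a `.`, `!` or `?` followed by whitespace or end of text) and the hard-cut fallback are assumptions.

```python
import re

MAX_SUMMARY_CHARS = 250  # value stated in the commit message

# Assumed sentence end: '.', '!' or '?' followed by whitespace or end of text.
_SENTENCE_END = re.compile(r'[.!?](?=\s|$)')


def truncate_at_sentence(text: str, max_chars: int) -> str:
    """Cut text to at most max_chars, preferring the last full sentence (sketch)."""
    if len(text) <= max_chars:
        return text
    # Search the full text so a '.' inside '3.14' at the cut point is not a boundary.
    ends = [m.end() for m in _SENTENCE_END.finditer(text) if m.end() <= max_chars]
    if ends:
        return text[: ends[-1]]
    # No sentence boundary fits within the limit: fall back to a hard cut.
    return text[:max_chars].rstrip()


sample = 'First sentence. Second sentence is longer. Third.'
short = truncate_at_sentence(sample, 30)
```

Here `short` keeps only the first sentence, because the second one would exceed the 30-character limit.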

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Clean up formatting in extract_attributes prompt

- Remove extra blank lines
- Fix indentation of MESSAGES tag

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Bump version to 0.22.0pre3

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-04 19:06:32 -07:00
Daniel Chalef
2864786dd9
Refactor node extraction; remove summary from attribute extraction (#977)
* Refactor node extraction for better maintainability

- Extract helper functions from extract_attributes_from_node to improve code organization
- Add _extract_entity_attributes, _extract_entity_summary, and _build_episode_context helpers
- Apply consistent formatting (double quotes per ruff configuration)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Apply consistent single quote style throughout node_operations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* cleanup

* cleanup

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Bump version to 0.22.0pre0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-04 13:37:39 -07:00
Daniel Chalef
189e45617f
Add group_id parameter to language extraction function (#952)
* Add group_id parameter to get_extraction_language_instruction

Enable consumers to provide group-specific language extraction
instructions by passing group_id through the call chain.

Changes:
- Add optional group_id parameter to get_extraction_language_instruction()
- Add group_id parameter to all LLMClient.generate_response() methods
- Pass group_id through to language instruction function
- Maintain backward compatibility with default None value

Users can now customize extraction per group:
```python
def custom_instruction(group_id: str | None = None) -> str:
    if group_id == 'spanish-users':
        return '\n\nExtract in Spanish.'
    return '\n\nExtract in original language.'

client.get_extraction_language_instruction = custom_instruction
```

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Pass group_id to generate_response in extraction operations

Thread group_id parameter through all extraction-related generate_response()
calls where it's naturally available (via episode.group_id or node.group_id).
This enables consumers to override get_extraction_language_instruction() with
group-specific language preferences.

Changes:
- edge_operations.py: Pass group_id in extract_edges()
- node_operations.py: Pass episode.group_id in extract_nodes() and
  node.group_id in extract_attributes_from_node()
- node_operations.py: Add group_id parameter to extract_nodes_reflexion()

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix type inconsistency in extract_nodes_reflexion parameter

Change group_id parameter from str = '' to str | None = None to match
the pattern used throughout the codebase and align with the optional
nature of group_id in generate_response().

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Remove ensure_ascii parameter and uv.lock file

* Reset uv.lock to main branch version

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-03 09:05:45 -07:00
Preston Rasmussen
ff260f010e
validate nodes and edges aren't falsey (#973)
* validate nodes and edges aren't falsey

* update

* update
2025-10-03 11:10:07 -04:00
Daniel Chalef
590282524a
fix: Improve edge extraction entity ID validation (#968)
* fix: Improve edge extraction entity ID validation

Fixes invalid entity ID references in edge extraction that caused warnings like:
"WARNING: source or target node not filled WILL_FIND. source_node_uuid: 23 and target_node_uuid: 3"

Changes:
- Format ENTITIES list as proper JSON in prompt for better LLM parsing
- Clarify field descriptions to reference entity id from ENTITIES list
- Add explicit entity ID validation as #1 extraction rule with examples
- Improve error logging (removed PII, added entity count and valid range)

These changes follow patterns from extract_nodes.py and dedupe_nodes.py where
entity referencing works reliably.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* wip

* fix: Align fact field naming and add description

- Change extraction rule to reference 'fact' instead of 'fact_text'
- Add descriptive text for fact field in Edge model

* fix: Remove ensure_ascii parameter from to_prompt_json call

Align with other to_prompt_json calls that don't use ensure_ascii

* fix: Use validated target_node_idx variable consistently

Line 190 was using raw edge_data.target_entity_id instead of the
validated target_node_idx variable, creating inconsistency with line 189

* fix: Improve edge extraction validation checks

- Add explicit check for empty nodes list
- Use more explicit 0 <= idx comparison instead of -1 < idx
- Prevents nonsensical error message when no entities provided

* chore: Restore uv.lock from main branch

Previously deleted in commit 7e4464b, now restored to match main branch state

* Update uv.lock

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-02 22:45:11 -07:00
Daniel Chalef
4a307dbf10
Optimize edge deduplication prompt for caching and clarity (#970)
* Optimize edge deduplication prompt for caching and clarity

- Restructure prompt to place invariant instructions at top and dynamic context at bottom for better LLM caching
- Change 'id' to 'idx' in edge context lists to avoid confusion with other identifiers
- Remove 'fact_type_id' from edge types context as LLM only needs fact_type_name
- Remove dynamic range values from prompt instructions (e.g., "range 0-N")
- Add debug logging before LLM call to track input sizes
- Add validation logging after LLM response to catch invalid idx values
- Clarify that duplicate_facts uses EXISTING FACTS idx and contradicted_facts uses INVALIDATION CANDIDATES idx

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Address terminology consistency and edge case logging

- Update Pydantic field descriptions to use 'idx' instead of 'ids' for consistency
- Fix debug logging to handle empty list edge case (avoid 'idx 0--1' display)

Note on review feedback:
- Validation is intentionally non-redundant: warnings provide visibility, list comprehensions ensure robustness
- WARNING level is appropriate for LLM output issues (not system errors)
- Existing test coverage is sufficient for this defensive logging addition

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-02 17:07:43 -07:00
Daniel Chalef
b28bd92c16
Remove ensure_ascii configuration parameter (#969)
* Remove ensure_ascii configuration parameter

- Changed to_prompt_json default from ensure_ascii=True to False
- Removed ensure_ascii parameter from Graphiti.__init__ and GraphitiClients
- Removed ensure_ascii from all function signatures and context dictionaries
- Removed ensure_ascii from all test files
- All JSON serialization now preserves Unicode characters by default

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* format

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-02 15:10:57 -07:00
Daniel Chalef
5ca8b9565c
fix: Improve deduplication ID validation and logging (#965)
* fix: Improve deduplication ID validation and logging

- Add comprehensive logging to verify IDs sent to LLM (sent vs received)
- Enhance prompt with explicit ID bounds (0 through N-1)
- Add validation warnings for missing and extra IDs from LLM responses
- Improve error message clarity for invalid dedupe IDs
- Log actual IDs sent to LLM to confirm no index leakage

This helps diagnose cases where the LLM returns IDs outside the valid
range (e.g., ID 19 when only 0-18 were sent).
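
A small sketch of the missing/extra check, assuming IDs are the consecutive integers `0` through `N-1`. The function name `check_llm_ids` is hypothetical.

```python
def check_llm_ids(sent_count: int, received_ids: list[int]) -> tuple[list[int], list[int]]:
    """Return (missing, extra) IDs relative to the expected range 0..sent_count-1."""
    expected = set(range(sent_count))
    received = set(received_ids)
    missing = sorted(expected - received)
    extra = sorted(received - expected)
    return missing, extra


# 19 entities were sent (IDs 0-18), but the response skipped 4 and returned 19.
missing, extra = check_llm_ids(19, [i for i in range(20) if i != 4])
```

A caller could log both lists as warnings and drop the extra IDs before indexing into the original list.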

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Remove redundant logging parameter

Address reviewer comment about redundant third parameter in debug log statement.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Address reviewer comments on list slicing and prompt clarity

- Fix list slicing bug: change <= to < to avoid gap when exactly 20 elements
  (previously would skip element 10 when showing 21 elements)
- Consolidate redundant prompt phrasing while maintaining clarity
  (reduced from 3 sentences to 2, keeping essential constraints)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Remove redundant prompt text to reduce token usage

Consolidate 'using these exact IDs (0 through N-1)' with following sentence
to eliminate repetition. Changes:
- 'using these exact IDs (0 through {N-1}). Do not skip IDs or use IDs outside this range'
- 'with IDs 0 through {N-1}. Do not skip or add IDs'

Saves ~15 tokens per deduplication call.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-02 12:22:07 -07:00
Daniel Chalef
644aa2b967
feat: Add optional callback to control node summary generation (#959)
Add NodeSummaryFilter callback parameter to extract_attributes_from_nodes
and extract_attributes_from_node functions, allowing consumers to
selectively skip summary regeneration for specific nodes.

This enables downstream applications to implement custom logic for
throttling or filtering which nodes should have summaries regenerated,
reducing unnecessary LLM calls and token costs.

Key changes:
- Add NodeSummaryFilter type alias: Callable[[EntityNode], Awaitable[bool]]
- Update extract_attributes_from_nodes with optional should_summarize_node parameter
- Update extract_attributes_from_node with conditional summary generation logic
- Add 5 comprehensive test cases covering callback functionality
- Maintain full backwards compatibility (default None = all summaries generated)
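
To illustrate the callback shape, here is a self-contained sketch. The `NodeSummaryFilter` alias and the `should_summarize_node` parameter name come from the list above. `EntityNode` is reduced to a two-field stand-in, and `summarize_nodes` and `skip_if_summarized` are hypothetical.

```python
import asyncio
from collections.abc import Awaitable, Callable
from dataclasses import dataclass
from typing import Optional


@dataclass
class EntityNode:
    """Simplified stand-in for the real EntityNode model."""

    name: str
    summary: str = ''


NodeSummaryFilter = Callable[[EntityNode], Awaitable[bool]]


async def summarize_nodes(
    nodes: list[EntityNode],
    should_summarize_node: Optional[NodeSummaryFilter] = None,
) -> list[str]:
    """Return the names of nodes whose summary would be regenerated (sketch)."""
    regenerated = []
    for node in nodes:
        # Default None keeps the old behavior: every node is summarized.
        if should_summarize_node is None or await should_summarize_node(node):
            regenerated.append(node.name)
    return regenerated


async def skip_if_summarized(node: EntityNode) -> bool:
    # Example policy: only summarize nodes that have no summary yet.
    return not node.summary


nodes = [EntityNode('Alice', summary='Engineer at Acme.'), EntityNode('Bob')]
selected = asyncio.run(summarize_nodes(nodes, skip_if_summarized))
everything = asyncio.run(summarize_nodes(nodes))
```

Because the filter is awaitable, a consumer could also consult a cache or rate limiter before deciding.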

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-01 16:17:48 -07:00
Daniel Chalef
7bd8f8a2f2
chore: Update edge extraction prompt to paraphrase instead of quote (#957)
* chore: Update edge extraction prompt to paraphrase instead of quote

- Changed instruction 5 to request paraphrasing rather than verbatim quoting
- Updated string quotes to use double quotes for consistency

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: Format edge_operations.py and update lock file

- Minor formatting fix in edge_operations.py list comprehension
- Update uv.lock with version bump to 0.21.0rc8

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-01 09:05:04 -07:00
Daniel Chalef
420676faf2
fix: Prevent duplicate edge facts within same episode (#955)
* fix: Prevent duplicate edge facts within same episode

This fixes three related bugs that allowed verbatim duplicate edge facts:

1. Fixed LLM deduplication: Changed related_edges_context to use integer
   indices instead of UUIDs, matching the EdgeDuplicate model expectations.

2. Fixed batch deduplication: Removed episode skip in dedupe_edges_bulk
   that prevented comparing edges from the same episode. Added self-comparison
   guard to prevent edge from comparing against itself.

3. Added fast-path deduplication: Added exact string matching before parallel
   processing in resolve_extracted_edges to catch within-episode duplicates
   early, preventing race conditions where concurrent edges can't see each other.
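
The fast path in item 3 can be sketched as a single sequential pass that runs before any concurrent work. The `Edge` dataclass is a simplified stand-in, `dedupe_exact` is a hypothetical name, and the case and whitespace normalization is an assumption beyond the "exact string matching" described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Simplified stand-in for an extracted edge."""

    source_uuid: str
    target_uuid: str
    fact: str


def dedupe_exact(edges: list[Edge]) -> list[Edge]:
    """Drop edges whose endpoints and normalized fact repeat an earlier edge."""
    seen: set[tuple[str, str, str]] = set()
    unique: list[Edge] = []
    for edge in edges:
        # Assumption: case and whitespace differences do not make a fact distinct.
        key = (edge.source_uuid, edge.target_uuid, ' '.join(edge.fact.lower().split()))
        if key not in seen:
            seen.add(key)
            unique.append(edge)
    return unique


edges = [
    Edge('a', 'b', 'Alice works at Acme'),
    Edge('a', 'b', 'alice  works at acme'),
    Edge('a', 'c', 'Alice works at Acme'),
]
unique_edges = dedupe_exact(edges)
```

Since the pass is sequential, two identical edges can never be resolved in parallel without seeing each other.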

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* test: Add tests for edge deduplication fixes

Added three tests to verify the edge deduplication fixes:

1. test_dedupe_edges_bulk_deduplicates_within_episode: Verifies that
   dedupe_edges_bulk now compares edges from the same episode after
   removing the `if i == j: continue` check.

2. test_resolve_extracted_edge_uses_integer_indices_for_duplicates:
   Validates that the LLM receives integer indices for duplicate
   detection and correctly processes returned duplicate_facts.

3. test_resolve_extracted_edges_fast_path_deduplication: Confirms that
   the fast-path exact string matching deduplicates identical edges
   before parallel processing, preventing race conditions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Remove unused variables flagged by ruff

- Remove unused loop variable 'j' in bulk_utils.py
- Remove unused return value 'edges_by_episode' in test
- Replace unused 'edge_uuid' with '_' in test loop

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-01 07:30:30 -07:00
Daniel Chalef
f2c4c97362
Allow Edge extraction to keep discovered edge labels (#950)
* chore: Update dependencies and enhance edge resolution logic

- Add new dependencies: boto3, opensearch-py, and langchain-aws to pyproject.toml.
- Modify Graphiti class to handle additional parameters in edge resolution.
- Improve edge type handling in deduplication logic by introducing custom edge type names.
- Enhance tests for edge resolution to cover new scenarios and ensure correct behavior.

This update improves the flexibility and functionality of edge operations while ensuring compatibility with new libraries.

* refactor: Clean up test_edge_operations.py and format response returns

- Remove unnecessary stubs for opensearchpy module.
- Format return values in llm_client.generate_response for consistency.
- Enhance readability by ensuring proper indentation and structure in test cases.

This refactor improves the clarity and maintainability of the test suite for edge operations.

* bump version to 0.30.0pre5 and enhance docstring for resolve_extracted_edge function

- Update version in pyproject.toml to 0.30.0pre5.
- Add detailed docstring to resolve_extracted_edge function in edge_operations.py, clarifying parameters and return values.

This update improves documentation clarity for the edge resolution process.
2025-09-29 21:32:47 -07:00
Daniel Chalef
3fcd587276
fix: Add edge type validation based on node labels (#948)
* fix: Add edge type validation based on node labels

- Add DEFAULT_EDGE_NAME constant for 'RELATES_TO'
- Implement pre-resolution validation to reset invalid edge names
- Add post-resolution validation for LLM-returned fact types
- Rename parameter from edge_types to edge_type_candidates for clarity
- Add comprehensive tests for validation scenarios

This ensures edges conform to edge_type_map constraints and prevents
misclassification when edge types don't match node label pairs.
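
A minimal sketch of the validation idea, assuming `edge_type_map` is keyed by `(source label, target label)` pairs. The map contents and the `validate_edge_name` helper are hypothetical. `DEFAULT_EDGE_NAME` and its value come from the list above.

```python
DEFAULT_EDGE_NAME = 'RELATES_TO'

# Hypothetical map: (source label, target label) -> edge names allowed for that pair.
edge_type_map: dict[tuple[str, str], list[str]] = {
    ('Person', 'Company'): ['WORKS_AT'],
    ('Person', 'Person'): ['KNOWS'],
}


def validate_edge_name(
    name: str, source_labels: list[str], target_labels: list[str]
) -> str:
    """Keep name only if some (source, target) label pair allows it (sketch)."""
    for source_label in source_labels:
        for target_label in target_labels:
            if name in edge_type_map.get((source_label, target_label), []):
                return name
    # No label pair permits this name: reset to the generic default.
    return DEFAULT_EDGE_NAME


kept = validate_edge_name('WORKS_AT', ['Entity', 'Person'], ['Entity', 'Company'])
reset = validate_edge_name('WORKS_AT', ['Person'], ['Person'])
```

The same check can run both before resolution (on extracted names) and after it (on the fact type the LLM returns).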

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: Bump version to 0.30.0pre4

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-29 16:35:00 -07:00
Daniel Chalef
d7828d48d8
Fix index out of range errors in LLM deduplication responses (#939)
* add tests for llm dedupe guardrails

* document llm dedupe guardrails
2025-09-26 14:57:48 -07:00
Daniel Chalef
9aee3174bd
Refactor batch deduplication logic to enhance node resolution and track duplicate pairs (#929) (#936)
* Refactor deduplication logic to enhance node resolution and track duplicate pairs (#929)

* Simplify deduplication process in bulk_utils by reusing canonical nodes.
* Update dedup_helpers to store duplicate pairs during resolution.
* Modify node_operations to append duplicate pairs when resolving nodes.
* Add tests to verify deduplication behavior and ensure correct state updates.

* revert to concurrent dedup with fanout and then reconciliation

* add performance note for deduplication loop in bulk_utils

* enhance deduplication logic in bulk_utils to handle missing canonical nodes gracefully

* Update graphiti_core/utils/bulk_utils.py

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* refactor deduplication logic in bulk_utils to use directed union-find for canonical UUID resolution

* implement _build_directed_uuid_map for efficient UUID resolution in bulk_utils

* document directed union-find lookup in bulk_utils for clarity
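
For readers unfamiliar with the technique, a directed union-find over `(duplicate, canonical)` pairs might look like the sketch below. The function name echoes `_build_directed_uuid_map` from the commit list, but the signature and body are assumptions, not the code in `bulk_utils.py`.

```python
def build_directed_uuid_map(pairs: list[tuple[str, str]]) -> dict[str, str]:
    """Map every UUID to its canonical UUID, given (duplicate, canonical) pairs."""
    parent: dict[str, str] = {}

    def find(uuid: str) -> str:
        parent.setdefault(uuid, uuid)
        root = uuid
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the walk directly at the root.
        while parent[uuid] != root:
            parent[uuid], uuid = root, parent[uuid]
        return root

    for duplicate, canonical in pairs:
        source_root = find(duplicate)
        target_root = find(canonical)
        if source_root != target_root:
            # Directed: the duplicate's root always points at the canonical's root.
            parent[source_root] = target_root

    return {uuid: find(uuid) for uuid in list(parent)}


uuid_map = build_directed_uuid_map([('c', 'b'), ('b', 'a'), ('e', 'd')])
```

Unlike union by rank, the direction of each merge is fixed by the pair, so chains such as `c -> b -> a` collapse onto the canonical end.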

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
2025-09-26 08:40:18 -07:00
Daniel Chalef
7c469e8e2b
Improve node deduplication w/ deterministic matching, LLM fallbacks (#929)
* add repository guidelines and project structure documentation

* update neo4j image version and modify test command to disable specific databases

* implement deduplication helpers and integrate with node operations

* refactor string formatting to use single quotes in node operations

* enhance deduplication helpers with UUID indexing and update resolution logic

* implement exact fact matching (#931)
2025-09-25 07:13:19 -07:00
Preston Rasmussen
d6d4bbdeb7
don't save duplicate edges (#927)
* don't save duplicate edges

* remove build duplicate edges
2025-09-24 17:24:57 -04:00
Daniel Chalef
7cf5ee6288
Skip entity attribute extraction when no fields defined (#924)
2025-09-24 13:23:37 -04:00
Preston Rasmussen
da71d118db
Embedding fix (#917)
* embedding fix

* pre3

* fixed make format
2025-09-20 09:00:04 -04:00
Preston Rasmussen
3efe085a92
OpenSearch updates (#906)
* updates

* add uuid filter functionality

* update

* updates

* bump-version

* update

* fix typo

* use async function

* update unit tests

* update delete

* update deletion

* async update

* update

* update

* update

* update
2025-09-14 01:43:37 -04:00
Preston Rasmussen
0884cc00e5
OpenSearch Integration for Neo4j (#896)
* move aoss to driver

* add indexes

* don't save vectors to neo4j with aoss

* load embeddings from aoss

* add group_id routing

* add search filters and similarity search

* neptune regression update

* update neptune for regression purposes

* update index creation with aliasing

* regression tested

* update version

* edits

* claude suggestions

* cleanup

* updates

* add embedding dim env var

* use cosine sim

* updates

* updates

* remove unused imports

* update
2025-09-09 10:51:46 -04:00
Preston Rasmussen
1f5a1b890c
cleanup (#894)
* cleanup

* update

* remove unused imports
2025-09-05 11:30:46 -04:00
prestonrasmussen
29ba336189
remove parallel runtime and build dynamic indexes sequentially
2025-09-03 13:53:12 -04:00
Preston Rasmussen
da6f3336bb
update-tests (#872)
* update-tests

* unit test update

* update tests

* update tests

* update kuzu query

* update

* update query

* update args

* fix bulk episode add

* make handling better
2025-08-31 13:19:29 -04:00
Siddhartha Sahu
8802b7db13
Add support for Kuzu as the graph driver (#799)
* Fix FalkorDB tests

* Add support for graph memory using Kuzu

* Fix lints

* Fix queries

* Add tests

* Add comments

* Add more test coverage

* Add mocked tests

* Format

* Add mocked tests II

* Refactor community queries

* Add more mocked tests

* Refactor tests to always cleanup

* Add more mocked tests

* Update kuzu

* Refactor how filters are built

* Add more mocked tests

* Refactor and cleanup

* Fix tests

* Fix lints

* Refactor tests

* Disable neptune

* Fix

* Update kuzu version

* Update kuzu to latest release

* Fix filter

* Fix query

* Fix Neptune query

* Fix bulk queries

* Fix lints

* Fix deletes

* Comments and format

* Add Kuzu to the README

* Fix bulk queries

* Test all fields of nodes and edges

* Fix lints

* Update search_utils.py

---------

Co-authored-by: Preston Rasmussen <109292228+prasmussen15@users.noreply.github.com>
2025-08-27 11:45:21 -04:00
Preston Rasmussen
309159bccb
update migration (#870)
* update migration

* bump version

* close driver
2025-08-27 11:13:10 -04:00
bechbd
41c3da2440
Fixed issue where creating indices was not called for Neptune and added missing quickstart example (#850)
* Rebased Neptune changes based on significant rework done

* Updated the README documentation

* Fixed linting and formatting

* Update README.md

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Update graphiti_core/driver/neptune_driver.py

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Update README.md

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Addressed feedback from code review

* Updated the README documentation for clarity

* Updated the README and neptune_driver based on PR feedback

* Update node_db_queries.py

* bug: Fixed issue with missing call to create indices for Neptune and added quickstart example

* chore: added pyright to ignore the attribute not in GraphDriver

* Fixed quickstart with feedback from automated PR

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: Preston Rasmussen <109292228+prasmussen15@users.noreply.github.com>
2025-08-26 11:51:20 -04:00
Preston Rasmussen
0ac7ded4d1
use hnsw indexes (#859)
* use hnsw indexes

* add migration

* updates

* add group_id validation

* updates

* add type annotation

* updates

* update

* swap to prerelease
2025-08-25 12:31:35 -04:00
bechbd
ef56dc779a
Amazon Neptune Support (#793)
* Rebased Neptune changes based on significant rework done

* Updated the README documentation

* Fixed linting and formatting

* Update README.md

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Update graphiti_core/driver/neptune_driver.py

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Update README.md

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Addressed feedback from code review

* Updated the README documentation for clarity

* Updated the README and neptune_driver based on PR feedback

* Update node_db_queries.py

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: Preston Rasmussen <109292228+prasmussen15@users.noreply.github.com>
2025-08-20 10:56:03 -04:00
Preston Rasmussen
1c27a3563b
update prompts and support thinking models (#846)
* update prompts and support thinking models

* update

* type ignore
2025-08-19 12:31:50 -04:00
Gal Shubeli
1abb4b0fa3
Fix Community Operations with FalkorDB (#824)
* Update node_db_queries.py

* Update node_db_queries.py

* fix-community-operations

---------

Co-authored-by: Naseem Ali <34807727+Naseem77@users.noreply.github.com>
2025-08-18 10:38:24 -04:00
Preston Rasmussen
baa6825708
ensure ascii default to false (#817)
2025-08-08 11:20:02 -04:00
HUGO SON
ce9ef3ca79
Add support for non-ASCII characters in LLM prompts (#805)
* Add support for non-ASCII characters in LLM prompts

- Add ensure_ascii parameter to Graphiti class (default: True)
- Create to_prompt_json helper function for consistent JSON serialization
- Update all prompt files to use new helper function
- Preserve Korean/Japanese/Chinese characters when ensure_ascii=False
- Maintain backward compatibility with existing behavior

Fixes issue where non-ASCII characters were escaped as unicode sequences
in prompts, making them unreadable in LLM logs and potentially affecting
model understanding.
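
The escaping issue is easy to reproduce with the standard library alone. The sample payload below is made up for illustration.

```python
import json

entity = {'name': '서울', 'summary': '東京 and 北京 are also capitals.'}

# ensure_ascii defaults to True, so every non-ASCII character becomes \uXXXX.
escaped = json.dumps(entity)

# With ensure_ascii=False the original characters are kept as-is.
readable = json.dumps(entity, ensure_ascii=False)
```

Both strings decode to the same object. Only the readable form keeps the original characters visible in prompts and logs.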

* Remove unused json imports after replacing with to_prompt_json helper

- Fix ruff lint errors (F401) for unused json imports
- All prompt files now use to_prompt_json helper instead of json.dumps
- Maintains clean code style and passes lint checks

* Fix ensure_ascii propagation to all LLM calls

- Add ensure_ascii parameter to maintenance operation functions that were missing it
- Update function signatures in node_operations, community_operations, temporal_operations, and edge_operations
- Ensure all llm_client.generate_response calls receive proper ensure_ascii context
- Fix hardcoded ensure_ascii: True values that prevented non-ASCII character preservation
- Maintain backward compatibility with default ensure_ascii=True
- Complete the fix for issue #804 ensuring Korean/Japanese/Chinese characters are properly handled in LLM prompts
2025-08-08 11:07:32 -04:00
Preston Rasmussen
ab8106cb4f
move summary out of attribute extraction (#792)
* move summary out of attribute extraction

* linter

* linter

* fix db query
2025-07-31 12:15:21 -04:00
Preston Rasmussen
19bddb5528
validate pydantic objects (#783)
* validate pydantic objects

* unused imports

* linter
2025-07-29 17:54:09 -04:00
Daniel Chalef
dcc9da3f68
chore/prepare kuzu integration (#762)
* Prepare code

* Fix tests

* As -> AS, remove trailing spaces

* Enable more tests for FalkorDB

* Fix more cypher queries

* Return all created nodes and edges

* Add Neo4j service to unit tests workflow

- Introduced Neo4j as a service in the GitHub Actions workflow for unit tests.
- Configured Neo4j with appropriate ports, authentication, and health checks.
- Updated test steps to include waiting for Neo4j and running integration tests against it.
- Set environment variables for Neo4j connection in both non-integration and integration test steps.

* Update Neo4j authentication in unit tests workflow

- Changed Neo4j authentication password from 'test' to 'testpass' in the GitHub Actions workflow.
- Updated health check command to reflect the new password.
- Ensured consistency across all test steps that utilize Neo4j credentials.

* fix health check

* Fix Neo4j integration tests in CI workflow

Remove reference to non-existent test_neo4j_driver.py file from test command.
Integration tests now run via parametrized tests using the drivers list.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add OPENAI_API_KEY to Neo4j integration tests

Neo4j integration tests require OpenAI API access for LLM functionality.
Add the secret environment variable to enable these tests to run properly.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix Neo4j Cypher syntax error in BFS search queries

Replace parameter substitution in relationship pattern ranges (*1..$depth)
with direct string interpolation (*1..{bfs_max_depth}). Neo4j doesn't allow
parameter maps in MATCH pattern ranges - they must be literal values.

Fixed in both node_bfs_search and edge_bfs_search functions.
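
The fix above can be sketched roughly as follows: the depth bound is written into the query text as a literal, while ordinary values remain query parameters. The function name `build_bfs_query`, the `Entity` and `RELATES_TO` labels, and the parameter names are hypothetical and do not reproduce the repository's actual queries. Because the depth is interpolated into the string, this sketch validates it as a positive integer first.

```python
def build_bfs_query(bfs_max_depth: int) -> str:
    """Build a BFS Cypher query with a literal variable-length range.

    Hypothetical helper: the upper bound of the range (*1..N) is
    interpolated as a literal, since it cannot be passed as a parameter.
    """
    # Reject bools explicitly, since bool is a subclass of int in Python.
    if isinstance(bfs_max_depth, bool) or not isinstance(bfs_max_depth, int):
        raise ValueError("bfs_max_depth must be an integer")
    if bfs_max_depth < 1:
        raise ValueError("bfs_max_depth must be at least 1")

    return (
        "MATCH (origin {uuid: $origin_uuid})"
        f"-[:RELATES_TO*1..{bfs_max_depth}]->(n:Entity) "
        "RETURN DISTINCT n.uuid AS uuid "
        "LIMIT $limit"
    )


query = build_bfs_query(3)
```

Values such as `$origin_uuid` and `$limit` still go through normal parameter binding; only the range bound is inlined.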

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix variable name mismatch in edge_bfs_search query

Change relationship variable from 'r' to 'e' to match ENTITY_EDGE_RETURN
constant expectations. The ENTITY_EDGE_RETURN constant references variable
'e' for relationships, but the query was using 'r', causing "Variable e
not defined" errors.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Isolate database tests in CI workflow

- FalkorDB tests: Add DISABLE_NEO4J=1 and remove Neo4j env vars
- Neo4j tests: Keep current setup without DISABLE_NEO4J flag

This ensures proper test isolation where each test suite only runs
against its intended database backend.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Siddhartha Sahu <sid@kuzudb.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-07-29 09:07:34 -04:00
Preston Rasmussen
0ac2541b35
make edge_operations more robust (#737)
update
2025-07-16 17:12:20 -04:00
Preston Rasmussen
5d45d71259
Bulk updates (#732)
* updates

* update

* update

* typo

* linter
2025-07-16 02:26:33 -04:00
Preston Rasmussen
62df6624d4
bulk utils update (#727)
* bulk utils update

* remove unused imports

* edge model type guard
2025-07-15 11:42:08 -04:00
Daniel Chalef
aa6e38856a
[REFACTOR][FIX] Move away from DEFAULT_DATABASE environment variable in favour of driver-config support (dc) (#699)
* fix: remove global DEFAULT_DATABASE usage in favor of driver-specific
config

Fixes bugs introduced in PR #607. This removes reliance on the global
DEFAULT_DATABASE environment variable and instead specifies the database
within each driver. PR #607 introduced a Neo4j compatibility issue, as the
database names are different when attempting to support FalkorDB.

This refactor improves compatibility across database types and ensures
future reliability by isolating the configuration to the driver level.

* fix: make falkordb support optional

This ensures that the optional dependency and subsequent import are compliant with the graphiti-core project dependencies.

* chore: fmt code

* chore: undo changes to uv.lock

* fix: undo potentially breaking changes to driver interface

* fix: ensure a default database of "None" is provided - falling back to internal default

* chore: ensure default value exists for session and delete_all_indexes

* chore: fix typos and grammar

* chore: update package versions and dependencies in uv.lock and bulk_utils.py

* docs: update database configuration instructions for Neo4j and FalkorDB

Clarified default database names and how to override them in driver constructors. Updated testing requirements to include specific commands for running integration and unit tests.

* fix: ensure params defaults to an empty dictionary in Neo4jDriver

Updated the execute_query method to initialize params as an empty dictionary if not provided, ensuring compatibility with the database configuration.
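
One way to picture the defaulting described here, as a sketch rather than the driver's actual code: `prepare_params` is a hypothetical helper, and using the `database_` key to carry the driver-level database name is an assumption made for illustration.

```python
from typing import Any, Optional


def prepare_params(
    params: Optional[dict[str, Any]] = None,
    database: Optional[str] = None,
) -> dict[str, Any]:
    """Return a params dict that is never None (hypothetical helper).

    A None default is used instead of a mutable {} default so that
    separate calls never share the same dictionary.
    """
    # Copy so the caller's dictionary is not modified.
    merged = dict(params) if params is not None else {}
    # Assumption: the driver-level database is passed under "database_"
    # unless the caller already supplied one.
    if database is not None:
        merged.setdefault("database_", database)
    return merged


defaulted = prepare_params(None, database="neo4j")
overridden = prepare_params({"database_": "custom", "limit": 5}, database="neo4j")
```

An explicit value supplied by the caller takes precedence over the driver-level default in this sketch.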

---------

Co-authored-by: Urmzd <urmzd@dal.ca>
2025-07-10 17:25:39 -04:00
Preston Rasmussen
0675ac2b7d
Bulk ingestion (#698)
* partial

* update

* update

* update

* update

* updates

* updates

* update

* update
2025-07-10 12:14:49 -04:00
Preston Rasmussen
71360d91fc
reformat (#655)
2025-07-01 12:26:15 -04:00
Daniel Chalef
8213d10d44
migrate to pyright (#646)
* migrate to pyright

* Refactor type checking to use Pyright, update dependencies, and clean up code.

- Replaced MyPy with Pyright in configuration files and CI workflows.
- Updated `pyproject.toml` and `uv.lock` to reflect new dependencies and versions.
- Adjusted type hints and fixed minor code issues across various modules for better compatibility with Pyright.
- Added new packages `backoff` and `posthog` to the project dependencies.

* Update CI workflows to install all extra dependencies for type checking and unit tests

* Update dependencies in uv.lock to replace MyPy with Pyright and add nodeenv package. Adjust type hinting in config.py for compatibility with Pyright.
2025-06-30 12:04:21 -07:00