Commit graph

42 commits

Author SHA1 Message Date
Preston Rasmussen
604e3199a3
add search and graph operations interfaces (#984)
* add search and graph operations interfaces

* update

* update

* update

* update

* update

* update
2025-10-07 13:34:37 -04:00
Daniel Chalef
b28bd92c16
Remove ensure_ascii configuration parameter (#969)
* Remove ensure_ascii configuration parameter

- Changed to_prompt_json default from ensure_ascii=True to False
- Removed ensure_ascii parameter from Graphiti.__init__ and GraphitiClients
- Removed ensure_ascii from all function signatures and context dictionaries
- Removed ensure_ascii from all test files
- All JSON serialization now preserves Unicode characters by default

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* format

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-02 15:10:57 -07:00
Daniel Chalef
420676faf2
fix: Prevent duplicate edge facts within same episode (#955)
* fix: Prevent duplicate edge facts within same episode

This fixes three related bugs that allowed verbatim duplicate edge facts:

1. Fixed LLM deduplication: Changed related_edges_context to use integer
   indices instead of UUIDs, matching the EdgeDuplicate model expectations.

2. Fixed batch deduplication: Removed episode skip in dedupe_edges_bulk
   that prevented comparing edges from the same episode. Added self-comparison
   guard to prevent edge from comparing against itself.

3. Added fast-path deduplication: Added exact string matching before parallel
   processing in resolve_extracted_edges to catch within-episode duplicates
   early, preventing race conditions where concurrent edges can't see each other.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* test: Add tests for edge deduplication fixes

Added three tests to verify the edge deduplication fixes:

1. test_dedupe_edges_bulk_deduplicates_within_episode: Verifies that
   dedupe_edges_bulk now compares edges from the same episode after
   removing the `if i == j: continue` check.

2. test_resolve_extracted_edge_uses_integer_indices_for_duplicates:
   Validates that the LLM receives integer indices for duplicate
   detection and correctly processes returned duplicate_facts.

3. test_resolve_extracted_edges_fast_path_deduplication: Confirms that
   the fast-path exact string matching deduplicates identical edges
   before parallel processing, preventing race conditions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Remove unused variables flagged by ruff

- Remove unused loop variable 'j' in bulk_utils.py
- Remove unused return value 'edges_by_episode' in test
- Replace unused 'edge_uuid' with '_' in test loop

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-01 07:30:30 -07:00
Daniel Chalef
f2c4c97362
Allow Edge extraction to keep discovered edge labels (#950)
* chore: Update dependencies and enhance edge resolution logic

- Add new dependencies: boto3, opensearch-py, and langchain-aws to pyproject.toml.
- Modify Graphiti class to handle additional parameters in edge resolution.
- Improve edge type handling in deduplication logic by introducing custom edge type names.
- Enhance tests for edge resolution to cover new scenarios and ensure correct behavior.

This update improves the flexibility and functionality of edge operations while ensuring compatibility with new libraries.

* refactor: Clean up test_edge_operations.py and format response returns

- Remove unnecessary stubs for opensearchpy module.
- Format return values in llm_client.generate_response for consistency.
- Enhance readability by ensuring proper indentation and structure in test cases.

This refactor improves the clarity and maintainability of the test suite for edge operations.

* bump version to 0.30.0pre5 and enhance docstring for resolve_extracted_edge function

- Update version in pyproject.toml to 0.30.0pre5.
- Add detailed docstring to resolve_extracted_edge function in edge_operations.py, clarifying parameters and return values.

This update improves documentation clarity for the edge resolution process.
2025-09-29 21:32:47 -07:00
Daniel Chalef
9aee3174bd
Refactor batch deduplication logic to enhance node resolution and track duplicate pairs (#929) (#936)
* Refactor deduplication logic to enhance node resolution and track duplicate pairs (#929)

* Simplify deduplication process in bulk_utils by reusing canonical nodes.
* Update dedup_helpers to store duplicate pairs during resolution.
* Modify node_operations to append duplicate pairs when resolving nodes.
* Add tests to verify deduplication behavior and ensure correct state updates.

* reveret to concurrent dedup with fanout and then reconcilation

* add performance note for deduplication loop in bulk_utils

* enhance deduplication logic in bulk_utils to handle missing canonical nodes gracefully

* Update graphiti_core/utils/bulk_utils.py

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* refactor deduplication logic in bulk_utils to use directed union-find for canonical UUID resolution

* implement _build_directed_uuid_map for efficient UUID resolution in bulk_utils

* document directed union-find lookup in bulk_utils for clarity

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
2025-09-26 08:40:18 -07:00
Preston Rasmussen
da71d118db
Embedding fix (#917)
* embedding fix

* pre3

* fixedmake format
2025-09-20 09:00:04 -04:00
Preston Rasmussen
3efe085a92
OpenSearch updates (#906)
* updates

* add uuid filter functionality

* update

* updates

* bump-version

* update

* fix typo

* use async function

* update unit tests

* update delete

* update deletion

* async update

* update

* update

* update

* update
2025-09-14 01:43:37 -04:00
Preston Rasmussen
0884cc00e5
OpenSearch Integration for Neo4j (#896)
* move aoss to driver

* add indexes

* don't save vectors to neo4j with aoss

* load embeddings from aoss

* add group_id routing

* add search filters and similarity search

* neptune regression update

* update neptune for regression purposes

* update index creation with aliasing

* regression tested

* update version

* edits

* claude suggestions

* cleanup

* updates

* add embedding dim env var

* use cosine sim

* updates

* updates

* remove unused imports

* update
2025-09-09 10:51:46 -04:00
Preston Rasmussen
1f5a1b890c
cleanup (#894)
* cleanup

* update

* remove unused imports
2025-09-05 11:30:46 -04:00
Preston Rasmussen
da6f3336bb
update-tests (#872)
* update-tests

* unit test update

* update tests

* update tests

* update kuzu query

* update

* update query

* update args

* fix bulk episode add

* make handling better
2025-08-31 13:19:29 -04:00
Siddhartha Sahu
8802b7db13
Add support for Kuzu as the graph driver (#799)
* Fix FalkoDB tests

* Add support for graph memory using Kuzu

* Fix lints

* Fix queries

* Add tests

* Add comments

* Add more test coverage

* Add mocked tests

* Format

* Add mocked tests II

* Refactor community queries

* Add more mocked tests

* Refactor tests to always cleanup

* Add more mocked tests

* Update kuzu

* Refactor how filters are built

* Add more mocked tests

* Refactor and cleanup

* Fix tests

* Fix lints

* Refactor tests

* Disable neptune

* Fix

* Update kuzu version

* Update kuzu to latest release

* Fix filter

* Fix query

* Fix Neptune query

* Fix bulk queries

* Fix lints

* Fix deletes

* Comments and format

* Add Kuzu to the README

* Fix bulk queries

* Test all fields of nodes and edges

* Fix lints

* Update search_utils.py

---------

Co-authored-by: Preston Rasmussen <109292228+prasmussen15@users.noreply.github.com>
2025-08-27 11:45:21 -04:00
Preston Rasmussen
0ac7ded4d1
use hnsw indexes (#859)
* use hnsw indexes

* add migration

* updates

* add group_id validation

* updates

* add type annotation

* updates

* update

* swap to prerelease
2025-08-25 12:31:35 -04:00
bechbd
ef56dc779a
Amazon Neptune Support (#793)
* Rebased Neptune changes based on significant rework done

* Updated the README documentation

* Fixed linting and formatting

* Update README.md

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Update graphiti_core/driver/neptune_driver.py

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Update README.md

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Addressed feedback from code review

* Updated the README documentation for clarity

* Updated the README and neptune_driver based on PR feedback

* Update node_db_queries.py

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: Preston Rasmussen <109292228+prasmussen15@users.noreply.github.com>
2025-08-20 10:56:03 -04:00
HUGO SON
ce9ef3ca79
Add support for non-ASCII characters in LLM prompts (#805)
* Add support for non-ASCII characters in LLM prompts

- Add ensure_ascii parameter to Graphiti class (default: True)
- Create to_prompt_json helper function for consistent JSON serialization
- Update all prompt files to use new helper function
- Preserve Korean/Japanese/Chinese characters when ensure_ascii=False
- Maintain backward compatibility with existing behavior

Fixes issue where non-ASCII characters were escaped as unicode sequences
in prompts, making them unreadable in LLM logs and potentially affecting
model understanding.

* Remove unused json imports after replacing with to_prompt_json helper

- Fix ruff lint errors (F401) for unused json imports
- All prompt files now use to_prompt_json helper instead of json.dumps
- Maintains clean code style and passes lint checks

* Fix ensure_ascii propagation to all LLM calls

- Add ensure_ascii parameter to maintenance operation functions that were missing it
- Update function signatures in node_operations, community_operations, temporal_operations, and edge_operations
- Ensure all llm_client.generate_response calls receive proper ensure_ascii context
- Fix hardcoded ensure_ascii: True values that prevented non-ASCII character preservation
- Maintain backward compatibility with default ensure_ascii=True
- Complete the fix for issue #804 ensuring Korean/Japanese/Chinese characters are properly handled in LLM prompts
2025-08-08 11:07:32 -04:00
Preston Rasmussen
ab8106cb4f
move summary out of attribute extraction (#792)
* move summary out of attribute extraction

* linter

* linter

* fix db query
2025-07-31 12:15:21 -04:00
Daniel Chalef
dcc9da3f68
chore/prepare kuzu integration (#762)
* Prepare code

* Fix tests

* As -> AS, remove trailing spaces

* Enable more tests for FalkorDB

* Fix more cypher queries

* Return all created nodes and edges

* Add Neo4j service to unit tests workflow

- Introduced Neo4j as a service in the GitHub Actions workflow for unit tests.
- Configured Neo4j with appropriate ports, authentication, and health checks.
- Updated test steps to include waiting for Neo4j and running integration tests against it.
- Set environment variables for Neo4j connection in both non-integration and integration test steps.

* Update Neo4j authentication in unit tests workflow

- Changed Neo4j authentication password from 'test' to 'testpass' in the GitHub Actions workflow.
- Updated health check command to reflect the new password.
- Ensured consistency across all test steps that utilize Neo4j credentials.

* fix health check

* Fix Neo4j integration tests in CI workflow

Remove reference to non-existent test_neo4j_driver.py file from test command.
Integration tests now run via parametrized tests using the drivers list.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add OPENAI_API_KEY to Neo4j integration tests

Neo4j integration tests require OpenAI API access for LLM functionality.
Add the secret environment variable to enable these tests to run properly.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix Neo4j Cypher syntax error in BFS search queries

Replace parameter substitution in relationship pattern ranges (*1..$depth)
with direct string interpolation (*1..{bfs_max_depth}). Neo4j doesn't allow
parameter maps in MATCH pattern ranges - they must be literal values.

Fixed in both node_bfs_search and edge_bfs_search functions.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix variable name mismatch in edge_bfs_search query

Change relationship variable from 'r' to 'e' to match ENTITY_EDGE_RETURN
constant expectations. The ENTITY_EDGE_RETURN constant references variable
'e' for relationships, but the query was using 'r', causing "Variable e
not defined" errors.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Isolate database tests in CI workflow

- FalkorDB tests: Add DISABLE_NEO4J=1 and remove Neo4j env vars
- Neo4j tests: Keep current setup without DISABLE_NEO4J flag

This ensures proper test isolation where each test suite only runs
against its intended database backend.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Siddhartha Sahu <sid@kuzudb.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-07-29 09:07:34 -04:00
Preston Rasmussen
5d45d71259
Bulk updates (#732)
* updates

* update

* update

* typo

* linter
2025-07-16 02:26:33 -04:00
Preston Rasmussen
62df6624d4
bulk utils update (#727)
* bulk utils update

* remove unused imports

* edge model type guard
2025-07-15 11:42:08 -04:00
Daniel Chalef
aa6e38856a
[REFACTOR][FIX] Move away from DEFAULT_DATABASE environment variable in favour of driver-config support (dc) (#699)
* fix: remove global DEFAULT_DATABASE usage in favor of driver-specific
config

Fixes bugs introduced in PR #607. This removes reliance on the global
DEFAULT_DATABASE environment variable. It specifies the database within
each driver. PR #607 introduced a Neo4j compatability, as the database
names are different when attempting to support FalkorDB.

This refactor improves compatability across database types and ensures
future reliance by isolating the configuraiton to the driver level.

* fix: make falkordb support optional

This ensures that the the optional dependency and subsequent import is compliant with the graphiti-core project dependencies.

* chore: fmt code

* chore: undo changes to uv.lock

* fix: undo potentially breaking changes to drive interface

* fix: ensure a default database of "None" is provided - falling back to internal default

* chore: ensure default value exists for session and delete_all_indexes

* chore: fix typos and grammar

* chore: update package versions and dependencies in uv.lock and bulk_utils.py

* docs: update database configuration instructions for Neo4j and FalkorDB

Clarified default database names and how to override them in driver constructors. Updated testing requirements to include specific commands for running integration and unit tests.

* fix: ensure params defaults to an empty dictionary in Neo4jDriver

Updated the execute_query method to initialize params as an empty dictionary if not provided, ensuring compatibility with the database configuration.

---------

Co-authored-by: Urmzd <urmzd@dal.ca>
2025-07-10 17:25:39 -04:00
Preston Rasmussen
0675ac2b7d
Bulk ingestion (#698)
* partial

* update

* update

* update

* update

* updates

* updates

* update

* update
2025-07-10 12:14:49 -04:00
Daniel Chalef
c29893d972
Excluded entity type filtering (#624)
* excluded entities filtering

* Fix variable name casing in test_entity_exclusion_int.py for consistency
2025-06-26 20:54:43 -07:00
Preston Rasmussen
2b0bc21b21
be more explicit about edge type signatures (#600)
* be more explicit about edge type signatures

* bump version

* update
2025-06-18 16:01:00 -04:00
Preston Rasmussen
14146dc46f
Add support for falkordb (#575)
* [wip] add support for falkordb

* updates

* fix-async

* progress

* fix-issues

* rm-date-handler

* red-code

* rm-uns-try

* fix-exm

* rm-un-lines

* fix-comments

* fix-se-utils

* fix-falkor-readme

* fix-falkor-cosine-score

* update-falkor-ver

* fix-vec-sim

* min-updates

* make format

* update graph driver abstraction

* poetry lock

* updates

* linter

* Update graphiti_core/search/search_utils.py

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

---------

Co-authored-by: Dudi Zimberknopf <zimber.dudi@gmail.com>
Co-authored-by: Gal Shubeli <galshubeli93@gmail.com>
Co-authored-by: Gal Shubeli <124919062+galshubeli@users.noreply.github.com>
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
2025-06-13 12:06:57 -04:00
Preston Rasmussen
db7595fe63
Edge types (#501)
* update entity edge attributes

* Adding prompts

* extract fact attributes

* edge types

* edge types no regressions

* mypy

* mypy update

* Update graphiti_core/prompts/dedupe_edges.py

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Update graphiti_core/prompts/dedupe_edges.py

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* mypy

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
2025-05-19 13:30:56 -04:00
Preston Rasmussen
e75feff45e
pre4 (#462)
* pre4

* update

* update
2025-05-08 18:25:22 -04:00
Preston Rasmussen
89c4ee8cad
make bulk save more robust (#461)
* make bulk save more robust

* updates
2025-05-08 15:34:13 -04:00
Preston Rasmussen
a26b25dc06
Add episode refactor (#399)
* partial refactor

* get relevant nodes refactor

* load edges updates

* refactor triplets

* not there yet

* node search update

* working refactor

* updates

* mypy

* mypy
2025-04-26 00:24:23 -04:00
neonconsultingllc
d0b1b2e5db
Fix for using non default neo4j database (#329)
Pass database_ correctly to driver.session to fix using non default database
2025-04-18 13:06:31 -04:00
Preston Rasmussen
088029a80c
node label filters (#265)
* node label filters

* update

* add search filters

* updates

* bump versions

* update tests

* test update
2025-02-21 12:38:01 -05:00
Preston Rasmussen
29a071b2b8
Custom ontology (#262)
* ontology

* extract and save node labels

* extract entity type properties

* neo4j upgrade needed

* add entity types

* update typing

* update types

* updates

* Update graphiti_core/utils/maintenance/node_operations.py

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* fix warning

* mypy updates

* update properties

* mypy ignore

* mypy types

* bump version

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
2025-02-13 12:17:52 -05:00
Preston Rasmussen
00fe87679e
Bounded semaphore - limiting concurrency (#244)
* WIP

* add semaphore

* remove unused imports

* remove unused imports

* lower concurrency limit
2024-12-17 13:08:18 -05:00
Daniel Chalef
445dccc021
refactor: use utc_now() for consistent UTC datetime handling (#234)
* ensure utc timezones

* fix: dep cycle

---------

Co-authored-by: paulpaliychuk <pavlo.paliychuk.ca@gmail.com>
2024-12-09 10:36:04 -08:00
Preston Rasmussen
281fe072cb
add fulltext search limit (#215)
* add fulltext search limit

* format

* update

* update

* update tests

* remove unused imports

* format

* mypy
2024-11-14 12:18:18 -05:00
Preston Rasmussen
3199e893ed
add_fact endpoint (#207)
* add_fact endpoint

* bump version

* add edge invalidation

* update
2024-11-06 09:12:21 -05:00
Preston Rasmussen
b8f52670ce
Bulk add nodes and edges (#205)
* test

* only use parallel runtime if set to true

* add and test bulk add

* remove group_ids

* format

* bump version

* update readme
2024-10-31 12:31:37 -04:00
Preston Rasmussen
42fb590606
Add group ids (#89)
* set and retrieve group ids

* update add episode with group id support

* add episode and search functional

* update bulk

* mypy updates

* remove unused imports

* update unit tests

* unit tests

* add optional uuid field

* format

* mypy

* ellipsis
2024-09-06 12:33:42 -04:00
Preston Rasmussen
e56a599a72
search update (#81)
* search update

* update string literals
2024-09-04 10:05:45 -04:00
Preston Rasmussen
e9e6039b1e
Speed up add episode (#77)
* WIP

* updates

* use uuid for node dedupe

* pret-testing

* parallelized node resolution

* working add_episode

* revert to 4o

* format

* mypy update

* update types
2024-09-03 13:25:52 -04:00
Daniel Chalef
77685b063c
Feat/langgraph-example (#73)
* wip

* wip

* image + clean run

* chore: Update LANGCHAIN_TRACING_V2 to 'false' in agent.ipynb

* chore: Remove unused import in runner.ipynb

* lock file
2024-09-01 12:31:08 -07:00
Preston Rasmussen
35a4e5172b
add bulk temporal extraction and improve bulk quality and performance (#67)
* parallelize edge deduping more

* parallelize node insertion more

* improve bulk behavior performance

* dedupe nodes actually works

* add a reranker to search

* bulk dedupe episodes only across the same nodes

* add temporal extraction bulk function

* cleaned up bulk

* default to 4o

* format

* mypy

* mympy

* mypy ignore
2024-08-30 10:48:28 -04:00
Pavlo Paliychuk
0ed7739bc0
Controlled example (#37)
* chore: Add romeo runner

* fix: Linter

* dedupe fixes

* wip

* wip dump

* allbirds

* chore: Update romeo parser

* chore: Anthropic model fix

* allbirds runner

* format

* wip

* mypy updates

* update

* remove r

* update tests

* format

* wip

* wip

* wip

* chore: Strategically update the message

* chore: Add romeo runner

* fix: Linter

* wip

* wip dump

* chore: Update romeo parser

* chore: Anthropic model fix

* wip

* allbirds

* allbirds runner

* format

* wip

* wip

* mypy updates

* update

* remove r

* update tests

* format

* wip

* chore: Strategically update the message

* rebase and fix import issues

* Update package imports for graphiti_core in examples and utils

* nits

* chore: Update OpenAI GPT-4o model to gpt-4o-2024-08-06

* implement groq

* improvments & linting

* cleanup and nits

* Refactor package imports for graphiti_core in examples and utils

* Refactor package imports for graphiti_core in examples and utils

* chore: Nuke unused examples

* chore: Nuke unused examples

* chore: Only run type check on graphiti_core

* fix unit tests

* reformat

* unit test

* fix: Unit tests

* test: Add coverage for extract_date_strings_from_edge

* lint

* remove commented code

---------

Co-authored-by: prestonrasmussen <prasmuss15@gmail.com>
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
2024-08-26 10:30:22 -04:00
Daniel Chalef
c5e52153c4
chore: Fix packaging (#38)
* feat: Update project name and description

The project name and description in the `pyproject.toml` file have been updated to reflect the changes made to the project.

* chore: Update pyproject.toml to include core package

The `pyproject.toml` file has been updated to include the `core` package in the list of packages. This change ensures that the `core` package is included when building the project.

* fix imports

* fix importats
2024-08-25 10:07:50 -07:00
Renamed from core/utils/bulk_utils.py (Browse further)