graphiti

Author	SHA1	Message	Date
Daniel Chalef	1a6db24600	Final MMR optimization focused on 1024D vectors with smart dimensionality dispatch This commit delivers a production-ready MMR optimization specifically tailored for Graphiti's primary use case while handling high-dimensional vectors appropriately. ## Performance Improvements for 1024D Vectors - Average 1.16x speedup (13.6% reduction in search latency) - Best performance: 1.31x speedup for 25 candidates (23.5% faster) - Sub-millisecond latency: 0.266ms for 10 candidates, 0.662ms for 25 candidates - Scalable performance: Maintains improvements up to 100 candidates ## Smart Algorithm Dispatch - 1024D vectors: Uses optimized precomputed similarity matrix approach - High-dimensional vectors (≥2048D): Falls back to original algorithm to avoid overhead - Adaptive thresholds: Considers both dataset size and dimensionality for optimal performance ## Key Optimizations for Primary Use Case 1. Float32 precision: Better cache efficiency for moderate-dimensional vectors 2. Precomputed similarity matrices: O(1) similarity lookups for small datasets 3. Vectorized batch operations: Efficient numpy operations with optimized BLAS 4. Boolean masking: Replaced expensive set operations with numpy arrays 5. Smart memory management: Optimal layouts for CPU cache utilization ## Technical Implementation - Memory efficient: All test cases fit in CPU cache (max 0.43MB for 100×1024D) - Cache-conscious: Contiguous float32 arrays improve memory bandwidth - BLAS optimized: Matrix multiplication leverages hardware acceleration - Correctness maintained: All existing tests pass with identical results ## Production Impact - Real-time search: Sub-millisecond performance for typical scenarios - Scalable: Performance improvements across all tested dataset sizes - Robust: Handles edge cases and high-dimensional vectors gracefully - Backward compatible: Drop-in replacement with identical API This optimization transforms MMR from a potential bottleneck into a highly efficient operation for Graphiti's search pipeline, providing significant performance gains for the most common use case (1024D vectors) while maintaining robustness for all scenarios. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-07-18 12:28:50 -07:00
Daniel Chalef	166c67492a	Optimize MMR calculation with vectorized numpy operations This commit implements a comprehensive optimization of the Maximal Marginal Relevance (MMR) calculation in the search utilities. The key improvements include: ## Algorithm Improvements - True MMR Implementation: Replaced the previous diversity-aware scoring with proper iterative MMR algorithm that greedily selects documents one at a time - Vectorized Operations: Leveraged numpy's optimized BLAS operations through matrix multiplication instead of individual dot products - Adaptive Strategy: Uses different optimization strategies for small (≤100) and large datasets to balance performance and memory usage ## Performance Optimizations - Memory Efficiency: Reduced memory complexity from O(n²) to O(n) for large datasets - BLAS Optimization: Proper use of matrix multiplication leverages optimized BLAS libraries - Batch Normalization: Added `normalize_embeddings_batch()` for efficient L2 normalization of multiple embeddings at once - Early Termination: Stops selection when no candidates meet minimum score threshold ## Key Changes - `maximal_marginal_relevance()`: Complete rewrite with proper iterative MMR algorithm - `normalize_embeddings_batch()`: New function for efficient batch normalization - `_mmr_small_dataset()`: Optimized implementation for small datasets using precomputed similarity matrices - Added comprehensive test suite with 9 test cases covering edge cases, correctness, and performance scenarios ## Benefits - Correctness: Now implements true MMR algorithm instead of approximate diversity scoring - Memory Usage: O(n) memory complexity vs O(n²) for the original implementation - Scalability: Better performance characteristics for large datasets - Maintainability: Cleaner, more readable code with comprehensive test coverage 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-07-18 11:54:15 -07:00
Gal Shubeli	35e0692328	[Bug Fix] Fix the Group ID usage with FalkorDB (#733 ) * groupid-none * groupid-def-fulltext * lint * Update graphiti_core/helpers.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-07-17 12:35:08 -04:00
Preston Rasmussen	deda803dc5	update search filters (#706 ) * update search filters * toml	2025-07-11 10:53:15 -04:00
Daniel Chalef	aa6e38856a	[REFACTOR][FIX] Move away from DEFAULT_DATABASE environment variable in favour of driver-config support (dc) (#699 ) * fix: remove global DEFAULT_DATABASE usage in favor of driver-specific config Fixes bugs introduced in PR #607. This removes reliance on the global DEFAULT_DATABASE environment variable. It specifies the database within each driver. PR #607 introduced a Neo4j compatability, as the database names are different when attempting to support FalkorDB. This refactor improves compatability across database types and ensures future reliance by isolating the configuraiton to the driver level. * fix: make falkordb support optional This ensures that the the optional dependency and subsequent import is compliant with the graphiti-core project dependencies. * chore: fmt code * chore: undo changes to uv.lock * fix: undo potentially breaking changes to drive interface * fix: ensure a default database of "None" is provided - falling back to internal default * chore: ensure default value exists for session and delete_all_indexes * chore: fix typos and grammar * chore: update package versions and dependencies in uv.lock and bulk_utils.py * docs: update database configuration instructions for Neo4j and FalkorDB Clarified default database names and how to override them in driver constructors. Updated testing requirements to include specific commands for running integration and unit tests. * fix: ensure params defaults to an empty dictionary in Neo4jDriver Updated the execute_query method to initialize params as an empty dictionary if not provided, ensuring compatibility with the database configuration. --------- Co-authored-by: Urmzd <urmzd@dal.ca>	2025-07-10 17:25:39 -04:00
James.	7ce07942b1	Fix: Add missing name_embedding field to community search queries (#664 ) Enhanced queries in search_utils.py to include 'name_embedding' field in community full-text and similarity search functions.	2025-07-02 11:45:25 -04:00
Daniel Chalef	8213d10d44	migrate to pyright (#646 ) * migrate to pyright * Refactor type checking to use Pyright, update dependencies, and clean up code. - Replaced MyPy with Pyright in configuration files and CI workflows. - Updated `pyproject.toml` and `uv.lock` to reflect new dependencies and versions. - Adjusted type hints and fixed minor code issues across various modules for better compatibility with Pyright. - Added new packages `backoff` and `posthog` to the project dependencies. * Update CI workflows to install all extra dependencies for type checking and unit tests * Update dependencies in uv.lock to replace MyPy with Pyright and add nodeenv package. Adjust type hinting in config.py for compatibility with Pyright.	2025-06-30 12:04:21 -07:00
Gal Shubeli	6e6115c134	FalkorDB Integration: Bug Fixes and Unit Tests (#607 ) * fixes-and-tests * update-workflow * lint-fixes * mypy-fixes * fix-falkor-tests * Update poetry.lock after pyproject.toml changes * update-yml * fix-tests * comp-tests * typo * fix-tests --------- Co-authored-by: Guy Korland <gkorland@gmail.com>	2025-06-30 11:01:44 -04:00
Daniel Chalef	7537f0c972	fix: correct spacing in group IDs filter concatenation in fulltext_query function (#636 )	2025-06-27 14:09:01 -07:00
Preston Rasmussen	97593550a9	fix fulltext query (#626 ) * fix fulltext query * updates	2025-06-25 18:09:56 -04:00
Preston Rasmussen	19fde653a6	update driver (#583 ) * update driver * mypy updates * mypy updates * mypy updates * Update graphiti_core/graph_queries.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * mypy updates * mypy * mypy updates * mypy updates * mypy updates * mypy updates --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-06-13 14:12:09 -04:00
Preston Rasmussen	14146dc46f	Add support for falkordb (#575 ) * [wip] add support for falkordb * updates * fix-async * progress * fix-issues * rm-date-handler * red-code * rm-uns-try * fix-exm * rm-un-lines * fix-comments * fix-se-utils * fix-falkor-readme * fix-falkor-cosine-score * update-falkor-ver * fix-vec-sim * min-updates * make format * update graph driver abstraction * poetry lock * updates * linter * Update graphiti_core/search/search_utils.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> --------- Co-authored-by: Dudi Zimberknopf <zimber.dudi@gmail.com> Co-authored-by: Gal Shubeli <galshubeli93@gmail.com> Co-authored-by: Gal Shubeli <124919062+galshubeli@users.noreply.github.com> Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-06-13 12:06:57 -04:00
TheEpTic	735b020624	BUG FIX: Fix trailing AND in edge_search_filter_query_constructor Cypher query (#541 ) Fix trailing AND in edge_search_filter_query_constructor Cypher query Corrected the edge_search_filter_query_constructor function to prevent trailing AND operators in generated Cypher queries, which caused Neo.ClientError.Statement.SyntaxError. Changed condition from `j != len(and_filter_query) - 1` to `j != len(and_filters) - 1` for valid_at, invalid_at, created_at, and expired_at filter blocks. Also fixed outer loop condition to use `len(filters.<field>)` instead of `len(or_list)`. Ensures valid Cypher syntax for single DateFilter cases. Co-authored-by: TheEpTic <326774+TheEpTic@users.noreply.github.com>	2025-05-29 12:55:16 -04:00
Preston Rasmussen	1eea232ef1	remove sanitize (#540 ) * remove sanitize * format	2025-05-28 19:34:44 -04:00
Preston Rasmussen	5fe2f588a6	Edge type search (#537 ) * add filters * search filter * Update graphiti_core/search/search_utils.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-05-27 13:16:28 -04:00
Preston Rasmussen	db7595fe63	Edge types (#501 ) * update entity edge attributes * Adding prompts * extract fact attributes * edge types * edge types no regressions * mypy * mypy update * Update graphiti_core/prompts/dedupe_edges.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * Update graphiti_core/prompts/dedupe_edges.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * mypy --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-05-19 13:30:56 -04:00
Preston Rasmussen	9baa9b7b8a	Mmr optimizations (#481 ) * update mmr calculations * update search * fixes and updates * mypy	2025-05-12 22:30:23 -04:00
Preston Rasmussen	4198483993	improve memory leak (#478 )	2025-05-12 16:32:27 -04:00
Preston Rasmussen	1f2f1eeab5	Size optimizations (#456 ) * memory optimizations for vectors * debugged * unused import * Update graphiti_core/edges.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-05-07 20:08:30 -04:00
Preston Rasmussen	8b19771d86	search update (#426 )	2025-04-30 18:25:43 -04:00
Preston Rasmussen	50b3df03c4	Lucene sanitize (#423 ) * lucene sanitize * bump version	2025-04-30 15:00:29 -04:00
Preston Rasmussen	1193b25fa3	`add_episode()` refactor (#421 ) * temporal updates * update resolve nodes * dedupe edge updates * edge dedupe * extract attributes * update dynamic pydantic model * first pass of extract node attributes * no errors * bug fixes * bug fixes * prompt updates * prompt updates * updates * updates * remove unused imports * update tests based on changes * remove unused import	2025-04-30 12:08:52 -04:00
Preston Rasmussen	a26b25dc06	Add episode refactor (#399 ) * partial refactor * get relevant nodes refactor * load edges updates * refactor triplets * not there yet * node search update * working refactor * updates * mypy * mypy	2025-04-26 00:24:23 -04:00
Preston Rasmussen	432d2295c6	Revert episodes (#387 ) * episode search fixes and optimizations * remove extra return string * Update graphiti_core/utils/maintenance/graph_data_operations.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-04-22 12:03:09 -04:00
Preston Rasmussen	009467650f	Node episodes list (#381 ) * added episode list virtual field * in progress tests * add tests * update search return type * linter * copyright notice * mark integration tests	2025-04-20 23:20:19 -04:00
Preston Rasmussen	e73aaf8171	mmr update (#369 ) * mmr update * bump version * format	2025-04-17 10:14:50 -04:00
Preston Rasmussen	45b15a06f2	add episode scope to search (#362 ) * add episode scope to search * bump version * linter * Update graphiti_core/search/search_helpers.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * mypy --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-04-15 19:27:56 -04:00
Preston Rasmussen	11e19a35b7	add reranker_min_score (#355 ) * add reranker_min_score * update divide by 0 case * center node always gets a score of .1 * linter	2025-04-15 12:33:37 -04:00
Preston Rasmussen	6aa25a1901	update context string (#346 ) * update context string * Update graphiti_core/search/search_helpers.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * remove unused imports * bump version --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-04-10 06:57:58 -04:00
Preston Rasmussen	502b6da1c7	Add search_ and deprecate _search (#342 ) * add search_ and deprecate _search. Add formatting helper * add search helpers file * move SearchResults * Update graphiti_core/search/search_helpers.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * remove unused imports --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-04-09 15:59:21 -04:00
Preston Rasmussen	7f20b21572	Entity attributes in prompts (#284 ) * add node attributes to prompts * tested * attribute update	2025-03-04 16:34:19 -05:00
Preston Rasmussen	1d2417ec26	Search optimizations (#280 ) fix node distance search	2025-02-27 11:51:10 -05:00
Preston Rasmussen	9efa6762d7	entity typo (#274 )	2025-02-24 12:44:17 -05:00
Preston Rasmussen	088029a80c	node label filters (#265 ) * node label filters * update * add search filters * updates * bump versions * update tests * test update	2025-02-21 12:38:01 -05:00
Preston Rasmussen	29a071b2b8	Custom ontology (#262 ) * ontology * extract and save node labels * extract entity type properties * neo4j upgrade needed * add entity types * update typing * update types * updates * Update graphiti_core/utils/maintenance/node_operations.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * fix warning * mypy updates * update properties * mypy ignore * mypy types * bump version --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-02-13 12:17:52 -05:00
Preston Rasmussen	6ef2f5e097	Date filters (#240 ) * add search filters * add search filters * mypy * mypy * update filtering * date-filters * update * update filter queries * update dictionary	2025-01-28 11:52:53 -05:00
Preston Rasmussen	00fe87679e	Bounded semaphore - limiting concurrency (#244 ) * WIP * add semaphore * remove unused imports * remove unused imports * lower concurrency limit	2024-12-17 13:08:18 -05:00
Preston Rasmussen	34496ffa6a	Abstract Neo4j filters in search queries (#243 ) * move null check for search queries to python * update search filtering * update * update	2024-12-16 21:45:45 -05:00
Preston Rasmussen	6a152ab91a	fix node distance reranker (#231 )	2024-12-06 12:08:54 -05:00
Preston Rasmussen	0fbe5c0704	Pagination for get by group_id (#218 ) * add pagination to subgraphs * update pagination * update LiteralString import * cleanup * cleanup * update embedding dims	2024-12-02 11:17:37 -05:00
Preston Rasmussen	52c590878a	Update edge search (#216 ) * update edge fulltext search * bump version	2024-11-15 14:32:11 -05:00
Preston Rasmussen	281fe072cb	add fulltext search limit (#215 ) * add fulltext search limit * format * update * update * update tests * remove unused imports * format * mypy	2024-11-14 12:18:18 -05:00
Preston Rasmussen	eba9f40ca2	add reflexion (#212 ) * add reflexion * clean up boolean logic * update conditional * cap reflexion iterations * don't do an extra reflection step	2024-11-13 11:58:56 -05:00
Preston Rasmussen	857a8f61cf	add search recipes (#210 )	2024-11-06 14:59:17 -05:00
Preston Rasmussen	6536401c8c	return no results with empty search string (#206 ) * return no results with empty search string * update * bump version	2024-11-04 10:50:49 -05:00
Preston Rasmussen	b8f52670ce	Bulk add nodes and edges (#205 ) * test * only use parallel runtime if set to true * add and test bulk add * remove group_ids * format * bump version * update readme	2024-10-31 12:31:37 -04:00
Preston Rasmussen	63a1b11142	update new names with input_data (#204 )	2024-10-29 11:03:31 -04:00
Preston Rasmussen	7bb0c78d5d	Update reranker limits (#203 ) * update reranker limits * update versions * format * update names * fix: voyage linter --------- Co-authored-by: paulpaliychuk <pavlo.paliychuk.ca@gmail.com>	2024-10-28 14:50:16 -04:00
Preston Rasmussen	ceb60a3d33	Cross encoder reranker in search query (#202 ) * cross encoder reranker * update reranker * add openai reranker * format * mypy * update * updates * MyPy typing * bump version	2024-10-25 12:29:27 -04:00
Pavlo Paliychuk	544f9e3fba	chore: Set up cross encoder client (#201 ) * chore: Set up cross encoder client * fix: deps * chore: move voyage to dev deps	2024-10-24 11:36:10 -04:00

1 2

81 commits