Commit graph

303 commits

Author SHA1 Message Date
Preston Rasmussen
49aeaf75f2
Add mmr reranking (#180)
* mmr start

* add mmr function

* normalize

* add mmr options to search

* update communities

* build communities

* format

* clean up normalization

* normalize in mmr

* update
2024-10-08 13:55:10 -04:00
Preston Rasmussen
e15c872900
Fix edge invalidation (#174)
* update edge operations

* add new tests
2024-10-07 11:45:31 -04:00
Preston Rasmussen
377225eec5
add addepisode return object (#172)
* add addepisode return

* format
2024-10-03 15:39:57 -04:00
Preston Rasmussen
c8ff5be8ce
Msc benchmark update (#173)
* eval update

* I sped it up

* make format

* search updates

* updates

* cleanup

* make format

* remove unused imports

* poetry lock
2024-10-03 15:39:35 -04:00
Preston Rasmussen
ec2e51c5ec
test escape characters (#171)
* test escape characters

* format

* tests

* run tests

* copyright
2024-10-03 10:08:30 -04:00
Preston Rasmussen
ae9b5eca9c
update lucene sanitizer (#170)
* update lucene sanitizer

* update
2024-10-02 11:58:12 -04:00
Pavlo Paliychuk
a7148d6260
feat: Dedicated embedder interface (#159)
* feat: Add Embedder interface and implement openai embedder

* feat: Add voyage ai embedder
2024-09-27 12:47:04 -04:00
ARNO
5bd18fc7dd
feat: configurable embedding model (#156)
* feat: configurable embedding model

format

* chore: Update comment

* chore: Pass embedding model in search utils

---------

Co-authored-by: paulpaliychuk <pavlo.paliychuk.ca@gmail.com>
2024-09-26 13:31:22 -07:00
Preston Rasmussen
fd341a6f16
Add MSC benchmark and improve search performance (#157)
* test cases

* test

* benchmark

* eval updates

* improve search performance

* remove data

* formatting

* add None type to config

* update sanitization

* push version

* maketrans update

* mypy
2024-09-26 16:12:38 -04:00
Pavlo Paliychuk
b537cf56e5
chore: Make deleting groups safer (#155)
* chore: Make deleting groups safer

* chore: Use appropriate errors in delete group checks

* chore: Add GroupsEdgesNotFound error type
2024-09-24 20:08:09 -04:00
Pavlo Paliychuk
44b016da6b
feat: async close and multi-group search support (#151)
* chore: Support a list of group_ids on search + await driver.close()

* fix: formatter and linter

* chore: Version bump
2024-09-24 16:13:04 -04:00
Preston Rasmussen
794b705664
Group id fix (#152)
* node distance and group_ids fixed

* get all with no group_id passed

* push

* push

* remove comments

* mypy

* mypy ids

* please mypy

* trust

* last one
2024-09-24 15:55:30 -04:00
Preston Rasmussen
5506a01e24
In memory label propagation community detection (#136)
* WIP

* in memory graph detection

* format

* add comments

* update readme

* fixed an issue where solo nodes would throw an error when building communities
2024-09-23 11:05:44 -04:00
Pavlo Paliychuk
2fc1b00602
feat: add FastAPI lifespan and healthcheck endpoint (#144)
* chore: Add healthcheck endpoint + build indexes and constraints on svc startup

* chore: Bring back driver close call
2024-09-23 10:12:35 -04:00
Daniel Chalef
5d2121e1a3
limit community building concurrency (#142)
2024-09-22 13:38:54 -07:00
Daniel Chalef
14d5ce0b36
Override default max tokens for Anthropic and Groq clients (#143)
* Override default max tokens for Anthropic and Groq clients

* Override default max tokens for Anthropic and Groq clients

* Override default max tokens for Anthropic and Groq clients
2024-09-22 11:33:54 -07:00
Daniel Chalef
a1d871e179
chore: Update DEFAULT_MAX_TOKENS to 16384 in config.py (#138)
2024-09-22 09:57:41 -07:00
Daniel Chalef
9b71b46c0f
feat: Refactor OpenAIClient initialization and add client parameter (#140)
The code changes refactor the `OpenAIClient` initialization to accept an optional `client` parameter. This allows the client to be passed in from outside, providing more flexibility and enabling easier testing.
2024-09-21 12:09:04 -07:00
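The optional `client` parameter described in 9b71b46c0f is a form of constructor injection. The following is a minimal sketch of that pattern, not graphiti's code: `SketchLLMClient`, `Backend`, `DefaultBackend`, and `FakeBackend` are hypothetical names, and the real `OpenAIClient` signature is not shown in this log.

```python
from typing import Optional, Protocol


class Backend(Protocol):
    """Hypothetical interface for the object that actually sends a request."""

    def complete(self, prompt: str) -> str: ...


class DefaultBackend:
    """Stand-in for a real SDK client that would be built from configuration."""

    def complete(self, prompt: str) -> str:
        return f"default:{prompt}"


class SketchLLMClient:
    """Illustrative wrapper: use the injected client if given, else build one."""

    def __init__(self, client: Optional[Backend] = None) -> None:
        # Falling back to a default keeps existing call sites working unchanged.
        self.client: Backend = client if client is not None else DefaultBackend()

    def generate(self, prompt: str) -> str:
        return self.client.complete(prompt)


class FakeBackend:
    """Test double that records prompts instead of calling a network service."""

    def __init__(self) -> None:
        self.prompts = []

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return "stubbed"
```

Passing `FakeBackend()` in a test exercises the wrapper without network access, which is the testing benefit the commit message mentions.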
Daniel Chalef
32b51530ec
feat: Fix bug in dedupe_node_list function (#137)
The code changes fix a bug in the `dedupe_node_list` function where a node instance was not found in the node map. The bug is now handled by logging a warning message and skipping the iteration. This ensures that the function continues to execute without any errors.
2024-09-20 21:03:20 -07:00
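The warn-and-skip handling described in 32b51530ec can be sketched as below. This is an illustration under assumptions, not the actual `dedupe_node_list` implementation: the `Node` dataclass, the `resolve_duplicates` name, and the shape of `duplicate_groups` (lists of UUIDs judged to be the same entity) are all hypothetical.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class Node:
    """Hypothetical minimal node with only the fields this sketch needs."""

    uuid: str
    name: str


def resolve_duplicates(nodes, duplicate_groups):
    """Keep the first known node of each group; warn and skip unknown UUIDs."""
    node_map = {node.uuid: node for node in nodes}
    kept = []
    for group in duplicate_groups:
        for uuid in group:
            node = node_map.get(uuid)
            if node is None:
                # A missing entry is logged and skipped instead of raising KeyError.
                logger.warning("Node %s not found in node map; skipping", uuid)
                continue
            kept.append(node)
            break  # the first resolvable node represents the whole group
    return kept
```

Using `dict.get` plus `continue` means one unexpected UUID costs a log line rather than aborting the whole pass.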
Daniel Chalef
6d065d363a
Handle JSONDecodeError in is_server_or_retry_error function (#133)
feat: handle JSONDecodeError in is_server_or_retry_error function
2024-09-20 11:16:04 -07:00
Preston Rasmussen
bfd8d3bb68
Add group_id CRUD endpoints and option store content bool (#130)
* add group_ids CRUD

* option to not store content

* ellipsis
2024-09-19 16:16:40 -04:00
Preston Rasmussen
e398f95612
Mentions reranker (#124)
* documentation update

* update communities

* mentions reranker

* fix episode edge mentions

* get episode mentions

* add communities to mentions endpoint

* rebase

* defaults episodes to empty list

* update
2024-09-18 15:44:28 -04:00
Pavlo Paliychuk
529a1aaecf
fix: update UUID generation and message handling (#123)
* chore: Update uuid generation + service fixes

* chore: Version bump
2024-09-18 12:48:44 -04:00
Preston Rasmussen
a18b3179ee
Add community update (#121)
* documentation update

* update communities

* update runner

* make format

* mypy

* oops

* add update_communities
2024-09-18 11:37:34 -04:00
Pavlo Paliychuk
ebb1ec2463
fix: Syntax error on node crud (#119)
2024-09-17 12:19:20 -04:00
Pavlo Paliychuk
19a6ebc6fe
Fix groupless search (#118)
* fix(search): 🐛 Search across null group_ids

* chore: Version bump

* chore: Set group_ids to none if it's an empty list

* fix: Check for group ids being a list before setting it to None if empty

* fix check

* chore: Simplify group_ids check

* chore: Simplify the check further
2024-09-16 16:23:07 -04:00
Preston Rasmussen
d7c20c1f59
Search refactor + Community search (#111)
* WIP

* WIP

* WIP

* community search

* WIP

* WIP

* integration tested

* tests

* tests

* mypy

* mypy

* format
2024-09-16 14:03:05 -04:00
Preston Rasmussen
85cf8e5840
Improve node distance reranker speed (#107)
* much faster

* clean up code

* variable rename
2024-09-12 11:23:45 -04:00
Pavlo Paliychuk
8085b52f2a
feat: add error handling for missing nodes and edges, introduce new API endpoints, and update ZepGraphiti class (#104)
* feat: Expose crud operations to service + add graphiti errors

* fix: linter
2024-09-11 12:53:17 -04:00
Preston Rasmussen
c0a740ff60
Community nodes (#103)
* add gds

* community work

* save progress

* community updates

* e2e communities

* troubleshooting

* updates

* communities

* remove unused import
2024-09-11 12:06:35 -04:00
Preston Rasmussen
4122d350a5
add extract nodes from text prompt (#106)
2024-09-11 12:06:08 -04:00
Daniel Chalef
b214baa85f
Add py.typed file (#105)
* Add py.typed file



---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/getzep/graphiti?shareId=XXXX-XXXX-XXXX-XXXX).

* Update pyproject.toml
2024-09-11 08:44:06 -04:00
Daniel Chalef
6851b1063a
Fix llm client retry (#102)
* Fix llm client retry

* feat: Improve llm client retry error message
2024-09-10 08:15:27 -07:00
Daniel Chalef
3f12254916
Fix missing default None for add_episode_bulk (#101)
Fix missing default None for add_episode and add_episode_bulk



---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/getzep/graphiti?shareId=XXXX-XXXX-XXXX-XXXX).
2024-09-09 22:12:59 -04:00
Preston Rasmussen
42fb590606
Add group ids (#89)
* set and retrieve group ids

* update add episode with group id support

* add episode and search functional

* update bulk

* mypy updates

* remove unused imports

* update unit tests

* unit tests

* add optional uuid field

* format

* mypy

* ellipsis
2024-09-06 12:33:42 -04:00
Preston Rasmussen
a29c3557d3
fix clearing name embeddings bug (#87)
fix bug
2024-09-05 14:09:19 -04:00
Preston Rasmussen
299021173b
Add episode refactor (#85)
* temp commit while moving

* fix name embedding bug

* invalidation

* format

* tests on runner examples

* format

* ellipsis

* ruff

* fix

* format

* minor prompt change
2024-09-05 12:05:44 -04:00
Preston Rasmussen
e56a599a72
search update (#81)
* search update

* update string literals
2024-09-04 10:05:45 -04:00
Preston Rasmussen
e9e6039b1e
Speed up add episode (#77)
* WIP

* updates

* use uuid for node dedupe

* pret-testing

* parallelized node resolution

* working add_episode

* revert to 4o

* format

* mypy update

* update types
2024-09-03 13:25:52 -04:00
Daniel Chalef
77685b063c
Feat/langgraph-example (#73)
* wip

* wip

* image + clean run

* chore: Update LANGCHAIN_TRACING_V2 to 'false' in agent.ipynb

* chore: Remove unused import in runner.ipynb

* lock file
2024-09-01 12:31:08 -07:00
Daniel Chalef
fe20c0f51d
Node Distance Reranker: Limit max hops (and cleanup prints) (#72)
* limit SHORTEST max hops

* cleanup prints
2024-09-01 12:16:04 -07:00
Preston Rasmussen
35a4e5172b
add bulk temporal extraction and improve bulk quality and performance (#67)
* parallelize edge deduping more

* parallelize node insertion more

* improve bulk behavior performance

* dedupe nodes actually works

* add a reranker to search

* bulk dedupe episodes only across the same nodes

* add temporal extraction bulk function

* cleaned up bulk

* default to 4o

* format

* mypy

* mypy

* mypy ignore
2024-08-30 10:48:28 -04:00
Preston Rasmussen
06d8d9359f
Add Missing Node and edge CRUD (#51)
* add CRUD operations and fix search limit bugs

* format

* update tests

* å

* update tests to double limit call

* add default field

* format

* import correct field
2024-08-27 16:18:01 -04:00
Pavlo Paliychuk
e821a6195a
chore: Move anthropic to dev deps, remove anthropic and groq clients from __init__ (#61)
2024-08-27 16:03:08 -04:00
Daniel Chalef
2d0705fc1b
Add get_nodes_by_query method to Graphiti class (#49)
* Add get_nodes_by_query method to Graphiti class

Add a method to the Graphiti class that wraps `get_relevant_nodes` and returns a list of nodes given a query.

* Add `get_nodes_by_query` method to the `Graphiti` class in `graphiti_core/graphiti.py`.
* Import `generate_embedding` from `graphiti_core/llm_client/utils.py`.
* Use `generate_embedding` to generate an embedding for the query.
* Call `get_relevant_nodes` with the generated embedding and return the relevant nodes.

Add an embedding function to `llm_client/utils.py`.

* Add `generate_embedding` function to `graphiti_core/llm_client/utils.py`.
* Accept an embedder and model_id as parameters.
* Generate an embedding for the given text and return it.

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/getzep/graphiti?shareId=XXXX-XXXX-XXXX-XXXX).

* address comments left by @danielchalef on #49 (Add get_nodes_by_query method to Graphiti class);

* fix ellipsis name in cla config

* feat: Add get_nodes_by_query method to Graphiti class

* chore: Cleanup unused files, add hybrid node search, add tests

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: paulpaliychuk <pavlo.paliychuk.ca@gmail.com>
2024-08-26 20:00:28 -07:00
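The flow described in 2d0705fc1b (embed the query, then pass the embedding to a relevance search) can be sketched in memory as follows. Everything here is an assumption made for illustration: the functions reuse the commit's names but are plain stand-ins that rank by cosine similarity over a dictionary, whereas the real methods run against the graph database and their signatures are not shown in this log.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def get_relevant_nodes(query_embedding, node_embeddings, limit):
    """Stand-in search: rank node names by similarity to the query embedding."""
    ranked = sorted(
        node_embeddings,
        key=lambda name: cosine(query_embedding, node_embeddings[name]),
        reverse=True,
    )
    return ranked[:limit]


def get_nodes_by_query(query, embed, node_embeddings, limit=2):
    """Wrapper: generate an embedding for the query, then delegate to the search."""
    return get_relevant_nodes(embed(query), node_embeddings, limit)
```

Here `embed` is any callable that maps text to a vector, so a toy function can replace a hosted embedding model in tests.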
Daniel Chalef
7ca4f7fe5b
Update search method to return EntityEdge objects (#48)
---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/getzep/graphiti?shareId=XXXX-XXXX-XXXX-XXXX).
2024-08-26 17:24:35 -07:00
Daniel Chalef
a6d63f0c0d
Add text episode type (#46)
Add a new `text` episode type and update the `extract_nodes` function to handle it.

* **EpisodeType Enum:**
  - Add `text` to the `EpisodeType` enum in `graphiti_core/nodes.py`.
  - Update the `from_str` method to handle the `text` episode type.

* **extract_nodes Function:**
  - Update the `extract_nodes` function in `graphiti_core/utils/maintenance/node_operations.py` to handle the `text` episode type.
  - Use the `message` type prompt for both `message` and `text` episodes.

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/getzep/graphiti?shareId=XXXX-XXXX-XXXX-XXXX).
2024-08-26 15:51:13 -07:00
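The change described in a6d63f0c0d can be sketched as an enum with a string parser plus a prompt selector. This is a hedged illustration, not the code in `graphiti_core/nodes.py`: the `json` member, the member values, the `select_prompt_name` helper, and the prompt labels are assumptions added so the branching has something to distinguish.

```python
from enum import Enum


class EpisodeType(Enum):
    """Sketch of the episode-type enum; `json` is an assumed third member."""

    message = "message"
    json = "json"
    text = "text"

    @staticmethod
    def from_str(value: str) -> "EpisodeType":
        """Parse a string into a member, rejecting unknown types explicitly."""
        try:
            return EpisodeType(value)
        except ValueError:
            raise ValueError(f"Unknown episode type: {value}") from None


def select_prompt_name(episode_type: EpisodeType) -> str:
    """Hypothetical selector: `text` reuses the prompt written for `message`."""
    if episode_type in (EpisodeType.message, EpisodeType.text):
        return "message_prompt"
    return "json_prompt"
```

Routing `text` through the existing `message` branch adds the new type without a new prompt, matching the commit's note that both types use the `message` prompt.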
Preston Rasmussen
2d01e5d7b7
Search node centering (#45)
* add new search reranker and update search

* node distance reranking

* format

* rebase

* no need for enumerate

* mypy typing

* defaultdict update

* rrf prelim ranking
2024-08-26 18:34:57 -04:00
Daniel Chalef
fc4bf3bde2
Implement retry for LLMClient (#44)
* implement retry

* chore: Refactor tenacity retry logic and improve LLMClient error handling

* poetry

* remove unnecessary try
2024-08-26 12:53:16 -07:00
Daniel Chalef
895afc7be1
implement diskcache (#39)
* chore: Add romeo runner

* fix: Linter

* wip

* wip dump

* chore: Update romeo parser

* chore: Anthropic model fix

* wip

* allbirds

* allbirds runner

* format

* wip

* wip

* mypy updates

* update

* remove r

* update tests

* format

* wip

* chore: Strategically update the message

* rebase and fix import issues

* Update package imports for graphiti_core in examples and utils

* nits

* chore: Update OpenAI GPT-4o model to gpt-4o-2024-08-06

* implement groq

* improvements & linting

* cleanup and nits

* Refactor package imports for graphiti_core in examples and utils

* Refactor package imports for graphiti_core in examples and utils

* implement diskcache

* remove debug stuff

* log cache hit when debugging only

* Improve LLM config. Fix bugs (#41)

Refactor LLMConfig class to allow None values for model and base_url

* chore: Resolve mc

---------

Co-authored-by: paulpaliychuk <pavlo.paliychuk.ca@gmail.com>
Co-authored-by: prestonrasmussen <prasmuss15@gmail.com>
2024-08-26 13:13:05 -04:00