Commit graph

1903 commits

Author SHA1 Message Date
hajdul88
bd6aafe9b7 fix: fixes event loop handling on windows in dynamic steps example 2025-01-16 18:17:11 +01:00
hajdul88
935763b08d fix: fixing changed lancedb search + pruning 2025-01-16 17:32:44 +01:00
Vasilije
1c4a605eb7
Merge pull request #437 from topoteretes/feature/cog-761-project-graphiti-graph-to-memory
feat: adds cognee node and edge embeddings for graphiti graph
2025-01-16 10:03:31 +01:00
hajdul88
86ee12aefc
Merge branch 'dev' into feature/cog-761-project-graphiti-graph-to-memory 2025-01-15 18:35:42 +01:00
alekszievr
3494521cae
Support 4 different rag options in eval (#439)
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets.

* Load dataset file by filename, outsource utilities

* restructure metric selection

* Add comprehensiveness, diversity and empowerment metrics

* add promptfoo as an option

* refactor RAG solution in eval

* LLM as a judge metrics implemented in a uniform way

* Use requests.get instead of wget

* clean up promptfoo config template

* minor fixes

* get promptfoo path instead of hardcoding

* minor fixes

* Add LLM as a judge prompts

* Support 4 different rag options in eval

* Minor refactor and logger usage
2025-01-15 15:34:13 +01:00
hajdul88
9e63bacaa7
Merge branch 'dev' into feature/cog-761-project-graphiti-graph-to-memory 2025-01-15 11:49:10 +01:00
hajdul88
1db44de7de feat: adds graphiti demo notebook 2025-01-15 11:45:06 +01:00
alekszievr
6653d73556
Feat/cog 950 improve metric selection (#435)
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets.

* Load dataset file by filename, outsource utilities

* restructure metric selection

* Add comprehensiveness, diversity and empowerment metrics

* add promptfoo as an option

* refactor RAG solution in eval

* LLM as a judge metrics implemented in a uniform way

* Use requests.get instead of wget

* clean up promptfoo config template

* minor fixes

* get promptfoo path instead of hardcoding

* minor fixes

* Add LLM as a judge prompts

* Minor refactor and logger usage
2025-01-15 10:45:55 +01:00
Vasilije
7c8d258188
Merge pull request #440 from topoteretes/llama-index-integration-google-colab
docs: Update LlamaIndex integration notebook
2025-01-14 19:40:39 +01:00
Igor Ilic
259414add0 docs: Update LlamaIndex integration notebook 2025-01-14 15:32:27 +01:00
hajdul88
dd8a488003
Merge branch 'dev' into feature/cog-761-project-graphiti-graph-to-memory 2025-01-14 14:00:27 +01:00
hajdul88
d0646a1694 feat: Implements generation and retrieval and adjusts imports 2025-01-14 13:59:27 +01:00
hajdul88
124a26335e feat: changes model independent edge method 2025-01-14 13:58:56 +01:00
hajdul88
84d04aafe1 feat: restructures graphiti object indexing 2025-01-14 13:58:13 +01:00
alekszievr
a4ad1702ed
Feat/cog 946 abstract eval dataset (#418)
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets.

* Load dataset file by filename, outsource utilities

* Use requests.get instead of wget
2025-01-14 11:33:55 +01:00
hajdul88
5e9471ebad fix: removes get_model_independent_graph method from abstract class as graphiti does not support networkx 2025-01-14 09:16:46 +01:00
Boris Arzentar
12031e6c43 Merge remote-tracking branch 'origin/main' into dev 2025-01-13 22:04:23 +01:00
Boris Arzentar
8786fc35e7 version: Increase version to 0.1.22 2025-01-13 21:57:02 +01:00
hajdul88
19f885581d
Merge branch 'dev' into feature/cog-761-project-graphiti-graph-to-memory 2025-01-13 17:24:39 +01:00
hajdul88
25d8f5e337
Merge pull request #436 from topoteretes/feature/cog-762-deleting-in-memory-embeddings-from-bruteforce-search-and
feat: deletes on the fly embeddings and uses edge collections
2025-01-13 17:24:27 +01:00
hajdul88
c351047c36 feat: adds cognee node and edge embeddings for graphiti graph 2025-01-13 17:22:59 +01:00
Igor Ilic
8621127834
Merge branch 'dev' into feature/cog-762-deleting-in-memory-embeddings-from-bruteforce-search-and 2025-01-13 16:00:15 +01:00
Vasilije
a77a87e856
Merge pull request #422 from topoteretes/test-ubuntu-24.04
fix: Fix ubuntu 24.04 segmentation fault
2025-01-13 15:39:49 +01:00
Igor Ilic
32d7b0712a
Merge branch 'dev' into test-ubuntu-24.04 2025-01-13 15:07:39 +01:00
Igor Ilic
0ce2339587 fix: Attempt to resolve issue with Ubuntu 24.04 segmentation fault 2025-01-13 15:00:12 +01:00
hajdul88
d2f06e2654
Merge branch 'dev' into feature/cog-762-deleting-in-memory-embeddings-from-bruteforce-search-and 2025-01-13 12:54:23 +01:00
Vasilije
f9ddcaf75f
Merge pull request #428 from topoteretes/llama-index-notebook
Llama index cognee integration notebook
2025-01-13 12:22:55 +01:00
Igor Ilic
b317c1e23d
Merge branch 'dev' into llama-index-notebook 2025-01-13 12:02:44 +01:00
Igor Ilic
6163dec6c0 fix: Resolve api key issue with llama index integration notebook 2025-01-13 11:59:43 +01:00
Igor Ilic
adee79d7a5 fix: Change nbformat on llama index integration notebook 2025-01-13 11:54:05 +01:00
hajdul88
07fcce7970 feat: deletes on the fly embeddings and uses edge collections 2025-01-13 11:46:46 +01:00
hajdul88
30c69bd40d
Merge pull request #434 from topoteretes/feature/cog-964-fixing-logging-formatting
Fix: Fixes logging setup
2025-01-13 10:00:16 +01:00
hajdul88
ea8628c527 Fix: Fixes logging setup 2025-01-13 09:49:56 +01:00
Vasilije
2d7635db89
Update README.md 2025-01-12 20:58:35 +01:00
Boris Arzentar
0aacd3c38b fix: update dependencies of the mcp server 2025-01-12 19:08:26 +01:00
Boris Arzentar
886e9c7eb3 fix: update dependencies of the mcp server 2025-01-12 18:52:58 +01:00
Boris
fbc9cefdab
Version 0.1.21 (#431)
* feat: Add error handling in case user is already part of database and permission already given to group

Added error handling in case permission is already given to group and user is already part of group

Feature COG-656

* feat: Add user verification for accessing data

Verify user has access to data before returning it

Feature COG-656

* feat: Add compute search to cognee

Add compute search to cognee which makes searches human readable

Feature COG-656

* feat: Add simple instruction for system prompt

Add simple instruction for system prompt

Feature COG-656

* pass pydantic model to cognify

* feat: Add unauth access error to getting data

Raise unauth access error when trying to read data without access

Feature COG-656

* refactor: Rename query compute to query completion

Rename searching type from compute to completion

Refactor COG-656

* chore: Update typo in code

Update typo in string in code

Chore COG-656

* Add mcp to cognee

* Add simple README

* Update cognee-mcp/mcpcognee/__main__.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Create dockerhub.yml

* Update get_cognify_router.py

* fix: Resolve reflection issue when running cognee a second time after pruning data

When running cognee a second time after pruning data some metadata doesn't get pruned.
This makes cognee believe some tables exist that have been deleted

Fix

* fix: Add metadata reflection fix to sqlite as well

Added fix when reflecting metadata to sqlite as well

Fix

* update

* Revert "fix: Add metadata reflection fix to sqlite as well"

This reverts commit 394a0b2dfb.

* COG-810 Implement a top-down dependency graph builder tool (#268)

* feat: parse repo to call graph

* Update/repo_processor/top_down_repo_parse.py task

* fix: minor improvements

* feat: file parsing jedi script optimisation

---------

* Add type to DataPoint metadata (#364)

* Add type to DataPoint metadata

* Add missing index_fields

* Use DataPoint UUID type in pgvector create_data_points

* Make _metadata mandatory everywhere

* Fixes

* Fixes to our demo

* feat: Add search by dataset for cognee

Added ability to search by datasets for cognee users

Feature COG-912

* feat: outsources chunking parameters to extract chunk from documents … (#289)

* feat: outsources chunking parameters to extract chunk from documents task

* fix: Remove backend lock from UI

Removed lock that prevented using multiple datasets in cognify

Fix COG-912

* COG 870 Remove duplicate edges from the code graph (#293)

* feat: turn summarize_code into generator

* feat: extract run_code_graph_pipeline, update the pipeline

* feat: minimal code graph example

* refactor: update argument

* refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline

* refactor: indentation and whitespace nits

* refactor: add deprecated use comments and warnings

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>

* test: Added test for getting of documents for search

Added test to verify getting documents related to datasets intended for search

Test COG-912

* Structured code summarization (#375)

* feat: turn summarize_code into generator

* feat: extract run_code_graph_pipeline, update the pipeline

* feat: minimal code graph example

* refactor: update argument

* refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline

* refactor: indentation and whitespace nits

* refactor: add deprecated use comments and warnings

* Structured code summarization

* add missing prompt file

* Remove summarization_model argument from summarize_code and fix typehinting

* minor refactors

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>

* fix: Resolve issue with cognify router graph model default value

Resolve issue with default value for graph model in cognify endpoint

Fix

* chore: Resolve typo in getting documents code

Resolve typo in code

chore COG-912

* Update .github/workflows/dockerhub.yml

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Update .github/workflows/dockerhub.yml

* Update .github/workflows/dockerhub.yml

* Update .github/workflows/dockerhub.yml

* Update get_cognify_router.py

* fix: Resolve syntax issue with cognify router

Resolve syntax issue with cognify router

Fix

* feat: Add ruff pre-commit hook for linting and formatting

Added formatting and linting on pre-commit hook

Feature COG-650

* chore: Update ruff lint options in pyproject file

Update ruff lint options in pyproject file

Chore

* test: Add ruff linter github action

Added linting check with ruff in github actions

Test COG-650

* feat: deletes executor limit from get_repo_file_dependencies

* feat: implements mock feature in LiteLLM engine

* refactor: Remove changes to cognify router

Remove changes to cognify router

Refactor COG-650

* fix: fixing boolean env for github actions

* test: Add test for ruff format for cognee code

Test if code is formatted for cognee

Test COG-650

* refactor: Rename ruff gh actions

Rename ruff gh actions to be more understandable

Refactor COG-650

* chore: Remove checking of ruff lint and format on push

Remove checking of ruff lint and format on push

Chore COG-650

* feat: Add deletion of local files when deleting data

Delete local files when deleting data from cognee

Feature COG-475

* fix: changes back the max workers to 12

* feat: Adds mock summary for codegraph pipeline

* refactor: Add current development status

Save current development status

Refactor

* Fix langfuse

* Fix langfuse

* Fix langfuse

* Add evaluation notebook

* Rename eval notebook

* chore: Add temporary state of development

Add temp development state to branch

Chore

* fix: Add poetry.lock file, make langfuse mandatory

Added langfuse as mandatory dependency, added poetry.lock file

Fix

* Fix: fixes langfuse config settings

* feat: Add deletion of local files made by cognee through data endpoint

Delete local files made by cognee when deleting data from database through endpoint

Feature COG-475

* test: Revert changes on test_pgvector

Revert changes on test_pgvector which were made to test deletion of local files

Test COG-475

* chore: deletes the old test for the codegraph pipeline

* test: Add test to verify deletion of local files

Added test that checks local files created by cognee will be deleted and those not created by cognee won't

Test COG-475

* chore: deletes unused old version of the codegraph

* chore: deletes unused imports from code_graph_pipeline

* Ingest non-code files

* Fixing review findings

* Ingest non-code files (#395)

* Ingest non-code files

* Fixing review findings

* test: Update test regarding message

Update assertion message, add verification of file existence

* Handle retryerrors in code summary (#396)

* Handle retryerrors in code summary

* Log instead of print

* fix: updates the acreate_structured_output

* chore: Add logging to sentry when file which should exist can't be found

Log to sentry that a file which should exist can't be found

Chore COG-475

* Fix diagram

* fix: refactor mcp

* Add Smithery CLI installation instructions and badge

* Move readme

* Update README.md

* Update README.md

* Cog 813 source code chunks (#383)

* fix: pass the list of all CodeFiles to enrichment task

* feat: introduce SourceCodeChunk, update metadata

* feat: get_source_code_chunks code graph pipeline task

* feat: integrate get_source_code_chunks task, comment out summarize_code

* Fix code summarization (#387)

* feat: update data models

* feat: naive parse long strings in source code

* fix: get_non_py_files instead of get_non_code_files

* fix: limit recursion, add comment

* handle embedding empty input error (#398)

* feat: robustly handle CodeFile source code

* refactor: sort imports

* todo: add support for other embedding models

* feat: add custom logger

* feat: add robustness to get_source_code_chunks

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* feat: improve embedding exceptions

* refactor: format indents, rename module

---------

Co-authored-by: alekszievr <44192193+alekszievr@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Fix diagram

* Fix instructions

* adding and fixing files

* Update README.md

* ruff format

* Fix linter issues

* Implement PR review

* Comment out profiling

* fix: add allowed extensions

* fix: adhere UnstructuredDocument.read() to Document

* feat: time code graph run and add mock support

* Fix ollama, work on visualization

* fix: Fixes faulty logging format and sets up error logging in dynamic steps example

* Overcome ContextWindowExceededError by checking token count while chunking (#413)

* fix: Fixes duplicated edges in cognify by limiting the recursion depth in add datapoints

* Adjust AudioDocument and handle None token limit

* Handle azure models as well

* Add clean logging to code graph example

* Remove setting envvars from arg

* fix: fixes create_cognee_style_network_with_logo unit test

* fix: removes accidentally left-in print

* Get embedding engine instead of passing it. Get it from vector engine instead of direct getter.

* Fix visualization

* Get embedding engine instead of passing it in code chunking.

* Fix poetry issues

* chore: Update version of poetry install action

* chore: Update action to trigger on pull request for any branch

* chore: Remove if in github action to allow triggering on push

* chore: Remove if condition to allow gh actions to trigger on push to PR

* chore: Update poetry version in github actions

* chore: Set fixed ubuntu version to 22.04

* chore: Update py lint to use ubuntu 22.04

* chore: update ubuntu version to 22.04

* feat: implements the first version of graph based completion in search

* chore: Update python 3.9 gh action to use 3.12 instead

* chore: Update formatting of utils.py

* Fix poetry issues

* Adjust integration tests

* fix: Fixes ruff formatting

* Handle circular import

* fix: Resolve profiler issue with partial and recursive logger imports

Resolve issue for profiler with partial and recursive logger imports

* fix: Remove logger from __init__.py file

* test: Test profiling on HEAD branch

* test: Return profiler to base branch

* Set max_tokens in config

* Adjust SWE-bench script to code graph pipeline call

* Adjust SWE-bench script to code graph pipeline call

* fix: Add fix for accessing dictionary elements that don't exist

Use get for the text key instead of direct access to handle the case where the text key doesn't exist

* feat: Add ability to change graph database configuration through cognee

* feat: adds pydantic types to graph layer models

* feat: adds basic retriever for swe bench

* Match Ruff version in config to the one in github actions

* feat: implements code retriever

* Fix: fixes unit test for codepart search

* Format with Ruff 0.9.0

* Fix: deleting incorrect repo path

* fix: resolve issue with langfuse dependency installation when integrating cognee in different packages

* version: Increase version to 0.1.21

---------

Co-authored-by: Igor Ilic <igorilic03@gmail.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Rita Aleksziev <alekszievr@gmail.com>
Co-authored-by: vasilije <vas.markovic@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: alekszievr <44192193+alekszievr@users.noreply.github.com>
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
Co-authored-by: Henry Mao <1828968+calclavia@users.noreply.github.com>
2025-01-10 19:37:50 +01:00
Boris Arzentar
e983c216f0 version: Increase version to 0.1.21 2025-01-10 19:04:24 +01:00
Vasilije
6865f51226
Merge pull request #430 from topoteretes/fix-langfuse-dependency-installation
fix: resolve issue with langfuse dependency installation
2025-01-10 18:13:08 +01:00
Igor Ilic
16eefe4875 fix: resolve issue with langfuse dependency installation when integrating cognee in different packages 2025-01-10 18:02:15 +01:00
Igor Ilic
a5c91e8f0e
Merge branch 'dev' into llama-index-notebook 2025-01-10 17:26:05 +01:00
Igor Ilic
1653cdda46 test: Add github action for testing llama index cognee integration notebook 2025-01-10 17:24:38 +01:00
Igor Ilic
9a4613a9dd docs: Add LlamaIndex Cognee integration notebook
Added LlamaIndex Cognee integration notebook
2025-01-10 16:49:23 +01:00
Vasilije
e9c40ed4c1
Merge pull request #426 from topoteretes/feature/cog-971-preparing-swe-bench-run
Fix: deleting incorrect repo path
2025-01-10 16:05:45 +01:00
hajdul88
46c33655ca
Merge branch 'dev' into feature/cog-971-preparing-swe-bench-run 2025-01-10 15:58:31 +01:00
Vasilije
b8885971bc
Merge pull request #425 from topoteretes/format_with_updated_ruff
Format with Ruff 0.9.0
2025-01-10 15:57:48 +01:00
hajdul88
48e6394d4e
Merge branch 'dev' into feature/cog-971-preparing-swe-bench-run 2025-01-10 15:57:32 +01:00
hajdul88
e2ad54d88e Fix: deleting incorrect repo path 2025-01-10 15:54:45 +01:00
Rita Aleksziev
872bc89648 Format with Ruff 0.9.0 2025-01-10 15:11:00 +01:00
Vasilije
f694ca283f
Merge pull request #424 from topoteretes/feature/cog-971-preparing-swe-bench-run
Feature/cog 971 preparing swe bench run
2025-01-10 14:59:48 +01:00