Commit graph

3769 commits

Author SHA1 Message Date
Boris Arzentar
b2c632cc8f
Merge remote-tracking branch 'origin/dev' into feature/cog-3014-refactor-delete-feature 2025-10-18 16:30:05 +02:00
Vasilije
559d5009f7
feat: Batch document handling (#1469)
<!-- .github/pull_request_template.md -->

## Description
Add a batch system for document processing to limit the number of
documents processed in parallel in Cognee.
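
As a rough illustration of the idea (not the code from this PR), batching can cap how many documents are in flight at once. The names `process_in_batches`, `handle_document`, and `batch_size` are hypothetical, and the per-document work is a placeholder.

```python
import asyncio


async def handle_document(doc: str) -> str:
    # Stand-in for real per-document work (hypothetical).
    await asyncio.sleep(0)
    return doc.upper()


async def process_in_batches(docs: list[str], batch_size: int = 2) -> list[str]:
    """Process docs in fixed-size batches; at most batch_size run concurrently."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: list[str] = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        # Each batch finishes before the next one starts.
        results.extend(await asyncio.gather(*(handle_document(d) for d in batch)))
    return results
```

A fixed batch is the simplest way to bound parallelism; a semaphore would keep workers busier when documents vary in size, at the cost of slightly more code.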

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [x] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-10-18 09:48:52 +02:00
Vasilije
9d0261f375
fix: search without prior cognify (#1548)
<!-- .github/pull_request_template.md -->

## Description

Running search after `cognee.add()` was called, but before
`cognee.cognify()`, goes through the whole search operation and only then
fails with a cryptic error:
```
Error during graph projection: EntityNotFoundError: Empty graph projected from the database. (Status code: 404)
```

## How to reproduce
Modify `dynamic_steps_example.py` so that it does not run cognify.

## This PR

Checks whether the graph is empty before searching, and throws an
informative exception if cognify has not been run

| Logs Before | Logs After |
|--------------|------------|
| `Error during graph projection: EntityNotFoundError: Empty graph projected from the database. (Status code: 404)` | `2025-10-17T11:05:58.465315 [warning ] Search attempt on an empty knowledge graph [cognee.shared.logging_utils]` |
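
The pre-search check described above might look roughly like the sketch below. It is an assumption-laden illustration, not the PR's code: `InMemoryGraph`, `EmptyGraphError`, and this synchronous `count_nodes()` signature are hypothetical stand-ins, with only the method name `count_nodes` taken from the commit list.

```python
import logging

logger = logging.getLogger("search_sketch")


class EmptyGraphError(Exception):
    """Hypothetical error raised when searching an empty graph."""


class InMemoryGraph:
    """Tiny stand-in for a graph engine that can report its node count."""

    def __init__(self, nodes=None):
        self.nodes = list(nodes or [])

    def count_nodes(self) -> int:
        return len(self.nodes)


def search(graph: InMemoryGraph, query: str) -> list:
    # Cheap emptiness check before doing any search work.
    if graph.count_nodes() == 0:
        logger.warning("Search attempt on an empty knowledge graph")
        raise EmptyGraphError("Knowledge graph is empty; run cognify() before searching.")
    # Placeholder search: substring match over node names.
    return [node for node in graph.nodes if query in node]
```

Failing early keeps the error close to its cause, instead of surfacing later as a projection failure.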

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-10-17 19:07:56 +02:00
Daulet Amirkhanov
4e2a777860 tests: update tests after last refactoring 2025-10-17 14:18:47 +01:00
Igor Ilic
6baf2d6806
Merge branch 'dev' into batch-document-handling 2025-10-17 13:45:32 +02:00
Daulet Amirkhanov
41fd854c7e
Merge branch 'dev' into fix/search-without-prior-cognify 2025-10-17 12:09:47 +01:00
Daulet Amirkhanov
c313fcd029 log warning on attempts to search on an empty knowledge graph 2025-10-17 12:06:35 +01:00
Daulet Amirkhanov
3ee50c192f refactor emptiness check to be boolean, and optimize query 2025-10-17 12:01:06 +01:00
Boris Arzentar
cd8b952ca8
fix: unit tests 2025-10-16 18:27:43 +02:00
Boris Arzentar
a46b5010e2
Merge remote-tracking branch 'origin/dev' into feature/cog-3014-refactor-delete-feature 2025-10-16 16:37:17 +02:00
Boris Arzentar
d9995a865c
fix: check delete permissions before deleting dataset data 2025-10-16 16:31:59 +02:00
hajdul88
9821a01a47
feat: Redis lock integration and Kuzu agentic access fix (#1504)
<!-- .github/pull_request_template.md -->

## Description
This PR introduces a shared lock mechanism in KuzuAdapter to handle the
case where multiple subprocesses from different environments try to use
the same Kuzu database.
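
How the PR wires Redis into KuzuAdapter is not shown here. The sketch below only illustrates the general "set if not exists" locking pattern, using an in-memory stand-in so it runs without a Redis server. `FakeRedis`, `database_lock`, the key name, and the TTL are all hypothetical.

```python
import uuid
from contextlib import contextmanager


class FakeRedis:
    """In-memory stand-in mimicking SET with NX, GET, and DEL (expiry is ignored)."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, px=None):
        if nx and name in self._data:
            return None  # key already held by someone else
        self._data[name] = value
        return True

    def get(self, name):
        return self._data.get(name)

    def delete(self, name):
        return 1 if self._data.pop(name, None) is not None else 0


@contextmanager
def database_lock(client, key: str, ttl_ms: int = 30_000):
    """Hold `key` while the block runs; raise if another holder exists."""
    token = uuid.uuid4().hex
    if not client.set(key, token, nx=True, px=ttl_ms):
        raise RuntimeError(f"{key} is locked by another process")
    try:
        yield
    finally:
        # Release only if we still own the lock.
        if client.get(key) == token:
            client.delete(key)
```

With a real Redis client the get-then-delete release is not atomic, so a production version would need an atomic release (for example a server-side script) and a TTL that outlives the longest database operation.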

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [x] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
None

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-10-16 15:48:20 +02:00
Igor Ilic
a03a20e053
Merge branch 'dev' into batch-document-handling 2025-10-16 14:30:11 +02:00
Vasilije
f0c332928d
test: fix windows deletion and library test (#1522)
<!-- .github/pull_request_template.md -->

## Description
Soft deletion and library tests sometimes fail on Windows. Deletion had
a low-level buffer issue, and the library test tried to delete a file
that was still in use by another process, which Windows does not allow.
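
The general pattern for avoiding this class of failure is to make sure every handle to a file is closed before removing it. The sketch below shows that pattern within a single process; it is not the fix from this PR, and `write_then_delete` is a hypothetical helper.

```python
import os
import tempfile


def write_then_delete(payload: bytes) -> bool:
    """Write a temp file, close it, then delete it; return True if it is gone."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        # The handle is closed at this point, so removal is safe on Windows too.
    finally:
        os.remove(path)
    return not os.path.exists(path)
```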

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-10-16 12:15:57 +02:00
Boris Arzentar
48991912a9
Merge remote-tracking branch 'origin/dev' into feature/cog-3014-refactor-delete-feature 2025-10-16 10:03:49 +02:00
Igor Ilic
2e1bfe78b1 refactor: rename variable to be more understandable 2025-10-15 20:26:59 +02:00
Igor Ilic
2fb06e0729 refactor: forwarding of data batch size rework 2025-10-15 20:18:48 +02:00
Igor Ilic
ad4a732e28
Merge branch 'dev' into batch-document-handling 2025-10-15 20:05:04 +02:00
Igor Ilic
8720dd0922
fix: Resolve issue with data element incremental loading for multiple… (#1549)
… datasets

<!-- .github/pull_request_template.md -->

## Description
Resolve issue with Data element incremental loading when in multiple
datasets

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-10-15 19:45:00 +02:00
Daulet Amirkhanov
2a6256634e chore: revert temporary change to dynamic_steps_example.py 2025-10-15 17:35:46 +01:00
Igor Ilic
c9a3f48398 fix: Resolve issue with data element incremental loading for multiple datasets 2025-10-15 18:26:01 +02:00
Daulet Amirkhanov
a854e4f426 chore: update GraphDBInterface to not throw NotImplementedError for count_nodes() 2025-10-15 17:22:51 +01:00
Daulet Amirkhanov
9e38a30c49 refactor: keep only count_nodes 2025-10-15 17:20:45 +01:00
Daulet Amirkhanov
dede5fa6fd add unit tests for empty graph check on search 2025-10-15 17:09:13 +01:00
Daulet Amirkhanov
ea4a93efb1 Implement count_nodes and count_edges methods for Neo4j 2025-10-15 16:57:53 +01:00
Daulet Amirkhanov
9367fa5d03 Prior to search, check if knowledge graph is empty 2025-10-15 16:39:48 +01:00
Daulet Amirkhanov
f3ec180102 Implement count_edges and count_methods for Kuzu 2025-10-15 16:39:25 +01:00
Daulet Amirkhanov
8692cd1338 feat: add count_nodes and count_edges methods to GraphDBInterface 2025-10-15 16:03:17 +01:00
Boris Arzentar
a062ecbf9d
fix: use Dataset instead of dataset_id 2025-10-15 11:34:39 +02:00
Boris Arzentar
310d713fac
fix: Dataset import 2025-10-14 23:50:06 +02:00
Boris Arzentar
fda0edc075
fix: pass context to distributed cognee tasks 2025-10-14 23:19:19 +02:00
Boris Arzentar
5a0500254b
fix: update poetry.lock 2025-10-14 22:45:52 +02:00
Boris Arzentar
e6166d24bd
fix: lint error 2025-10-14 22:02:24 +02:00
Boris Arzentar
fdc6113d11
fix: get_graph_from_model added nodes and edges default values 2025-10-14 21:55:30 +02:00
hajdul88
3f0eb3b41f
Merge branch 'dev' into feature/cog-3014-refactor-delete-feature 2025-10-14 17:16:48 +02:00
Daulet Amirkhanov
c73e8964a1
Change error logging to warning for missing playwright and protego imports in bs4_crawler.py (#1536)
<!-- .github/pull_request_template.md -->

## Description

Missing imports in `bs4_crawler.py` are not a critical issue.

The crawler is not part of core cognee, and the missing packages can be
installed with `pip install "cognee[scraping]"`.

Logging with `logger.error()` also breaks our integration tests, so this
PR uses `logger.warning()` instead
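
A minimal sketch of the optional-import pattern described above, assuming a hypothetical helper `try_import`; the actual code in `bs4_crawler.py` may be structured differently.

```python
import importlib
import logging

logger = logging.getLogger("optional_imports_sketch")


def try_import(module_name: str, extra: str):
    """Return the module if it is installed, else log a warning and return None."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # A warning, not an error: the feature is optional.
        logger.warning(
            '%s is not installed; install it with: pip install "cognee[%s]"',
            module_name,
            extra,
        )
        return None
```

Callers can then check for `None` and skip the optional feature instead of failing at import time.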

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-10-14 15:11:50 +01:00
Daulet Amirkhanov
ca9db23e89
fix: Resolve issue with MCP (#1546)
2025-10-14 15:11:31 +01:00
Igor Ilic
42ca782e59 fix: Resolve issue with MCP 2025-10-14 15:44:21 +02:00
Boris Arzentar
2ea24dae4d
fix: move datasets import in delete cli command 2025-10-14 15:43:29 +02:00
Daulet Amirkhanov
8a0ec8ff97
Merge branch 'dev' into fix/fix-failing-cli-integrations-test 2025-10-14 14:23:23 +01:00
Boris Arzentar
82daeee6b9
Merge remote-tracking branch 'origin/dev' into feature/cog-3014-refactor-delete-feature 2025-10-14 15:13:14 +02:00
Igor Ilic
0b7fb562d3
Sync poetry and uv lock updates from main to dev (#1544)
2025-10-14 15:09:04 +02:00
Boris Arzentar
ed272776e2
fix: import datasets directly in delete cli command 2025-10-14 15:08:30 +02:00
Boris Arzentar
344b057d44
fix: change the way datasets import is mocked 2025-10-14 14:35:25 +02:00
Daulet Amirkhanov
3fb241bd23 Merge remote-tracking branch 'origin/main' into merge-main-into-dev 2025-10-14 13:34:29 +01:00
Vasilije
b3c10a0ab0
chore: Update poetry lock (#1542)
2025-10-14 14:02:29 +02:00
Daulet Amirkhanov
04147c3eec Change error logging to warning for missing playwright and protego imports in bs4_crawler.py 2025-10-14 12:47:41 +01:00
Boris Arzentar
f83fa12703
Merge remote-tracking branch 'origin/dev' into feature/cog-3014-refactor-delete-feature 2025-10-14 13:44:15 +02:00
Igor Ilic
255def5ba9 chore: Update poetry lock 2025-10-14 13:38:41 +02:00
Boris Arzentar
0df22fe703
fix: correct mock paths 2025-10-14 13:37:42 +02:00