Commit graph

4235 commits

Author SHA1 Message Date
chinu0609
7bd7079aac fix: vecto_engine.delte_data_points 2025-11-18 22:17:23 +05:30
chinu0609
d351c9a009 fix: return chunk payload 2025-11-10 21:58:01 +05:30
chinu0609
84bd2f38f7 fix: remove unnecessary imports 2025-11-07 12:12:46 +05:30
chinu0609
84c8e07ddd fix: remove unnecessary imports 2025-11-07 12:03:17 +05:30
chinu0609
85a2bac062 fix: min to days 2025-11-06 23:18:25 +05:30
chinu0609
ce4a5c8311 Merge branch 'main' of github.com:chinu0609/cognee 2025-11-06 23:05:12 +05:30
chinu0609
b327756e5f Merge remote-tracking branch 'upstream/dev' 2025-11-06 23:03:45 +05:30
Chinmay Bhosale
bd71540d75
Merge pull request #5 from chinu0609/delete-last-acessed
feat: adding cleanup function and adding update_node_acess_timestamps…
2025-11-06 23:02:18 +05:30
chinu0609
fdf037b3d0 fix: min to days 2025-11-06 23:00:56 +05:30
lxobr
5bc83968f8
feature: text chunker with overlap (#1732)

## Description
- Implements `TextChunkerWithOverlap` with configurable
`chunk_overlap_ratio`
- Abstracts chunk_data generation via `get_chunk_data` callable
(defaults to `chunk_by_paragraph`)
- Parametrized tests verify `TextChunker` and `TextChunkerWithOverlap`
(0% overlap) produce identical output for all edge cases.
- Overlap-specific tests validate `TextChunkerWithOverlap` behavior 
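
As a rough illustration of the overlap idea described above, the sketch below groups a sequence of text units into fixed-size chunks where each chunk repeats the tail of the previous one. It is not the `TextChunkerWithOverlap` implementation from this PR: the function name `chunk_with_overlap`, the `max_units_per_chunk` parameter, and the use of plain strings as units are assumptions made for the example. Only the `chunk_overlap_ratio` name is taken from the description.

```python
from typing import List, Sequence


def chunk_with_overlap(
    units: Sequence[str],
    max_units_per_chunk: int,
    chunk_overlap_ratio: float = 0.0,
) -> List[List[str]]:
    """Group units into chunks; each chunk repeats the tail of the previous one.

    Hypothetical helper for illustration only, not the cognee API.
    """
    if max_units_per_chunk <= 0:
        raise ValueError("max_units_per_chunk must be positive")
    if not 0.0 <= chunk_overlap_ratio < 1.0:
        raise ValueError("chunk_overlap_ratio must be in [0, 1)")

    # Number of units shared between consecutive chunks.
    overlap = int(max_units_per_chunk * chunk_overlap_ratio)
    # The window advances by at least one unit because the ratio is below 1.
    step = max_units_per_chunk - overlap

    chunks: List[List[str]] = []
    start = 0
    while start < len(units):
        chunks.append(list(units[start:start + max_units_per_chunk]))
        # Stop once the current window already reaches the end of the input.
        if start + max_units_per_chunk >= len(units):
            break
        start += step
    return chunks


if __name__ == "__main__":
    paragraphs = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]
    for chunk in chunk_with_overlap(paragraphs, 4, chunk_overlap_ratio=0.5):
        print(chunk)
```

With `chunk_overlap_ratio=0.0` the step equals the chunk size, so the chunks are disjoint. That is the property that lets a 0% overlap chunker be compared against a plain chunker, as the parametrized tests in this PR are described as doing.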

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-11-06 15:22:48 +01:00
hajdul88
c0e5ce04ce
Fix: fixes session history test for multiuser mode (#1746)

## Description
Fixes failing session history test

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-06 14:13:55 +01:00
Vasilije
69c7aa2559
test: add load tests (#1573)

## Description
Added a load test to our codebase. The test runs N adds of a PDF, then
cognifies them and runs N searches. Cognify and the searches are
timed, and the test asserts upper bounds on how long they may take. We
can tweak the thresholds if necessary; the current values are tuned for
the gpt-5-mini model.
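
The measurement pattern described above can be sketched roughly as follows. This is not the code in `cognee/tests/test_load.py`: `run_load_test`, its parameters, and the `cognify` and `search` callables are hypothetical stand-ins, and timing each repetition as a whole is an assumption. The default thresholds (average of at most 8 minutes, each run at most 10 minutes) are the ones quoted in the summary of this commit message.

```python
import asyncio
import time
from statistics import mean
from typing import Awaitable, Callable, List

AsyncStep = Callable[[], Awaitable[None]]


async def _one_rep(cognify: AsyncStep, search: AsyncStep, num_searches: int) -> float:
    """Run one cognify pass followed by concurrent searches; return elapsed seconds."""
    start = time.perf_counter()
    await cognify()
    # Fire all searches concurrently and wait for every one of them.
    await asyncio.gather(*(search() for _ in range(num_searches)))
    return time.perf_counter() - start


async def run_load_test(
    cognify: AsyncStep,
    search: AsyncStep,
    num_searches: int,
    reps: int,
    max_avg_seconds: float = 8 * 60,
    max_run_seconds: float = 10 * 60,
) -> List[float]:
    """Repeat the measurement and assert the timing constraints (illustrative only)."""
    durations = [await _one_rep(cognify, search, num_searches) for _ in range(reps)]
    assert mean(durations) <= max_avg_seconds, "average run is too slow"
    assert max(durations) <= max_run_seconds, "at least one run is too slow"
    return durations


if __name__ == "__main__":
    async def fake_step() -> None:
        # Placeholder for real work such as ingestion or a search call.
        await asyncio.sleep(0.01)

    print(asyncio.run(run_load_test(fake_step, fake_step, num_searches=5, reps=3)))
```

Keeping the thresholds as parameters matches the note that the values may need tweaking for a different model.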

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduce a load test for S3 ingest, cognify, and concurrent searches with timing thresholds, and wire it into CI.
>
> - **Tests**:
>   - Add `cognee/tests/test_load.py` to measure end-to-end load: prunes data/system, ingests from `s3://cognee-test-load-s3-bucket`, runs `cognify` then concurrent GRAPH_COMPLETION searches, records timings across reps, and asserts avg ≤ 8m and each run ≤ 10m.
> - **CI**:
>   - Add `test-load` job in `.github/workflows/e2e_tests.yml`: installs AWS deps, raises file descriptor limit, configures S3/env secrets, and executes the new load test.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
c7598122bb. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2025-11-06 08:42:01 +01:00
Vasilije
8cc55ac0b2
refactor: Enable multi user mode by default if graph and vector db providers support it (#1695)


## Description
Enable multi user mode by default for supported graph and vector DBs

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [x] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-06 08:40:06 +01:00
chinu0609
c5f0c4af87 fix: add text_doc flag 2025-11-05 20:22:17 +05:30
Vasilije
c7598122bb
Merge branch 'dev' into feature/cog-3165-add-load-tests 2025-11-05 14:49:29 +01:00
chinu0609
ff263c0132 fix: add column check in migration 2025-11-05 18:40:58 +05:30
chinu0609
9041a804ec fix: add text_doc flag 2025-11-05 18:32:49 +05:30
hajdul88
eaf8d718b0
feat: introduces memify pipeline to save cache sessions into cognee (#1731)

## Description
This PR introduces a new memify pipeline to save cache sessions in
cognee. The QA sessions are added to the main knowledge base as separate
documents.


## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
None

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-05 10:27:54 +01:00
chinu0609
3c0e915812 fix: removing hard relations 2025-11-05 12:25:51 +05:30
Igor Ilic
bee2fe3ba7
feat: Add initial custom pipeline (#1716)

## Description
Add run_custom_pipeline to have a way to execute a custom collection of tasks in Cognee

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-04 17:58:34 +01:00
chinu0609
d34fd9237b feat: adding last_acessed in the Data model 2025-11-04 22:04:32 +05:30
Vasilije
d54cd85575
CI: Limit deletion integration tests to 60 minutes (#1730)
<!-- .github/pull_request_template.md -->

## Description
Added a test timeout for the soft and hard deletion integration tests,
which occasionally hang. This is a temporary solution to unblock the CI.
The cause of the hangs will be addressed in a separate issue.
## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [x] Other (please specify):
CI improvement
## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-04 13:31:31 +01:00
Vasilije
0ef2d51246
fix: Add logs to docker (#1656)

## Description
Adds detailed logging to docker

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-04 13:31:00 +01:00
Pavel Zorin
e11a8b9a51 CI: Added timeouts for all OS tests 2025-11-04 12:13:48 +01:00
Igor Ilic
46c509778f refactor: Rename access control functions 2025-11-04 12:06:16 +01:00
Igor Ilic
53521c2068
Update cognee/context_global_variables.py
Co-authored-by: Pavel Zorin <pazonec@yandex.ru>
2025-11-03 19:42:51 +01:00
Igor Ilic
c81d06d364
Update cognee/context_global_variables.py
Co-authored-by: Pavel Zorin <pazonec@yandex.ru>
2025-11-03 19:37:52 +01:00
Andrej Milicevic
a7d63df98c test: add extra aws dependency to load test 2025-11-03 18:15:18 +01:00
Andrej Milicevic
eb8df45dab test: increase file descriptor limit on workflow load test 2025-11-03 18:10:19 +01:00
Andrej Milicevic
d0a3bfd39f merged dev into this branch 2025-11-03 17:08:08 +01:00
Andrej Milicevic
4424bdc764 test: fix path based on pr comment 2025-11-03 17:06:51 +01:00
Igor Ilic
2ab2cffd07 chore: update test_search_db to work with all graph providers 2025-11-03 16:37:03 +01:00
Pavel Zorin
b8241e58e5 CI: Limit deletion integration tests to 60 minutes 2025-11-03 16:20:03 +01:00
Igor Ilic
baac00923c
Merge branch 'dev' into enable-multi-user-mode-default 2025-11-03 15:57:06 +01:00
chinu0609
5080e8f8a5 feat: generalizing getting entities from triplets 2025-11-03 00:59:04 +05:30
Vasilije
8d7c4d5384
CI: (dev)Extract Windows and MacOS tests to separate job (#1715)

## Description
Reduces the number of macOS and Windows jobs by running these tests only
for the latest Python version in the list.
- `different-os-tests-basic` runs tests on Ubuntu for Python `'["3.10.x", "3.11.x", "3.12.x", "3.13.x"]'`.
- `different-os-tests-extended` runs Windows and macOS tests for Python `3.13.x`.

<img width="255" height="385" alt="Screenshot 2025-10-31 at 12 36 11"
src="https://github.com/user-attachments/assets/73813d69-70a1-40a6-8197-b323ddc7d365"
/>


## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-01 06:50:08 +01:00
Igor Ilic
f368a1a4d5 fix: set tests to not use multi-user mode 2025-10-31 20:10:05 +01:00
Pavel Zorin
5e015d6a4e
Fix ollama tests (#1714)

## Description

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-10-31 18:47:58 +03:00
Igor Ilic
4c8b821197 fix: resolve test failing 2025-10-31 14:55:52 +01:00
Igor Ilic
00a1fe71d7 fix: Use multi-user mode search 2025-10-31 14:33:07 +01:00
Igor Ilic
3c09433ade fix: Resolve docling test 2025-10-31 13:57:12 +01:00
Pavel Zorin
737f792ac6 use api/embed for ollama api 2025-10-31 13:43:29 +01:00
Pavel Zorin
5d2d4e51f1 Ollama: Use OpenAI compatible embeddings API 2025-10-31 13:30:26 +01:00
Pavel Zorin
645bda38e3 chore: Fix Ollama test / update Ollama API usage 2025-10-31 13:23:49 +01:00
Pavel Zorin
a60e53964c
Potential fix for code scanning alert no. 399: Workflow does not contain permissions
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-10-31 12:37:38 +01:00
chinu0609
f1afd1f0a2 feat: adding cleanup function and adding update_node_acess_timestamps in completion retriever and graph_completion retriever 2025-10-31 15:49:34 +05:30
Chinmay Bhosale
4b43afcdab
Merge pull request #4 from chinu0609/delete-last-acessed
Delete last acessed
2025-10-31 00:25:33 +05:30
chinu0609
6f06e4a5eb fix: removing node_type and try except 2025-10-31 00:17:13 +05:30
chinu0609
13396871c9 Merge branch 'delete-last-acessed' of github.com:chinu0609/cognee into delete-last-acessed 2025-10-31 00:12:10 +05:30
chinu0609
5f6f0502c8 fix: removing last_acessed_at from individual model and adding it to DataPoint 2025-10-31 00:00:18 +05:30