Commit graph

4265 commits

Author SHA1 Message Date
Igor Ilic
59f758d5c2 feat: Add test for multi tenancy, add ability to share name for dataset across tenants for one user 2025-11-07 15:50:49 +01:00
Igor Ilic
b0a4f775f4 Merge branch 'multi-tenancy' of github.com:topoteretes/cognee into multi-tenancy 2025-11-06 19:12:29 +01:00
Igor Ilic
96c8bba580 refactor: Add db creation as step in MCP creation 2025-11-06 19:12:09 +01:00
Igor Ilic
5dbfea5084
Merge branch 'dev' into multi-tenancy 2025-11-06 18:55:18 +01:00
Igor Ilic
7dec6bfded refactor: Add migrations as part of python package 2025-11-06 18:10:04 +01:00
Igor Ilic
61e1c2903f fix: Remove issue with default user creation 2025-11-06 17:00:46 +01:00
Igor Ilic
bcc59cf9a0 fix: Remove default user creation 2025-11-06 16:57:59 +01:00
Igor Ilic
0d68175167 fix: remove database creation from migrations 2025-11-06 16:53:22 +01:00
Igor Ilic
efb46c99f9 fix: resolve issue with sqlite migration 2025-11-06 16:47:42 +01:00
Igor Ilic
c146de3a4d fix: Remove creation of database and db tables from env.py 2025-11-06 16:41:00 +01:00
Igor Ilic
ef3a382669 refactor: use batch insert for SQLite as well 2025-11-06 16:23:54 +01:00
lxobr
5bc83968f8
feature: text chunker with overlap (#1732)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->
- Implements `TextChunkerWithOverlap` with configurable
`chunk_overlap_ratio`
- Abstracts chunk_data generation via `get_chunk_data` callable
(defaults to `chunk_by_paragraph`)
- Parametrized tests verify `TextChunker` and `TextChunkerWithOverlap`
(0% overlap) produce identical output for all edge cases.
- Overlap-specific tests validate `TextChunkerWithOverlap` behavior.
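
The overlap idea in the bullets above can be illustrated with a minimal sketch. It is not the `TextChunkerWithOverlap` implementation from this PR: the function name `chunk_with_overlap`, its parameters, and the word-level granularity are assumptions made for illustration only. It shows how an overlap ratio turns into a sliding step, and why a ratio of 0 reduces to plain, non-overlapping chunking.

```python
from typing import List


def chunk_with_overlap(
    words: List[str], chunk_size: int, overlap_ratio: float
) -> List[List[str]]:
    """Split `words` into chunks of at most `chunk_size` items.

    Hypothetical helper: consecutive chunks share
    int(chunk_size * overlap_ratio) items.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0.0 <= overlap_ratio < 1.0:
        raise ValueError("overlap_ratio must be in [0, 1)")

    overlap = int(chunk_size * overlap_ratio)
    step = chunk_size - overlap  # always >= 1 because overlap < chunk_size

    chunks: List[List[str]] = []
    start = 0
    while start < len(words):
        chunks.append(words[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the input,
        # so no trailing chunk consists purely of repeated items.
        if start + chunk_size >= len(words):
            break
        start += step
    return chunks


words = list("abcdefghij")
no_overlap = chunk_with_overlap(words, chunk_size=4, overlap_ratio=0.0)
half_overlap = chunk_with_overlap(words, chunk_size=4, overlap_ratio=0.5)
```

With a ratio of 0 the chunks partition the input exactly, which mirrors the property the parametrized tests are described as checking.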

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-11-06 15:22:48 +01:00
Igor Ilic
ac751bacf0 fix: Resolve SQLite migration issue 2025-11-06 14:51:25 +01:00
Igor Ilic
ac6dd08855 fix: Resolve issue with sqlite index creation 2025-11-06 14:35:26 +01:00
hajdul88
c0e5ce04ce
Fix: fixes session history test for multiuser mode (#1746)
<!-- .github/pull_request_template.md -->

## Description
Fixes failing session history test

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-06 14:13:55 +01:00
Vasilije
69c7aa2559
test: add load tests (#1573)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->
Added a load test to our codebase. The test runs N adds of a PDF, then
cognifies them and runs N searches. Cognify and the searches are
timed, with constraints on how fast they should be. We can
tweak the thresholds if necessary; the current values are for the gpt-5-mini
model.
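
The shape of such a test can be sketched as follows. This is a hedged illustration, not the contents of `cognee/tests/test_load.py`: `run_load_test`, `cognify_step`, and `search_step` are hypothetical names, and the stand-in coroutines below do no real work. The sketch only shows the pattern of timing one processing step plus N concurrent searches per repetition and asserting on the average and the worst case.

```python
import asyncio
import time
from statistics import mean
from typing import Awaitable, Callable, List

Step = Callable[[], Awaitable[None]]


async def run_load_test(
    cognify_step: Step,
    search_step: Step,
    n_searches: int,
    reps: int,
    avg_limit_s: float,
    max_limit_s: float,
) -> List[float]:
    """Time `reps` repetitions of one cognify step plus concurrent searches."""
    durations: List[float] = []
    for _ in range(reps):
        start = time.perf_counter()
        await cognify_step()
        # Run all searches concurrently, as a load test would.
        await asyncio.gather(*(search_step() for _ in range(n_searches)))
        durations.append(time.perf_counter() - start)

    assert mean(durations) <= avg_limit_s, "average run too slow"
    assert max(durations) <= max_limit_s, "slowest run too slow"
    return durations


async def fake_cognify() -> None:
    # Stand-in for the real processing step (assumption).
    await asyncio.sleep(0)


async def fake_search() -> None:
    # Stand-in for a real search call (assumption).
    await asyncio.sleep(0)


durations = asyncio.run(
    run_load_test(
        fake_cognify,
        fake_search,
        n_searches=5,
        reps=2,
        avg_limit_s=8 * 60,   # 8 minutes
        max_limit_s=10 * 60,  # 10 minutes
    )
)
```

Keeping the limits as parameters makes it straightforward to retune them when the underlying model changes.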

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduce a load test for S3 ingest, cognify, and concurrent searches
with timing thresholds, and wire it into CI.
> 
> - **Tests**:
> - Add `cognee/tests/test_load.py` to measure end-to-end load: prunes
data/system, ingests from `s3://cognee-test-load-s3-bucket`, runs
`cognify` then concurrent GRAPH_COMPLETION searches, records timings
across reps, and asserts avg ≤ 8m and each run ≤ 10m.
> - **CI**:
> - Add `test-load` job in `.github/workflows/e2e_tests.yml`: installs
AWS deps, raises file descriptor limit, configures S3/env secrets, and
executes the new load test.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
c7598122bb. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2025-11-06 08:42:01 +01:00
Vasilije
8cc55ac0b2
refactor: Enable multi user mode by default if graph and vector db providers support it (#1695)

<!-- .github/pull_request_template.md -->

## Description
Enable multi user mode by default for supported graph and vector DBs

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [x] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-06 08:40:06 +01:00
Igor Ilic
ce64f242b7 refactor: add dropping of index as well 2025-11-05 18:04:05 +01:00
Igor Ilic
1ef5805c57 fix: Resolve issue with sync migration 2025-11-05 17:50:13 +01:00
Igor Ilic
9b6cbaf389 chore: Add multi tenant migration 2025-11-05 17:24:11 +01:00
Igor Ilic
c4807a0c67 refactor: Use user_tenants table to update 2025-11-05 16:14:37 +01:00
Igor Ilic
fa4c50f972 fix: Resolve issue with sync migration not working for postgresql 2025-11-05 16:05:33 +01:00
Vasilije
c7598122bb
Merge branch 'dev' into feature/cog-3165-add-load-tests 2025-11-05 14:49:29 +01:00
Igor Ilic
9fc4199958 fix: Resolve issue with cleaning acl table 2025-11-05 13:18:47 +01:00
Igor Ilic
1643b13c95 chore: add table creation for multi-tenancy to migration 2025-11-05 12:43:01 +01:00
Igor Ilic
6a7d8ba106
Merge branch 'dev' into multi-tenancy 2025-11-05 12:17:49 +01:00
hajdul88
eaf8d718b0
feat: introduces memify pipeline to save cache sessions into cognee (#1731)
<!-- .github/pull_request_template.md -->

## Description
This PR introduces a new memify pipeline to save cache sessions in
cognee. The QA sessions are added to the main knowledge base as separate
documents.


## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
None

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-05 10:27:54 +01:00
Igor Ilic
c2aaec2a82 refactor: Resolve issue with permissions example 2025-11-04 23:34:51 +01:00
Igor Ilic
7782f246d3 refactor: Update permissions example to work with new changes 2025-11-04 20:54:00 +01:00
Igor Ilic
f002d3bf0e refactor: Update permissions example 2025-11-04 20:24:16 +01:00
Igor Ilic
db2a32dd17 test: Resolve issue permission example 2025-11-04 19:17:02 +01:00
Igor Ilic
fb102f29a8 chore: Add alembic migration for multi-tenant system 2025-11-04 19:03:56 +01:00
Igor Ilic
a6487cfdc1
Merge branch 'dev' into multi-tenancy 2025-11-04 18:01:19 +01:00
Igor Ilic
bee2fe3ba7
feat: Add initial custom pipeline (#1716)
<!-- .github/pull_request_template.md -->

## Description
Add run_custom_pipeline to have a way to execute a custom collection of tasks in Cognee

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-04 17:58:34 +01:00
Igor Ilic
cd32b492a4 refactor: Add filtering of non current tenant results when authorizing dataset 2025-11-04 17:56:01 +01:00
Igor Ilic
f4117c42e9 fix: Resolve issue with entity extraction test 2025-11-04 16:43:41 +01:00
Igor Ilic
69ee8ae0a9 Merge branch 'multi-tenancy' of github.com:topoteretes/cognee into multi-tenancy 2025-11-04 16:42:55 +01:00
Igor Ilic
64c7b857d6
Merge branch 'dev' into multi-tenancy 2025-11-04 16:42:51 +01:00
Igor Ilic
9d771acc24 refactor: filter out search results 2025-11-04 13:35:50 +01:00
Vasilije
d54cd85575
CI: Limit deletion integration tests to 60 minutes (#1730)
<!-- .github/pull_request_template.md -->

## Description
Added a test timeout for the Soft and Hard deletion integration tests. They
occasionally hang. This is a temporary solution to unblock the CI. The
cause of the hangs will be addressed in a separate issue.
## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [x] Other (please specify):
CI improvement
## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-04 13:31:31 +01:00
Vasilije
0ef2d51246
fix: Add logs to docker (#1656)
<!-- .github/pull_request_template.md -->

## Description
Adds detailed logging to docker

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-11-04 13:31:00 +01:00
Igor Ilic
ea675f29d6 fix: Resolve typo in accessing dictionary for dataset_id 2025-11-04 13:15:49 +01:00
Igor Ilic
ac257dca1d refactor: Account for async change for identify function 2025-11-04 13:13:42 +01:00
Igor Ilic
ff388179fb feat: Add dataset_id calculation that handles legacy dataset_id 2025-11-04 13:11:57 +01:00
Igor Ilic
b0f85c9e99 feat: add legacy and modern data_id calculating 2025-11-04 13:01:10 +01:00
Igor Ilic
e3b707a0c2 refactor: Change variable names, add setting of current tenant to be optional for tenant creation 2025-11-04 12:20:17 +01:00
Pavel Zorin
e11a8b9a51 CI: Added timeouts for all OS tests 2025-11-04 12:13:48 +01:00
Igor Ilic
46c509778f refactor: Rename access control functions 2025-11-04 12:06:16 +01:00
Igor Ilic
53521c2068
Update cognee/context_global_variables.py
Co-authored-by: Pavel Zorin <pazonec@yandex.ru>
2025-11-03 19:42:51 +01:00
Igor Ilic
c81d06d364
Update cognee/context_global_variables.py
Co-authored-by: Pavel Zorin <pazonec@yandex.ru>
2025-11-03 19:37:52 +01:00