Commit graph

3396 commits

Author SHA1 Message Date
Geoff-Robin
9d801f5fe0 Done creating models.py and ingest_database_schema.py 2025-09-27 00:16:44 +02:00
Boris
726d4d8535
fix: limit onnxruntime version (#1473)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-09-26 10:21:20 +02:00
Boris Arzentar
1deab2d54e
fix: limit onnxruntime version 2025-09-26 09:57:53 +02:00
Vasilije
8246a6a02f
fix: Remove creation of default user during search (#1455)
<!-- .github/pull_request_template.md -->

## Description
Removed default user creation during brute force search. Even when a
user is passed to search, it is not forwarded to the retrievers. The
retrievers always created a default user and sent telemetry as that
user, which is inaccurate, and they created a default user even when
there shouldn't be one.

If this information is needed for telemetry, we should forward the user
that was passed to search through the retrievers instead of always
creating a default user.

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [x] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Changes Made
Removed `user` as a parameter of brute force search, and removed the
default user creation whose result was passed as that parameter.

## Testing
Ran simple example, waiting for CI/CD results

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-09-25 21:11:42 +02:00
Vasilije
9b1d2bfdf9
feat: cors issue if frontend runs on any port other than 3000 (#1457)
<!-- .github/pull_request_template.md -->

## Description
<!-- 
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Changes Made
<!-- List the specific changes made in this PR -->
- 
- 
- 

## Testing
<!-- Describe how you tested your changes -->

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## Related Issues
<!-- Link any related issues using "Fixes #issue_number" or "Relates to
#issue_number" -->

## Additional Notes
<!-- Add any additional notes, concerns, or context for reviewers -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-09-25 21:11:01 +02:00
Boris
013baa3bfc
Merge branch 'dev' into remove-default-user-search 2025-09-25 21:02:00 +02:00
Boris
09fc566504
Merge branch 'dev' into feature/cog-2728-bug-cors-issue-if-frontend-runs-on-any-port-other-than-3000 2025-09-25 21:01:50 +02:00
Igor Ilic
f6254aa5fa
fix: Resolve issue with only_context [COG-3032] (#1452)
<!-- .github/pull_request_template.md -->

## Description
Resolve two issues with only-context search: it did not work when
backend access control was disabled, and it was not forwarded when
backend access control was enabled.

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Testing
Tested in local SaaS by calling search endpoint on different datasets
with different parameters

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## Related Issues
Fixes issue #COG-3032

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-09-25 20:58:33 +02:00
Boris
ad694ff9fe
Merge branch 'dev' into remove-default-user-search 2025-09-25 18:53:09 +02:00
Boris
f041589d20
Merge branch 'dev' into feature/cog-2728-bug-cors-issue-if-frontend-runs-on-any-port-other-than-3000 2025-09-25 18:53:06 +02:00
Boris
b50556b877
Merge branch 'dev' into fix-only-context 2025-09-25 18:53:03 +02:00
Boris
668dd933ff
chore: Limit pylance to 0.36 because of intel MacOS13 (#1468)
<!-- .github/pull_request_template.md -->

## Description
The latest pylance version, 0.37, does not support Intel-architecture
macOS, so limit the pylance version until support for macOS 13 can be
dropped.

A similar problem came up with the ruff release published 10 minutes
ago, so ruff is also limited in this PR.

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-09-25 18:52:45 +02:00
Igor Ilic
f09376429b refactor: Remove telemetry call 2025-09-25 17:39:29 +02:00
Igor Ilic
3b9415ee88 test: Resolve failing unit tests 2025-09-25 17:27:11 +02:00
Igor Ilic
f2edfaa9b9 refactor: Add scikit learn for evals 2025-09-25 17:23:14 +02:00
Igor Ilic
88655031ff chore: Remove scikit dependency 2025-09-25 17:14:56 +02:00
Boris
ade412cd9e
Merge branch 'dev' into remove-default-user-search 2025-09-25 17:10:15 +02:00
Boris
2e8e02bf59
Merge branch 'dev' into feature/cog-2728-bug-cors-issue-if-frontend-runs-on-any-port-other-than-3000 2025-09-25 17:09:35 +02:00
Igor Ilic
71e1070820 Merge branch 'dev' into pylance-fix 2025-09-25 17:08:28 +02:00
Boris
91f21247e0
Merge branch 'dev' into fix-only-context 2025-09-25 17:06:39 +02:00
Igor Ilic
bb0ae06a0a Merge branch 'dev' of github.com:topoteretes/cognee into dev 2025-09-25 17:04:50 +02:00
Igor Ilic
bcc1747ab8 chore: resolve ruff v0.13.2 support issue 2025-09-25 17:02:58 +02:00
Vasilije
997b85e1ce
fix: Cog 2826 clean up poetry (#1305)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-09-25 16:56:26 +02:00
Igor Ilic
cf3f5945e7 chore: Limit pylance to 0.36 for MacOS13 2025-09-25 16:44:55 +02:00
Igor Ilic
4054307b15 refactor: Remove comment 2025-09-25 16:03:11 +02:00
Igor Ilic
50032dd133 fix: install aws for gh action 2025-09-25 16:02:30 +02:00
Igor Ilic
664459e239 refactor: Install baml only for BAML test 2025-09-25 15:30:27 +02:00
Igor Ilic
61ef6fa444 chore: Update pyproject 2025-09-25 15:26:10 +02:00
Igor Ilic
6f8f9bf7de refactor: make comment more understandable 2025-09-25 13:58:52 +02:00
Igor Ilic
8265ec0334 refactor: Add missing install info 2025-09-25 13:57:14 +02:00
Igor Ilic
d1724c710b refactor: Add proper pip install command for optional extras 2025-09-25 13:55:01 +02:00
Igor Ilic
ca2e63bd84 refactor: Move postgres handling to database creation time 2025-09-25 13:49:04 +02:00
Igor Ilic
d2d0d0de4e refactor: install cognee defined baml version for CI/CD 2025-09-25 13:32:09 +02:00
Igor Ilic
8cbc3eb877 Merge branch 'dev' into COG-2826 2025-09-25 13:31:21 +02:00
Igor Ilic
fbe0b6e2ce
Merge branch 'dev' into fix-only-context 2025-09-25 11:50:41 +02:00
Boris Arzentar
9715c0106e
fix: add env variable for changing frontend app url 2025-09-24 10:49:40 +02:00
Chaitany
03858bc06b
fix: Fixes get_filtered_graph_data in kuzu adapter
The output format is now the same as the other adapters' get_filtered_graph_data.

<!-- .github/pull_request_template.md -->

## Description
<!-- 
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->


## Type of Change
<!-- Please check the relevant option -->
- [-] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Changes Made
<!-- List the specific changes made in this PR -->
Only minimal changes were made, all in the adapter file for the kuzu database.

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [-] **I have tested my changes thoroughly before submitting this PR**
- [-] **This PR contains minimal changes necessary to address the
issue/feature**
- [-] My code follows the project's coding standards and style
guidelines
- [-] I have added tests that prove my fix is effective or that my
feature works
- [-] I have added necessary documentation (if applicable)
- [-] All new and existing tests pass
- [-] I have searched existing PRs to ensure this change hasn't been
submitted already
- [-] I have linked any relevant issues in the description
- [-] My commits have clear and descriptive messages

## Related Issues
#1436

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-09-24 09:43:23 +02:00
Igor Ilic
a9c507b36e fix: Remove creation of default user during search 2025-09-23 18:43:05 +02:00
Daulet Amirkhanov
38b83a5ec1
fix: handle reasoning_effort gracefully across models (#1447)
<!-- .github/pull_request_template.md -->

## Description
The async LLM client fails with non-reasoning models like gpt-4o with
the error: `Completion error: litellm.BadRequestError: OpenAIException -
Unrecognized request argument supplied: reasoning_effort`.

This PR sets the `drop_params` config to True to drop all unsupported
model configs instead of catching the error with a try-except block.

Additionally, `reasoning_effort` wasn't being set for the sync client.
This PR sets the parameter for both async and sync for consistency.

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Changes Made
- Set `litellm.drop_params=True` to auto-drop unsupported parameters
- Changed `reasoning_effort` from `extra_body` to a direct parameter
- Add `reasoning_effort` to the sync client
- Removed redundant retry for `reasoning_effort`
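
The effect of dropping unsupported parameters can be sketched in plain Python. This is an illustration only and does not call litellm: `SUPPORTED_PARAMS`, `build_request`, and the model name `reasoning-model` are hypothetical, and the capability table is made up for the example.

```python
# Hypothetical capability table: optional parameters each model accepts.
SUPPORTED_PARAMS = {
    "gpt-4o": {"temperature", "max_tokens"},
    "reasoning-model": {"max_tokens", "reasoning_effort"},
}


def build_request(model, messages, drop_params=True, **optional):
    """Build a request dict, dropping or rejecting unsupported optional parameters."""
    supported = SUPPORTED_PARAMS.get(model, set())
    unsupported = sorted(set(optional) - supported)
    if unsupported and not drop_params:
        # Without dropping, an unsupported argument is an error.
        raise ValueError(
            "Unrecognized request argument supplied: " + ", ".join(unsupported)
        )
    # With dropping enabled, unsupported arguments are silently removed.
    kept = {name: value for name, value in optional.items() if name in supported}
    return {"model": model, "messages": messages, **kept}
```

With dropping enabled, the same call site can pass `reasoning_effort` to every model, which is why a retry on failure is no longer needed in this sketch.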

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-09-23 16:16:46 +01:00
Daulet Amirkhanov
726f49c8ab
Merge branch 'dev' into fix/reasoning-effort-parameter 2025-09-23 16:01:22 +01:00
Igor Ilic
afa47c28b0 fix: Resolve issue with only_context 2025-09-23 13:41:12 +02:00
Vasilije
f3e04142ca
fix: added auto tagging (#1424)
<!-- .github/pull_request_template.md -->

## Description

Added auto tagging so that core team PRs always get the same label.

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [x] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-09-23 13:17:03 +02:00
Vasilije
c329e3a1b4
Baml refactor (#1354)
<!-- .github/pull_request_template.md -->

## Description
Refactor BAML and LLMGateway to reduce code duplication and allow
dynamic response model generation for BAML

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-09-22 11:58:00 +02:00
Igor Ilic
87c79b52e3 chore: format files 2025-09-22 11:33:19 +02:00
Igor Ilic
023f5ea632 Merge branch 'dev' into baml-refactor 2025-09-22 11:25:59 +02:00
Igor Ilic
f4a7945473 refactor: Move creation of baml dynamic type to own file 2025-09-22 11:20:48 +02:00
oryx1729
1f63b6db55 Make ruff happy 2025-09-20 17:02:31 -07:00
oryx1729
766b300fbc fix: handle reasoning_effort parameter gracefully across models
- Set litellm.drop_params=True to auto-drop unsupported parameters
- Changed reasoning_effort from extra_body to direct parameter
- Added reasoning_effort to both async and sync methods
- Removed redundant retry logic for unsupported parameters
- Ensures compatibility with models that don't support reasoning_effort

This fixes errors when using models that don't support the reasoning_effort
parameter while maintaining the optimization for models that do support it.
2025-09-20 16:35:51 -07:00
Vasilije
f14751bca7
Clarify UI running instructions in README
Updated instructions to include setting LLM_API_KEY before running the UI.
2025-09-19 18:26:19 +02:00
Chaitany
96eb0d448a
feat(#1357): Lexical chunk retriever (#1392)
<!-- .github/pull_request_template.md -->

## Description
I implemented a lexical chunk retriever. The LexicalRetriever class
inherits from BaseRetriever. The DocumentChunks are lazy-loaded the
first time a query is made, which saves time during object
initialization. The get_context and get_completion functions are
implemented the same way as in ChunksRetriever; the only difference is
that the DocumentChunks are converted to match the output type of
ChunksRetriever using the get_own_properties function in utils.
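
The lazy-loading idea can be sketched as follows. This is a minimal illustration under assumptions, not the code from this PR: `LazyChunkIndex`, `loader`, and `search` are hypothetical names, and chunks are plain strings here rather than DocumentChunk objects.

```python
class LazyChunkIndex:
    """Defers loading chunks until the first query (hypothetical sketch)."""

    def __init__(self, loader):
        # Store only the callable; construction does no loading work.
        self._loader = loader
        self._chunks = None

    def _ensure_loaded(self):
        # Load once, on first use, and reuse the cached list afterwards.
        if self._chunks is None:
            self._chunks = list(self._loader())

    def search(self, term):
        self._ensure_loaded()
        needle = term.lower()
        return [chunk for chunk in self._chunks if needle in chunk.lower()]
```

Constructing the object stays cheap, and the loading cost is paid once by the first query.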

## Type of Change
<!-- Please check the relevant option -->
- [-] Bug fix (non-breaking change that fixes an issue)
- [-] New feature (non-breaking change that adds functionality)
- [-] Breaking change (fix or feature that would cause existing
functionality to change)
- [-] Documentation update
- [-] Code refactoring
- [-] Performance improvement
- [-] Other (please specify):

## Changes Made
<!-- List the specific changes made in this PR -->
- Added LexicalRetriever base class with customizable tokenizer & scorer
- Implemented caching of DocumentChunk tokens and payloads
- Added robust initialization with error handling and logging
- Implemented get_context with top_k ranking and optional scores
- Implemented get_completion consistent with BaseRetriever interface
- Added JaccardChunksRetriever demo using set/multiset Jaccard
similarity
- Support for stopwords and multiset frequency-aware similarity
- Integrated logging for initialization, scoring, and retrieval
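
Set and multiset Jaccard scoring with top_k ranking can be sketched as below. This is a self-contained illustration, not the JaccardChunksRetriever code itself: `tokenize`, `jaccard`, and `top_k` are hypothetical helper names, and whitespace tokenization is an assumption.

```python
from collections import Counter


def tokenize(text, stopwords=frozenset()):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [tok for tok in text.lower().split() if tok not in stopwords]


def jaccard(query_tokens, chunk_tokens, multiset=False):
    """Jaccard similarity: |intersection| / |union|, optionally frequency-aware."""
    if multiset:
        q, c = Counter(query_tokens), Counter(chunk_tokens)
        inter = sum((q & c).values())  # minimum count per token
        union = sum((q | c).values())  # maximum count per token
    else:
        q, c = set(query_tokens), set(chunk_tokens)
        inter = len(q & c)
        union = len(q | c)
    return inter / union if union else 0.0


def top_k(query, chunks, k=2, multiset=False, stopwords=frozenset()):
    """Return the k highest-scoring (score, chunk) pairs."""
    q = tokenize(query, stopwords)
    scored = [(jaccard(q, tokenize(ch, stopwords), multiset), ch) for ch in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The multiset variant counts repeated tokens, so a chunk that repeats a query term scores differently than it does under the plain set variant.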

## Testing

- Manual tests: initialized retriever, retrieved chunks with toy corpus
- Edge cases: empty corpus, empty query, scorer/tokenizer errors
- Verified Jaccard similarity results for single/multiset cases
- Code formatted and linted


## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [-] **I have tested my changes thoroughly before submitting this PR**
- [-] **This PR contains minimal changes necessary to address the
issue/feature**
- [-] My code follows the project's coding standards and style
guidelines
- [-] I have added tests that prove my fix is effective or that my
feature works
- [-] I have added necessary documentation (if applicable)
- [-] All new and existing tests pass
- [-] I have searched existing PRs to ensure this change hasn't been
submitted already
- [-] I have linked any relevant issues in the description
- [-] My commits have clear and descriptive messages

## Related Issues
Relates to #1392

## Additional Notes
In cognee/modules/chunking/models/DocumentChunk.py, keep the
is_part_of attribute optional; don't remove the optional from it.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: Andrej Milicevic <milicevicandrej@yahoo.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Igor Ilic <igorilic03@gmail.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
2025-09-19 18:24:33 +02:00