Commit graph

4598 commits

Author SHA1 Message Date
hajdul88
4ac40de95e
Merge branch 'dev' into feature/cog-3532-empower-test_search-db-retrievers-tests-reorg 2025-12-15 15:21:19 +01:00
Vasilije
69e36cc834
feat: add bedrock as supported llm provider (#1830)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->
Added support for AWS Bedrock, and the models that are available there.
This was a contributor PR that was never finished, so now I polished it
up and made it work.

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added AWS Bedrock as a new LLM provider with support for multiple
authentication methods.
* Integrated three new AI models: Claude 4.5 Sonnet, Claude 4.5 Haiku,
and Amazon Nova Lite.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-15 14:33:57 +01:00
Vasilije
d5bd75c627
test: remove anything codify related from mcp test (#1890)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->
After removing codify, I didn't remove codify-related stuff from mcp
test. Hopefully now I've removed everything that was failing.

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
  * Removed test coverage for codify and developer rules functionality.
* Updated search functionality test to exclude additional search type
filtering.

* **Chores**
  * Increased default logging verbosity from ERROR to INFO level.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-15 14:33:22 +01:00
Igor Ilic
c94225f505
fix: make ontology key an optional param in cognify (#1894)
<!-- .github/pull_request_template.md -->

## Description
Make ontology key optional in Swagger and None by default (it was
"string" by default before change which was causing issues when running
cognify endpoint)

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
* Enhanced API documentation with additional examples and validation
metadata to improve request clarity and validation guidance.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-15 14:30:22 +01:00
hajdul88
fc89f71e7c fix: fixes hotpot and twowiki tests (that are using url to download dataset) 2025-12-12 15:57:49 +01:00
hajdul88
499c717f85 chore: deletes unit tests (splitting PR into two) 2025-12-12 14:43:14 +01:00
hajdul88
3a48930c3b
Merge branch 'dev' into feature/cog-3532-empower-test_search-db-retrievers-tests-reorg 2025-12-12 14:25:01 +01:00
Andrej Milicevic
116b6f1eeb chore: formatting 2025-12-12 13:46:16 +01:00
Andrej Milicevic
a225d7fc61 test: revert some changes 2025-12-12 13:44:58 +01:00
Igor Ilic
127d9860df
feat: Add dataset database handler info (#1887)
<!-- .github/pull_request_template.md -->

## Description
Add info on dataset database handler used for dataset database

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Datasets now record their assigned vector and graph database handlers,
allowing per-dataset backend selection.

* **Chores**
  * Database schema expanded to store handler identifiers per dataset.
* Deletion/cleanup processes now use dataset-level handler info for
accurate removal across backends.

* **Tests**
* Tests updated to include and validate the new handler fields in
dataset creation outputs.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-12 13:22:03 +01:00
Igor Ilic
ede884e0b0
feat: make pipeline processing cache optional (#1876)
<!-- .github/pull_request_template.md -->

## Description
Make the pipeline cache mechanism optional, have it turned off by
default but use it for add and cognify like it has been used until now

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [ x I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Introduced pipeline caching across ingestion, processing, and custom
pipeline flows with per-run controls to enable or disable caching.
  * Added an option for incremental loading in custom pipeline runs.

* **Behavior Changes**
* One pipeline path now explicitly bypasses caching by default to always
re-run when invoked.
* Disabling cache forces re-processing instead of early exit; cache
reset still enables re-execution.

* **Tests**
* Added tests validating caching, non-caching, and cache-reset
re-execution behavior.

* **Chores**
  * Added CI job to run pipeline caching tests.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-12 13:11:31 +01:00
Andrej Milicevic
a337f4e54c test: testing logger 2025-12-12 13:02:55 +01:00
Andrej Milicevic
bce6094010 test: change logger 2025-12-12 12:43:54 +01:00
Andrej Milicevic
c48b274571 test: remove delete error from mcp test 2025-12-12 11:53:40 +01:00
Andrej Milicevic
3b8a607b5f test: fix errors in mcp test 2025-12-12 11:37:27 +01:00
hajdul88
3ce5b2da4c
Merge branch 'dev' into feature/cog-3532-empower-test_search-db-retrievers-tests-reorg 2025-12-12 09:02:31 +01:00
Igor Ilic
7b3d997a06
Merge main vol7 (#1891)
<!-- .github/pull_request_template.md -->

## Description
Add commits from main to dev branch

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Refactor**
* Removed permission validation checks from the data processing
pipeline, streamlining the overall workflow and reducing processing
steps.
* Updated task sequences across task handlers to reflect the removal of
the validation step.

* **Documentation**
* Updated processing pipeline documentation and example code to reflect
the new streamlined task sequence.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-11 20:11:54 +01:00
Igor Ilic
59f8d12fa3 Merge branch 'main' into merge-main-vol7 2025-12-11 19:11:24 +01:00
Andrej Milicevic
e211e66275 chore: remove quick option to isolate mcp CI test 2025-12-11 18:29:17 +01:00
Andrej Milicevic
0f50c993ac chore: add quick option to isolate mcp CI test 2025-12-11 18:20:07 +01:00
Andrej Milicevic
248ba74592 test: remove codify-related stuff from mcp test 2025-12-11 18:18:42 +01:00
hajdul88
994d802eba
Merge branch 'dev' into feature/cog-3532-empower-test_search-db-retrievers-tests-reorg 2025-12-11 14:32:21 +01:00
Igor Ilic
46ddd4fd12
feat: add dataset database handler logic and neo4j/lancedb/kuzu handlers (#1776)
<!-- .github/pull_request_template.md -->

## Description
Add ability to use multi tenant multi user mode with Neo4j

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **New Features**
* Multi-user support with per-dataset database isolation enabled by
default, allowing backend access control for secure data separation.
* Configurable database handlers via environment variables
(GRAPH_DATASET_DATABASE_HANDLER, VECTOR_DATASET_DATABASE_HANDLER) for
flexible deployment options.

* **Chores**
* Database schema migration to support per-user dataset database
configurations.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-11 14:15:20 +01:00
hajdul88
eecf1fb11d adds unit test for graph summary completion retriever 2025-12-11 13:23:48 +01:00
hajdul88
dbde7427ef adds unit test for user qa feedback 2025-12-11 13:23:29 +01:00
hajdul88
303fcd1803 fix linting 2025-12-11 13:18:25 +01:00
hajdul88
4d0dca0b6c ruff 2025-12-11 13:13:12 +01:00
hajdul88
ca593bba15 adds full coverage to conv history save 2025-12-11 13:13:01 +01:00
hajdul88
5f6560d6a7 adds 100% coverage to completion 2025-12-11 13:12:44 +01:00
Igor Ilic
0a1ed79340 refactor: change neo4j_aura to neo4j_aura_dev 2025-12-11 13:05:23 +01:00
hajdul88
22af6a050a adds 100% coverage to brute force triplet search 2025-12-11 12:56:42 +01:00
hajdul88
c2e402dbab adds 100% coverage to graph completion retriever 2025-12-11 12:53:23 +01:00
hajdul88
4fd883b0fd chore: removes dead code 2025-12-11 12:53:02 +01:00
hajdul88
0d0fe3aa30 ruff 2025-12-11 12:27:27 +01:00
hajdul88
c5c60ccad0 adds 100% cov to temporal retriever 2025-12-11 12:27:14 +01:00
hajdul88
430df0db15 feat: adds 100 cov to cot 2025-12-11 12:13:51 +01:00
hajdul88
2531e50a56 fix linting 2025-12-11 11:47:02 +01:00
hajdul88
88307ce382 increases coverage in context extension retriever 2025-12-11 11:43:19 +01:00
hajdul88
670a0fbb69 increases coverage for cot completion retriever 2025-12-11 11:38:11 +01:00
hajdul88
36e82909dc fix linting 2025-12-11 11:05:21 +01:00
hajdul88
3d99db256e Merge branch 'feature/cog-3532-empower-test_search-db-retrievers-tests-reorg' of github.com:topoteretes/cognee into feature/cog-3532-empower-test_search-db-retrievers-tests-reorg 2025-12-11 11:01:13 +01:00
hajdul88
75436eeae1 increasing the coverage of chunks retriever 2025-12-11 11:00:48 +01:00
hajdul88
aa263142b8 increasing coverage of graph completion 2025-12-11 10:58:58 +01:00
hajdul88
a75cb07aec
Merge branch 'dev' into feature/cog-3532-empower-test_search-db-retrievers-tests-reorg 2025-12-11 10:54:46 +01:00
Pavel Zorin
fe7e97be45
Chore: Remove Ontology file size limit. Code duplications (#1880)
<!-- .github/pull_request_template.md -->

## Description
We received a complaint about the 10MB file size limit. 
Removed code duplications
More strict types
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Support for supplying optional per-file descriptions when uploading
multiple ontologies.

* **Improvements**
* Removed the 10MB file size limit for ontology uploads, allowing larger
files.
* Streamlined and more robust upload handling with improved per-file
validation and safer upload behavior.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-11 10:49:55 +01:00
hajdul88
ef042d9a29 feat: increases coverage for rag completion retriever 2025-12-11 09:49:42 +01:00
hajdul88
610b30579b chore: updated default value in triplet retriever constructor 2025-12-10 19:10:01 +01:00
hajdul88
ee00af9266 feat: increasing the coverage of summaries retriever 2025-12-10 19:04:28 +01:00
hajdul88
21a84d3100 feat: increasing coverage of temporal retriever 2025-12-10 18:58:13 +01:00
hajdul88
29f94e39b9 feat: increasing coverage in triplet retriever 2025-12-10 18:53:00 +01:00