Commit graph

4492 commits

Author SHA1 Message Date
hajdul88
c5c60ccad0 adds 100% cov to temporal retriever 2025-12-11 12:27:14 +01:00
hajdul88
430df0db15 feat: adds 100 cov to cot 2025-12-11 12:13:51 +01:00
hajdul88
2531e50a56 fix linting 2025-12-11 11:47:02 +01:00
hajdul88
88307ce382 increases coverage in context extension retriever 2025-12-11 11:43:19 +01:00
hajdul88
670a0fbb69 increases coverage for cot completion retriever 2025-12-11 11:38:11 +01:00
hajdul88
36e82909dc fix linting 2025-12-11 11:05:21 +01:00
hajdul88
3d99db256e Merge branch 'feature/cog-3532-empower-test_search-db-retrievers-tests-reorg' of github.com:topoteretes/cognee into feature/cog-3532-empower-test_search-db-retrievers-tests-reorg 2025-12-11 11:01:13 +01:00
hajdul88
75436eeae1 increasing the coverage of chunks retriever 2025-12-11 11:00:48 +01:00
hajdul88
aa263142b8 increasing coverage of graph completion 2025-12-11 10:58:58 +01:00
hajdul88
a75cb07aec Merge branch 'dev' into feature/cog-3532-empower-test_search-db-retrievers-tests-reorg 2025-12-11 10:54:46 +01:00
Pavel Zorin
fe7e97be45
Chore: Remove Ontology file size limit. Code duplications (#1880)

## Description
We received a complaint about the 10MB file size limit on ontology uploads, so this PR removes it.
- Removed duplicated code
- Stricter types

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Support for supplying optional per-file descriptions when uploading
    multiple ontologies.

* **Improvements**
  * Removed the 10MB file size limit for ontology uploads, allowing larger
    files.
  * Streamlined and more robust upload handling with improved per-file
    validation and safer upload behavior.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-11 10:49:55 +01:00
hajdul88
ef042d9a29 feat: increases coverage for rag completion retriever 2025-12-11 09:49:42 +01:00
hajdul88
610b30579b chore: updated default value in triplet retriever constructor 2025-12-10 19:10:01 +01:00
hajdul88
ee00af9266 feat: increasing the coverage of summaries retriever 2025-12-10 19:04:28 +01:00
hajdul88
21a84d3100 feat: increasing coverage of temporal retriever 2025-12-10 18:58:13 +01:00
hajdul88
29f94e39b9 feat: increasing coverage in triplet retriever 2025-12-10 18:53:00 +01:00
hajdul88
9d9a388804 feat: adds unit test for context extension retriever and chunks retriever 2025-12-10 18:31:03 +01:00
hajdul88
9d900f48cd feat: adds unit test for cot retriever 2025-12-10 18:28:11 +01:00
hajdul88
a8c999be12 feat: adds unit test for graph completion 2025-12-10 18:18:53 +01:00
hajdul88
f99dd140fe feat: adds unit test for rag completion retriever 2025-12-10 18:18:40 +01:00
hajdul88
ff1add1af0 ruff 2025-12-10 18:18:24 +01:00
hajdul88
0329777290 feat: adds unit test to triplet retriever test without session and with provided context 2025-12-10 18:14:49 +01:00
hajdul88
ac58058eaa ruff ruff 2025-12-10 18:07:22 +01:00
hajdul88
508086c513 feat: adds unit test for summaries retriever 2025-12-10 18:07:07 +01:00
hajdul88
ab6a1d1b5b feat: adds additional unit tests for temporal retriever 2025-12-10 17:57:40 +01:00
Pavel Zorin
88f61f9bdb Added filename check 2025-12-10 17:24:31 +01:00
hajdul88
1fef4c1ab3 Merge branch 'dev' into feature/cog-3532-empower-test_search-db-retrievers-tests-reorg 2025-12-10 17:10:21 +01:00
hajdul88
001fbe699e
feat: Adds edge centered payload and embedding structure during ingestion (#1853)

## Description
This pull request introduces edge-centered payloads to the ingestion
process. Payloads are stored in the Triplet_text collection, which is
compatible with the triplet_embedding memify pipeline.

Changes in this PR:

- Refactored custom edge handling: custom edges can now be passed to the
add_data_points method, so ingestion is centralized in one place.
- Added private methods in add_data_points.py to handle edge-centered
payload creation
- Added unit tests to cover the new functionality
- Added integration tests
- Added e2e tests
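
As a rough illustration of the edge-centered payload idea described above, the sketch below builds one text payload per edge from the edge's relationship name and the text of its two connected nodes. All names here (`Node`, `Edge`, `build_triplet_payloads`, the payload keys) are hypothetical and are not taken from cognee's `add_data_points.py`; the actual payload shape in this PR may differ.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Node:
    """Hypothetical graph node carrying the text to embed."""
    id: str
    text: str


@dataclass(frozen=True)
class Edge:
    """Hypothetical directed edge between two node ids."""
    source_id: str
    relationship_name: str
    target_id: str


def build_triplet_payloads(nodes: Iterable[Node], edges: Iterable[Edge]) -> List[Dict[str, str]]:
    """Build one embeddable text payload per edge (source text, relation, target text)."""
    nodes_by_id = {node.id: node for node in nodes}
    payloads = []
    for edge in edges:
        source = nodes_by_id.get(edge.source_id)
        target = nodes_by_id.get(edge.target_id)
        if source is None or target is None:
            # Skip edges whose endpoints are unknown; this is an assumption of the sketch.
            continue
        payloads.append(
            {
                "id": f"{edge.source_id}:{edge.relationship_name}:{edge.target_id}",
                "text": f"{source.text} {edge.relationship_name} {target.text}",
            }
        )
    return payloads
```

In this sketch the number of payloads equals the number of edges whose endpoints are both known, which mirrors the triplet-count check in the acceptance criteria.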

Acceptance Criteria and Testing

Scenario 1:
- Set the TRIPLET_EMBEDDING env var to True
- Run prune, add, cognify
- Verify the vector DB contains a non-empty Triplet_text collection and
that the number of triplets matches the number of edges in the
graph database
- Use the new triplet_completion search type and confirm it works
correctly

Scenario 2:
- Set the TRIPLET_EMBEDDING env var to False
- Run prune, add, cognify
- Verify the vector DB does not have the Triplet_text collection
- You should receive an error indicating that the Triplet_text collection is not
available


## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Triplet embeddings supported: embeddings are created from graph edges plus
    connected node text
  * Ability to supply custom edges when adding data points
  * New configuration toggle to enable/disable triplet embedding

* **Tests**
  * Added comprehensive unit and end-to-end tests for edge-centered
    payloads and triplet embedding
  * New CI job to run the edge-centered payload e2e test

* **Bug Fixes**
  * Adjusted server start behavior to surface process output in parent
    logs

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Pavel Zorin <pazonec@yandex.ru>
2025-12-10 17:10:06 +01:00
hajdul88
4791d255be feat: adds integration test for temporal retriever 2025-12-10 14:49:59 +01:00
hajdul88
a14dacdc0f Update test_chunks_retriever.py 2025-12-10 14:03:40 +01:00
hajdul88
b4fb4ce49b ruff ruff 2025-12-10 11:51:12 +01:00
hajdul88
714eeaffc4 feat: adds some new tests to the triplet retriever int test 2025-12-10 11:48:18 +01:00
hajdul88
2896bd10ee feat: adds asserts to summaries retriever int test 2025-12-10 11:47:51 +01:00
hajdul88
b0d806526a feat: adds missing checks to rag completion test 2025-12-10 11:47:29 +01:00
hajdul88
e01bd80cc9 feat: adds missing checks to cot integration test 2025-12-10 11:47:10 +01:00
hajdul88
6a057711a2 feat: adds missing tests to context extension retriever integration test 2025-12-10 11:46:37 +01:00
hajdul88
723bcd70a2 feat: adds missing checks to graph completion retriever integration test 2025-12-10 11:46:16 +01:00
hajdul88
49f4938e11 feat: adds missing checks to chunks retriever 2025-12-10 11:45:37 +01:00
hajdul88
2bbaf8b6a0 feat: adds chunks retriever tests with new fixture structure 2025-12-10 11:21:22 +01:00
hajdul88
85014eaac3 feat: adds context extension + COT graph completion tests with new fixture structure 2025-12-10 11:20:56 +01:00
hajdul88
e7f3e851c0 feat: adds graph completion retriever tests with new fixture 2025-12-10 11:20:29 +01:00
hajdul88
48a3da6ff0 feat: adds rag completion retriever with restructured fixture 2025-12-10 11:20:06 +01:00
hajdul88
8199274298 feat: adds test_structured_output integration test with new fixture 2025-12-10 11:19:12 +01:00
hajdul88
3ac0e980f0 feat: adds summaries retriever with new fixture 2025-12-10 11:18:26 +01:00
hajdul88
7961e96710 chore: removes integration tests that pretended to be unit tests 2025-12-10 11:00:20 +01:00
Pavel Zorin
2ca194c28f fix format 2025-12-09 18:22:44 +01:00
Pavel Zorin
d932ee4bd9 Specify file type 2025-12-09 17:58:34 +01:00
Pavel Zorin
d0b914acaa Chore: Remove Ontology file size limit. Code duplications 2025-12-09 17:55:43 +01:00
Vasilije
49f7c5188c
feat: avoid double edge vector search in triplet search (#1877)

## Description
Eliminates double vector search for edges by ensuring all edge lookups
happen once in the retrieval layer.
- `brute_force_triplet_search`: Always includes
"EdgeType_relationship_name" in collections
- `CogneeGraph.map_vector_distances_to_graph_edges`: Removed internal
vector search fallback; only maps provided distances.
- Tests updated to reflect the new behavior.
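
The mapping-only behavior described above can be pictured with the following minimal sketch, which assigns distances the caller has already retrieved onto edges and never issues a search of its own. The function and field names (`map_distances_to_edges`, `relationship_name`, `vector_distance`) and the default distance value are assumptions made for illustration, not the actual `CogneeGraph` implementation.

```python
from typing import Dict, List, Optional


def map_distances_to_edges(
    edges: List[Dict[str, object]],
    edge_distances: Optional[Dict[str, float]],
    default_distance: float = 1.0,  # arbitrary placeholder chosen for this sketch
) -> List[Dict[str, object]]:
    """Attach precomputed vector distances to edges, keyed by relationship name.

    No vector search happens here: if the caller supplies no distances,
    every edge simply receives the default distance.
    """
    distances = edge_distances or {}
    mapped = []
    for edge in edges:
        name = edge["relationship_name"]
        # Copy the edge so the caller's data is not mutated.
        mapped.append({**edge, "vector_distance": distances.get(name, default_distance)})
    return mapped
```

Under this design the retrieval layer is the only place that queries the edge collection, so each edge lookup happens once.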

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [x] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
  * Ensured relationship edges are automatically included in search
    collections, improving search completeness and accuracy.

* **Refactor**
  * Simplified graph edge distance mapping logic by removing unnecessary
    external dependencies, resulting in more efficient edge processing
    during retrieval operations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-09 13:23:57 +01:00
lxobr
c04d255aca feat: remove secondary search 2025-12-08 17:29:25 +01:00