cognee/cognee/tests
Vasilije 310e9e97ae
feat: list vector distance in cogneegraph (#1926)

## Description

- `map_vector_distances_to_graph_nodes` and
`map_vector_distances_to_graph_edges` accept both single-query (flat
list) and multi-query (nested list) inputs.
- `query_list_length` controls the mode: omit it for single-query
behavior, or provide it to enable multi-query mode with strict length
validation and per-query results.
- `vector_distance` on `Node` and `Edge` is now a list (one distance per
query). Constructors set it to `None`, and `reset_distances` initializes
it at the start of each search.
- `Node.update_distance_for_query` and `Edge.update_distance_for_query`
are the only methods that write to `vector_distance`. They ensure the
list has enough elements and keep unmatched queries at the penalty
value.
- `triplet_distance_penalty` is the default distance value used
everywhere. Unmatched nodes/edges and missing scores all use this same
penalty for consistency.
- `edges_by_distance_key` is an index mapping edge labels to matching
edges. This lets us update all edges with the same label at once,
instead of scanning the full edge list repeatedly.
- `calculate_top_triplet_importances` returns `List[Edge]` for
single-query mode and `List[List[Edge]]` for multi-query mode.
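
The per-query behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the `Edge` class here is a stand-in, the method signatures, the `label` attribute, the helper names `as_per_query_lists` and `build_edges_by_distance_key`, and the penalty value `3.5` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional

TRIPLET_DISTANCE_PENALTY = 3.5  # assumed value, for illustration only


class Edge:
    """Hypothetical stand-in for the graph Edge class."""

    def __init__(self, label: str, penalty: float = TRIPLET_DISTANCE_PENALTY):
        self.label = label
        self.penalty = penalty
        # The constructor leaves the distance list unset.
        self.vector_distance: Optional[List[float]] = None

    def reset_distances(self, query_count: int = 1) -> None:
        # Every query starts at the penalty value at the start of a search.
        self.vector_distance = [self.penalty] * query_count

    def update_distance_for_query(
        self, query_index: int, score: float, query_count: int = 1
    ) -> None:
        # Grow the list if needed, padding with the penalty so that
        # unmatched queries keep the penalty value.
        if self.vector_distance is None:
            self.vector_distance = []
        needed = max(query_count, query_index + 1)
        missing = needed - len(self.vector_distance)
        if missing > 0:
            self.vector_distance.extend([self.penalty] * missing)
        self.vector_distance[query_index] = score


def as_per_query_lists(results: list, query_list_length: Optional[int] = None) -> list:
    # Single-query mode: a flat list is treated as the results of one query.
    if query_list_length is None:
        return [results]
    # Multi-query mode: the outer list must have exactly one entry per query.
    if len(results) != query_list_length:
        raise ValueError(
            f"expected {query_list_length} result lists, got {len(results)}"
        )
    return results


def build_edges_by_distance_key(edges: List[Edge]) -> Dict[str, List[Edge]]:
    # Index edges by label so all edges sharing a label are updated together,
    # without rescanning the full edge list for every score.
    index: Dict[str, List[Edge]] = defaultdict(list)
    for edge in edges:
        index[edge.label].append(edge)
    return index


edges = [Edge("works_at"), Edge("works_at"), Edge("lives_in")]
index = build_edges_by_distance_key(edges)

for edge in edges:
    edge.reset_distances(query_count=2)

# The second query (index 1) matched the "works_at" label with distance 0.2.
for edge in index["works_at"]:
    edge.update_distance_for_query(1, 0.2, query_count=2)
```

In this sketch both `works_at` edges end with `[3.5, 0.2]`, while the unmatched `lives_in` edge stays at `[3.5, 3.5]`.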


## Type of Change
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [x] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


## Summary by CodeRabbit

* **New Features**
  * Multi-query support for mapping/scoring node and edge distances and a
    configurable triplet distance penalty.
  * Distance-keyed edge indexing for more accurate distance-to-edge
    matching.

* **Refactor**
  * Vector distance metadata changed from scalars to per-query lists;
    added reset/normalization and per-query update flows.
  * Node/edge distance initialization now supports deferred/listed
    distances.

* **Tests**
  * Updated and expanded tests for multi-query flows, list-based
    distances, edge-key handling, and related error cases.

2025-12-23 14:47:27 +01:00
cli_tests Remove all references to SearchType.INSIGHTS across the codebase, meaningfully replacing it with SearchType.GRAPH_COMPLETION where applicable. 2025-10-08 12:13:59 +01:00
integration chore: moves unit tests into their correct directory 2025-12-16 17:33:20 +01:00
subprocesses feat: Redis lock integration and Kuzu agentic access fix (#1504) 2025-10-16 15:48:20 +02:00
tasks fix: Resolve issue with entity extraction test 2025-11-04 16:43:41 +01:00
test_data feat: csv ingestion loader & chunk 2025-10-22 16:56:46 +08:00
unit feat: list vector distance in cogneegraph (#1926) 2025-12-23 14:47:27 +01:00
__init__.py
test_add_docling_document.py fix: Resolve docling test 2025-10-31 13:57:12 +01:00
test_advanced_pdf_loader.py
test_chromadb.py COG-3050 - remove insights search (#1506) 2025-10-11 09:09:56 +02:00
test_cleanup_unused_data.py fix linting 2025-12-19 10:38:44 +01:00
test_cognee_server_start.py chore: introduces 1 file upload in ontology endpoint (#1899) 2025-12-15 18:30:35 +01:00
test_concurrent_subprocess_access.py feat: Redis lock integration and Kuzu agentic access fix (#1504) 2025-10-16 15:48:20 +02:00
test_conversation_history.py feature: adds triplet embedding via memify (#1832) 2025-12-02 18:27:08 +01:00
test_custom_data_label.py fix: Resolve issues with data label PR, add tests and upgrade migration 2025-12-16 20:59:17 +01:00
test_custom_model.py
test_dataset_database_handler.py feat: Add dataset database handler info (#1887) 2025-12-12 13:22:03 +01:00
test_dataset_delete.py feat: Add database deletion on dataset delete (#1893) 2025-12-15 18:15:48 +01:00
test_deduplication.py test: Rollback deduplication test 2025-10-01 18:10:57 +02:00
test_delete_by_id.py
test_delete_hard.py
test_delete_soft.py
test_edge_centered_payload.py feat: Adds edge centered payload and embedding structure during ingestion (#1853) 2025-12-10 17:10:06 +01:00
test_edge_ingestion.py feat: optimize repeated entity extraction (#1682) 2025-10-30 13:56:06 +01:00
test_feedback_enrichment.py fix: Use same dataset name accross cognee calls 2025-10-30 17:40:00 +01:00
test_graph_visualization_permissions.py
test_kuzu.py Revert "Revert "fix: search without prior cognify"" 2025-10-22 13:21:51 +01:00
test_lancedb.py COG-3050 - remove insights search (#1506) 2025-10-11 09:09:56 +02:00
test_library.py refactor: fix search result for library test 2025-10-29 19:14:53 +01:00
test_load.py test: fix path based on pr comment 2025-11-03 17:06:51 +01:00
test_multi_tenancy.py feat: Add test for multi tenancy, add ability to share name for dataset across tenants for one user 2025-11-07 15:50:49 +01:00
test_neo4j.py Revert "Revert "fix: search without prior cognify"" 2025-10-22 13:21:51 +01:00
test_neptune_analytics_graph.py
test_neptune_analytics_hybrid.py
test_neptune_analytics_vector.py COG-3050 - remove insights search (#1506) 2025-10-11 09:09:56 +02:00
test_parallel_databases.py feat: enable multi user for falkor (#1689) 2025-11-11 17:03:48 +01:00
test_permissions.py updates old no asserts test + yml 2025-12-19 10:32:45 +01:00
test_pgvector.py COG-3050 - remove insights search (#1506) 2025-10-11 09:09:56 +02:00
test_pipeline_cache.py feat: make pipeline processing cache optional (#1876) 2025-12-12 13:11:31 +01:00
test_relational_db_migration.py Relational DB migration test search (#1752) 2025-11-12 21:32:22 +01:00
test_remote_kuzu.py COG-3050 - remove insights search (#1506) 2025-10-11 09:09:56 +02:00
test_remote_kuzu_stress.py
test_s3.py
test_s3_file_storage.py Deprecate SearchType.INSIGHTS, replace all references to default search type - SearchType.GRAPH_COMPLETION 2025-10-08 12:13:59 +01:00
test_search_db.py fix: improve graph distance mapping 2025-12-18 14:52:35 +01:00
test_starter_pipelines.py
test_telemetry.py
test_temporal_graph.py