cognee/cognee/modules/retrieval
lxobr 6223ecf05b
feat: optimize repeated entity extraction (#1682)

## Description

- Added an `edge_text` field to edges that auto-fills from
`relationship_type` if not provided.
- Contains edges now store descriptions for better embedding.
- Updated and refactored indexing so that `edge_text` gets embedded and
exposed.
- Updated retrieval to use the new embeddings.
- Added a test to verify edge_text exists in the graph with the correct
format.
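
The `edge_text` fallback described in the first bullet can be sketched as follows. This is only an illustration: the `Edge` dataclass and its `source_id` and `target_id` fields are hypothetical stand-ins, not the actual cognee edge model, and the real change may implement the default differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Edge:
    """Hypothetical edge model, used only to illustrate the fallback."""

    source_id: str
    target_id: str
    relationship_type: str
    edge_text: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: a missing or empty edge_text falls back to relationship_type.
        if not self.edge_text:
            self.edge_text = self.relationship_type


# Without an explicit edge_text, the relationship type is reused.
default_edge = Edge("doc_1", "chunk_1", "contains")

# An explicit edge_text is kept as provided.
described_edge = Edge("doc_1", "chunk_1", "contains", edge_text="doc_1 contains chunk_1")
```

In this sketch, text that is always populated means an indexing step can embed `edge_text` without special-casing edges that were created with no description.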

## Type of Change
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [x] Code refactoring
- [x] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)

## Pre-submission Checklist
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-10-30 13:56:06 +01:00
..
context_providers fix: Remove creation of default user during search 2025-09-23 18:43:05 +02:00
entity_extractors feat: entity brute force triplet search [COG-1325] (#589) 2025-03-05 11:17:58 +01:00
exceptions adds new base errors to retrieval exceptions 2025-08-13 12:36:31 +02:00
utils feat: optimize repeated entity extraction (#1682) 2025-10-30 13:56:06 +01:00
__init__.py Transition to new retrievers, update searches (#585) 2025-02-27 15:25:24 +01:00
base_feedback.py chore: fixes ruff 2025-08-18 18:36:04 +02:00
base_graph_retriever.py feat: adds session id to get_completion methods 2025-10-16 16:26:58 +02:00
base_retriever.py feat: adds session id to get_completion methods 2025-10-16 16:26:58 +02:00
chunks_retriever.py ruff formatting 2025-10-15 18:02:10 +02:00
code_retriever.py ruff formatting 2025-10-15 18:02:10 +02:00
coding_rules_retriever.py feat: implement combined context search (#1341) 2025-09-10 16:33:08 +02:00
completion_retriever.py chore: renames conversation history save method 2025-10-20 10:28:03 +02:00
cypher_search_retriever.py log warning and early exit when graph is empty and is queried 2025-10-22 13:21:51 +01:00
EntityCompletionRetriever.py chore: renames conversation history save method 2025-10-20 10:28:03 +02:00
graph_completion_context_extension_retriever.py chore: renames conversation history save method 2025-10-20 10:28:03 +02:00
graph_completion_cot_retriever.py refactor: unify structured and str completion 2025-10-23 12:30:55 +02:00
graph_completion_retriever.py log warning and early exit when graph is empty and is queried 2025-10-22 13:21:51 +01:00
graph_summary_completion_retriever.py feat: Add only_context and system prompt flags for search 2025-08-28 13:43:37 +02:00
jaccard_retrival.py chore: format files 2025-09-22 11:33:19 +02:00
lexical_retriever.py ruff formatting 2025-10-15 18:02:10 +02:00
natural_language_retriever.py log warning and early exit when graph is empty and is queried 2025-10-22 13:21:51 +01:00
summaries_retriever.py ruff formatting 2025-10-15 18:02:10 +02:00
temporal_retriever.py feature: adds the concept of now to the qa for temporal queries (#1685) 2025-10-28 15:27:29 +01:00
user_qa_feedback.py fix: fixes distributed pipeline (#1454) 2025-10-09 14:06:25 +02:00