chore: adds id generation to memify triplet embedding pipeline (#1895)
<!-- .github/pull_request_template.md -->
## Description
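This PR adds ID generation to the Triplet objects in the triplet
embedding memify pipeline. Each Triplet now gets an ID derived from its
start node ID, relationship name, and end node ID. Previously, in some
edge cases, duplicated elements could be ingested into the collection.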
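
For illustration, here is a minimal sketch of why an ID built from the triplet's own fields helps with duplicates. It assumes the ID function is deterministic; `make_triplet_id` is a hypothetical stand-in based on UUIDv5 and is not necessarily how `generate_node_id` is implemented, and the dict is a toy substitute for the real collection.

```python
from typing import Dict
from uuid import NAMESPACE_OID, UUID, uuid5


def make_triplet_id(start_node_id: object, relationship_name: str, end_node_id: object) -> UUID:
    """Hypothetical stand-in for generate_node_id: hash the concatenated key with UUIDv5."""
    key = str(start_node_id) + str(relationship_name) + str(end_node_id)
    return uuid5(NAMESPACE_OID, key)


# Toy "collection" keyed by ID; a real store would upsert on the ID instead.
collection: Dict[UUID, str] = {}

edges = [
    ("node-a", "works_at", "node-b"),
    ("node-a", "works_at", "node-b"),  # the same edge seen a second time
    ("node-a", "lives_in", "node-c"),
]

for start, rel, end in edges:
    triplet_id = make_triplet_id(start, rel, end)
    # A repeated triplet maps to the same key, so it overwrites instead of duplicating.
    collection[triplet_id] = f"{start}-›{rel}-›{end}"

print(len(collection))  # 2
```

Without an explicit ID, each `Triplet` would presumably receive a fresh default ID, so the repeated edge above would be stored twice.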
## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):
## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->
## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **Enhancements**
* Relationship data now includes unique identifiers for improved
tracking and data management capabilities.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
parent 69e36cc834
commit bad22ba26b
1 changed file with 7 additions and 1 deletion
@@ -1,5 +1,6 @@
 from typing import AsyncGenerator, Dict, Any, List, Optional
 from cognee.infrastructure.databases.graph.get_graph_engine import get_graph_engine
+from cognee.modules.engine.utils import generate_node_id
 from cognee.shared.logging_utils import get_logger
 from cognee.modules.graph.utils.convert_node_to_data_point import get_all_subclasses
 from cognee.infrastructure.engine import DataPoint
@@ -155,7 +156,12 @@ def _process_single_triplet(
 
     embeddable_text = f"{start_node_text}-›{relationship_text}-›{end_node_text}".strip()
 
-    triplet_obj = Triplet(from_node_id=start_node_id, to_node_id=end_node_id, text=embeddable_text)
+    relationship_name = relationship.get("relationship_name", "")
+    triplet_id = generate_node_id(str(start_node_id) + str(relationship_name) + str(end_node_id))
+
+    triplet_obj = Triplet(
+        id=triplet_id, from_node_id=start_node_id, to_node_id=end_node_id, text=embeddable_text
+    )
 
     return triplet_obj, None
 
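
One design point about the key built in the diff: the three fields are concatenated without a separator. The sketch below shows that plain concatenation can map two different triplets to the same key when the fields are arbitrary strings. Both helper names are hypothetical, and the separator variant is only an illustration, not part of this change. If node IDs are fixed-length values such as UUIDs, the ambiguity is unlikely to arise in practice.

```python
def concat_key(start: object, rel: object, end: object) -> str:
    # Mirrors the plain concatenation used in the diff.
    return str(start) + str(rel) + str(end)


def delimited_key(start: object, rel: object, end: object, sep: str = "|") -> str:
    # Hypothetical variant: a separator keeps the field boundaries in the key.
    # It only helps if `sep` cannot occur inside the fields themselves.
    return sep.join((str(start), str(rel), str(end)))


# Two different triplets whose fields concatenate to the same string.
a = ("ab", "c", "d")
b = ("a", "bc", "d")

print(concat_key(*a) == concat_key(*b))        # True
print(delimited_key(*a) == delimited_key(*b))  # False
```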