fix: fixes lancedb batch handling (#1872)
<!-- .github/pull_request_template.md -->
## Description
Fixes lancedb batch handling issue. Duplicated elements could appear in
the collections when duplicates happen in the same insert
batch.
## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):
## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->
## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Improved data integrity by implementing deduplication logic to
eliminate duplicate entries and ensure only the latest version is
retained.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
This commit is contained in:
parent
9571641199
commit
d5bf5cf4e9
1 changed files with 2 additions and 0 deletions
|
|
@ -193,6 +193,8 @@ class LanceDBAdapter(VectorDBInterface):
|
|||
for (data_point_index, data_point) in enumerate(data_points)
|
||||
]
|
||||
|
||||
lance_data_points = list({dp.id: dp for dp in lance_data_points}.values())
|
||||
|
||||
async with self.VECTOR_DB_LOCK:
|
||||
await (
|
||||
collection.merge_insert("id")
|
||||
|
|
|
|||
Loading…
Add table
Reference in a new issue