cognee

Author	SHA1	Message	Date
Andrej Milicevic	116b6f1eeb	chore: formatting	2025-12-12 13:46:16 +01:00
Andrej Milicevic	a225d7fc61	test: revert some changes	2025-12-12 13:44:58 +01:00
Andrej Milicevic	a337f4e54c	test: testing logger	2025-12-12 13:02:55 +01:00
Andrej Milicevic	bce6094010	test: change logger	2025-12-12 12:43:54 +01:00
Andrej Milicevic	c48b274571	test: remove delete error from mcp test	2025-12-12 11:53:40 +01:00
Andrej Milicevic	3b8a607b5f	test: fix errors in mcp test	2025-12-12 11:37:27 +01:00
Andrej Milicevic	e211e66275	chore: remove quick option to isolate mcp CI test	2025-12-11 18:29:17 +01:00
Andrej Milicevic	0f50c993ac	chore: add quick option to isolate mcp CI test	2025-12-11 18:20:07 +01:00
Andrej Milicevic	248ba74592	test: remove codify-related stuff from mcp test	2025-12-11 18:18:42 +01:00
Igor Ilic	46ddd4fd12	feat: add dataset database handler logic and neo4j/lancedb/kuzu handlers (#1776) ## Description Add the ability to use multi-tenant, multi-user mode with Neo4j. ## Type of Change - [ ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Pre-submission Checklist - [ ] I have tested my changes thoroughly before submitting this PR - [ ] This PR contains minimal changes necessary to address the issue/feature - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [ ] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. ## Summary by CodeRabbit ## Release Notes * New Features * Multi-user support with per-dataset database isolation enabled by default, allowing backend access control for secure data separation. * Configurable database handlers via environment variables (GRAPH_DATASET_DATABASE_HANDLER, VECTOR_DATASET_DATABASE_HANDLER) for flexible deployment options. * Chores * Database schema migration to support per-user dataset database configurations.	2025-12-11 14:15:20 +01:00
Igor Ilic	0a1ed79340	refactor: change neo4j_aura to neo4j_aura_dev	2025-12-11 13:05:23 +01:00
Pavel Zorin	fe7e97be45	Chore: Remove Ontology file size limit. Code duplications (#1880) ## Description We received a complaint about the 10MB file size limit, so the limit was removed. Also removed code duplication and made types stricter. ## Summary by CodeRabbit * New Features * Support for supplying optional per-file descriptions when uploading multiple ontologies. * Improvements * Removed the 10MB file size limit for ontology uploads, allowing larger files. * Streamlined and more robust upload handling with improved per-file validation and safer upload behavior.	2025-12-11 10:49:55 +01:00
Pavel Zorin	88f61f9bdb	Added filename check	2025-12-10 17:24:31 +01:00
hajdul88	001fbe699e	feat: Adds edge centered payload and embedding structure during ingestion (#1853) ## Description This pull request introduces edge-centered payloads to the ingestion process. Payloads are stored in the Triplet_text collection, which is compatible with the triplet_embedding memify pipeline. Changes in This PR: - Refactored custom edge handling: custom edges can now be passed to the add_data_points method, so ingestion is centralized in one place. - Added private methods to handle edge-centered payload creation inside add_data_points.py - Added unit tests to cover the new functionality - Added integration tests - Added e2e tests Acceptance Criteria and Testing Scenario 1: - Set TRIPLET_EMBEDDING env var to True - Run prune, add, cognify - Verify the vector DB contains a non-empty Triplet_text collection and that the number of triplets matches the number of edges in the graph database - Use the new triplet_completion search type and confirm it works correctly. Scenario 2: - Set TRIPLET_EMBEDDING env var to False - Run prune, add, cognify - Verify the vector DB does not have the Triplet_text collection - You should receive an error indicating that Triplet_text is not available ## Type of Change - [x] New feature (non-breaking change that adds functionality) ## Pre-submission Checklist All items checked. ## Summary by CodeRabbit * New Features * Triplet embeddings supported: embeddings created from graph edges plus connected node text * Ability to supply custom edges when adding data points * New configuration toggle to enable/disable triplet embedding * Tests * Added comprehensive unit and end-to-end tests for edge-centered payloads and triplet embedding * New CI job to run the edge-centered payload e2e test * Bug Fixes * Adjusted server start behavior to surface process output in parent logs --------- Co-authored-by: Pavel Zorin <pazonec@yandex.ru>	2025-12-10 17:10:06 +01:00
Pavel Zorin	2ca194c28f	fix format	2025-12-09 18:22:44 +01:00
Pavel Zorin	d932ee4bd9	Specify file type	2025-12-09 17:58:34 +01:00
Pavel Zorin	d0b914acaa	Chore: Remove Ontology file size limit. Code duplications	2025-12-09 17:55:43 +01:00
Vasilije	49f7c5188c	feat: avoid double edge vector search in triplet search (#1877) ## Description Eliminates double vector search for edges by ensuring all edge lookups happen once in the retrieval layer. - `brute_force_triplet_search`: Always includes "EdgeType_relationship_name" in collections. - `CogneeGraph.map_vector_distances_to_graph_edges`: Removed internal vector search fallback; only maps provided distances. - Tests updated to reflect the new behavior. ## Type of Change - [x] Performance improvement ## Pre-submission Checklist All items checked except "I have added necessary documentation (if applicable)" and "I have linked any relevant issues in the description". ## Summary by CodeRabbit * Bug Fixes * Ensured relationship edges are automatically included in search collections, improving search completeness and accuracy. * Refactor * Simplified graph edge distance mapping logic by removing unnecessary external dependencies, resulting in more efficient edge processing during retrieval operations.	2025-12-09 13:23:57 +01:00
lxobr	c04d255aca	feat: remove secondary search	2025-12-08 17:29:25 +01:00
Vasilije	7a3138edf8	fix: remove double quotes from llmconfig str params (#1758) ## Description Recently, a few cryptic errors like the one in issue #1721 have occurred across cognee use cases. While debugging #1721, I found that if LLM_API_KEY happens to have `"` quotation marks as part of its value, for example when they are already part of the ENV, they make their way into Cognee and get treated as part of the API key. By default, we do no sanitization or cleanup. Most of the time quotation marks get handled for us: 1. `export KEY="VALUE"` will strip them 2. python dotenv will strip them if read from `.env` But issues like https://github.com/docker/cli/issues/3630 and #1721 demonstrate that we need some handling on our end instead of assuming the quotes are stripped. ## This PR This PR sets up a list of string params we want to strip, plus some that we may want to. We may want to avoid doing this for all params, which is why I went with a selective approach. TODO: add testing ## Summary by CodeRabbit * Bug Fixes * Configuration values with surrounding quotes are now automatically normalized and cleaned during system initialization, ensuring consistent and predictable data handling across all configuration parameters. * Tests * Added comprehensive unit tests to validate automatic quote removal from configuration values, covering various scenarios including quoted, unquoted, empty, and edge cases with mixed and internal quotes.	2025-12-08 05:10:23 +01:00
Vasilije	40bbdd1ac7	fix: install nvm and node for -ui cli command (#1836) ## Type of Change - [x] Bug fix (non-breaking change that fixes an issue) ## Summary by CodeRabbit * New Features * Enhanced Node.js and npm environment management for improved system compatibility on Unix-like platforms. * Chores * Updated Next.js to v16, React to v19.2, and Auth0 SDK to v4.13.1 for compatibility and performance improvements. * Removed CrewAI workflow trigger component. * Removed user feedback submission form.	2025-12-08 05:09:49 +01:00
Vasilije	52b0029fbf	fix: Resolve issue with BAML rate limit handling (#1813) ## Description Add rate limit handling for BAML ## Summary by CodeRabbit * Improvements * Centralized rate limiting for LLM and embedding requests to better manage API throughput. * Retry policies adjusted: longer initial backoff and reduced max retries to improve stability under rate limits. * Chores * Refactored rate-limiting implementation from decorator-based to context-manager usage across services. * Tests * Unit tests and mocks updated to reflect the new context-manager rate-limiting approach.	2025-12-08 05:07:14 +01:00
Igor Ilic	67a4d40257	chore: regen poetry lock file	2025-12-05 19:51:26 +01:00
Igor Ilic	2f572ae509	test: Update embedding limiter test	2025-12-05 19:18:48 +01:00
Igor Ilic	a66b2ceeca	refactor: reduce amount of retry attempts for baml llm calls	2025-12-05 18:58:59 +01:00
Igor Ilic	7deaa6e8e9	feat: Add RPM limiting to Cognee	2025-12-05 18:56:34 +01:00
Igor Ilic	0c97a400b0	feat: Add RPM control	2025-12-05 15:40:24 +01:00
Igor Ilic	5d0586da28	Merge branch 'dev' into baml-rate-limit-handling	2025-12-05 13:24:07 +01:00
hajdul88	d5bf5cf4e9	fix: fixes lancedb batch handling (#1872) ## Description Fixes a lancedb batch handling issue: duplicated elements could appear in a collection when duplicates occurred in the same insert batch. ## Type of Change - [x] Bug fix (non-breaking change that fixes an issue) ## Pre-submission Checklist All items checked. ## Summary by CodeRabbit * Bug Fixes * Improved data integrity by implementing deduplication logic to eliminate duplicate entries and ensure only the latest version is retained.	2025-12-05 12:26:45 +01:00
Vasilije	9571641199	refactor: move codify pipeline out of main repo (#1738) ## Description This PR removes codify, and the code graph pipeline, from the repository. It also introduces a Custom Pipeline interface, which can be used in the future to define custom pipelines. ## Type of Change - [x] Code refactoring	2025-12-04 23:10:39 -08:00
Igor Ilic	4afde917a9	Main merge vol4 (#1856) ## Description Merge main changes into dev branch ## Summary by CodeRabbit * Documentation * Overhauled README: renamed tagline, clarified product positioning, reorganized Get Started into Open Source and Cloud paths, streamlined Quickstart, refreshed demos, navigation, and citation/contributing sections. * New Features * Added MCP tools for developer rules management, interaction logging, and enhanced search modes (SUMMARIES, CYPHER, FEELING_LUCKY). * Bug Fixes * Improved embedding handling to support alternate response formats. * Chores * Updated test/dev dependency versions.	2025-12-04 17:48:11 +01:00
Pavel Zorin	386e0c8234	chore: uv lock check in pre-test workflow (#1773) `.github/actions/cognee_setup/action.yml` - disables the `uv.lock` rebuild by default. `.github/workflows/pre_test.yml` - lightweight checks that signal that a PR is not ready for testing. For now it verifies that `uv.lock` corresponds to `pyproject.toml`. On failure, the run provides instructions on how to fix `uv.lock`. Warning: made an exclusion for python 3.13 dependencies. This doesn't look good to me. ## Description Make ## Type of Change - [x] Other (please specify): CI improvement ## Summary by CodeRabbit * Chores * Made uv lockfile rebuilding optional in CI (disabled by default). * Added a pre-validation workflow to check lockfile integrity before tests. * Ensured pre-test validation runs before basic and end-to-end test jobs. * Updated dependencies to support Python 3.13 with version-specific lxml constraints and adjusted docs dependency structure.	2025-12-04 15:50:39 +01:00
Pavel Zorin	7033b63cd4	update poetry lock	2025-12-04 13:26:53 +01:00
Pavel Zorin	6d1f1d183a	Removed python-version from pre-test	2025-12-04 11:51:32 +01:00
Pavel Zorin	ed0c0d1823	Potential fix for code scanning alert no. 407: Workflow does not contain permissions Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>	2025-12-04 11:51:32 +01:00
Pavel Zorin	a50810240b	Version exclusion for lxml and python 3.13	2025-12-04 11:51:28 +01:00
Pavel Zorin	c7a2138966	CI: uv lock check. Pre-test workflow.	2025-12-04 11:47:35 +01:00
Igor Ilic	7d7f8a249a	Merge branch 'dev' into main-merge-vol4	2025-12-04 10:32:10 +01:00
Igor Ilic	f1c5b9a55f	fix: Resolve DB caching issues when deleting databases	2025-12-03 18:05:47 +01:00
Igor Ilic	fd84edeb74	refactor: change getting of tables during deletion	2025-12-03 15:43:41 +01:00
Igor Ilic	45f32f8bfd	Merge branch 'dev' into multi-tenant-neo4j	2025-12-03 14:37:13 +01:00
Igor Ilic	1961efcc33	fix: Handle scenario when there is no relational database on prune time	2025-12-03 14:27:06 +01:00
Igor Ilic	f4078d1247	feat: Add ability to delete lance and kuzu datasets, add prune to work with multi user mode	2025-12-03 13:10:18 +01:00
Igor Ilic	5698c609f5	test: Update tests with regards to auto scaling changes	2025-12-03 11:47:10 +01:00
Boris Arzentar	0d2e84f58e	test: test_strip_quotes_from_strings	2025-12-03 10:59:17 +01:00
Boris	3288ef01a4	Merge branch 'dev' into fix/remove-double-quotes-from-llmconfig-str-params	2025-12-03 10:05:49 +01:00
hajdul88	d4d190ac2b	feature: adds triplet embedding via memify (#1832) ## Description This PR introduces triplet embeddings via a new create_triplet_embeddings memify pipeline. The pipeline reads the graph in batches, extracts properties from graph elements based on their datapoint types, and generates combined triplet embeddings. These embeddings are stored in the vector database as a new collection. Changes in This PR: - Added a new create_triplet_embeddings memify pipeline. - Added a new get_triplet_datapoints memify task. - Introduced a new triplet_completion search type. - Added full test coverage -- Unit tests: memify task, pipeline, and retriever -- Integration tests: memify task, pipeline, and retriever -- End-to-end tests: updated session history tests and multi-DB search tests; added tests for triplet_completion and memify pipeline execution Acceptance Criteria and Testing Scenario 1: - Run the default add and cognify pipelines. - Run the create triplet embeddings memify pipeline. - Verify the vector DB contains a non-empty Triplet_text collection. - Use the new triplet_completion search type and confirm it works correctly. Scenario 2: - Run the default add and cognify pipelines. - Do not run the triplet embeddings memify pipeline. - Attempt to use the triplet_completion search type. - You should receive an error indicating that the triplet embeddings memify pipeline must be executed first. ## Type of Change - [x] New feature (non-breaking change that adds functionality) ## Pre-submission Checklist All items checked. ## Summary by CodeRabbit * New Features * Triplet-based search with LLM-powered completions (TRIPLET_COMPLETION) * Batch triplet retrieval and a triplet embeddings pipeline for extraction, indexing, and optional background processing * Context retrieval from triplet embeddings with optional caching and conversation-history support * New Triplet data type exposed for indexing and search * Examples * End-to-end example demonstrating triplet embeddings extraction and TRIPLET_COMPLETION search * Tests * Unit and integration tests covering triplet extraction, retrieval, embedding pipeline, and completion flows --------- Co-authored-by: Pavel Zorin <pazonec@yandex.ru>	2025-12-02 18:27:08 +01:00
Igor Ilic	4e8b2ffc3e	chore: Update lock files	2025-12-02 17:51:30 +01:00
Igor Ilic	c7810e9fdb	Merge branch 'dev' into main-merge-vol4	2025-12-02 17:40:09 +01:00
Igor Ilic	1282905888	feat: add password encryption for Neo4j	2025-12-02 16:34:16 +01:00
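
Note on 7a3138edf8 (#1758): the description says selected string params are stripped of surrounding double quotes so that a value such as `"sk-..."` is not treated as part of the API key. A minimal sketch of that idea follows. The helper names and the `STRIP_PARAMS` set are hypothetical, chosen only for illustration; this is not cognee's actual implementation.

```python
# Hypothetical set of config params to sanitize; the real list lives in the PR.
STRIP_PARAMS = {"llm_api_key", "llm_model"}


def strip_surrounding_quotes(value):
    """Remove one pair of matching surrounding double quotes, if present."""
    if isinstance(value, str) and len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    # Non-strings, unquoted strings, and strings with only internal quotes pass through.
    return value


def sanitize_config(config):
    """Strip quotes only from the selected params, leaving all others untouched."""
    return {
        key: strip_surrounding_quotes(value) if key in STRIP_PARAMS else value
        for key, value in config.items()
    }
```

Applying the cleanup selectively mirrors the reasoning in the commit message: some params may legitimately contain quotes, so only a known list is touched.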


4516 commits
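
Note on d5bf5cf4e9 (#1872): the summary says duplicates within a single insert batch are removed so that only the latest version is retained. A minimal sketch of that behavior, assuming records are dicts keyed by an `id` field; the function name and record shape are hypothetical and not taken from the LanceDB adapter itself.

```python
def deduplicate_batch(records, key="id"):
    """Keep only the last record seen for each key within one batch."""
    latest = {}
    for record in records:
        # A later record with the same key overwrites the earlier one.
        latest[record[key]] = record
    return list(latest.values())


batch = [
    {"id": 1, "text": "old"},
    {"id": 2, "text": "other"},
    {"id": 1, "text": "new"},
]
deduplicated = deduplicate_batch(batch)
```

Because a dict keeps the position of a key's first insertion, the result preserves the order in which ids first appeared while carrying the most recent payload for each.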