cognee

Author	SHA1	Message	Date
Vasilije	beb8932fea	fix: Handle Dependabot security issues (#1968 ) <!-- .github/pull_request_template.md --> ## Description Fix security issue with langchain raised by Dependabot: https://github.com/topoteretes/cognee/security/dependabot/73 Older version of langchain has an issue ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. --> ## Type of Change <!-- Please check the relevant option --> - [X ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [ X] I have tested my changes thoroughly before submitting this PR - [X ] This PR contains minimal changes necessary to address the issue/feature - [ X] My code follows the project's coding standards and style guidelines - [ X] I have added tests that prove my fix is effective or that my feature works - [ X] I have added necessary documentation (if applicable) - [X ] All new and existing tests pass - [X ] I have searched existing PRs to ensure this change hasn't been submitted already - [ X] I have linked any relevant issues in the description - [ X] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. 
<!-- CURSOR_SUMMARY --> --- > [!NOTE] > Addresses Dependabot alerts by updating critical dependencies and refreshing the Python lockfile. > > - Adds `langchain-core` to optional deps and updates locked version to `1.2.6` (introduces `uuid-utils`) > - Tightens HTTP stack: raises `aiohttp` to `>=3.13.3`, adds `urllib3` runtime dep (locked to `2.6.2`) > - Bumps frontend `next` to `16.1.7` > - Regenerates `uv.lock` with numerous package/version updates and platform wheels; adjusts `kubernetes` to `33.1.0` with `oauthlib` dep > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `1eb4197f1a`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Chores * Updated Next.js to 16.1.7. * Relaxed aiohttp dependency constraint. * Added urllib3 as a dependency. * Added langchain-core to optional dependencies. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-01-08 21:29:20 +01:00
Vasilije	c3c8961631	Merge branch 'dev' into ffix_sec	2026-01-08 21:29:02 +01:00
Vasilije	abc6faff34	fix: fix security issue (#1967 ) <!-- .github/pull_request_template.md --> ## Description Fix security issue reported by the user https://github.com/topoteretes/cognee/issues/1950 ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. --> ## Type of Change <!-- Please check the relevant option --> - [x] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [x] I have tested my changes thoroughly before submitting this PR - [x] This PR contains minimal changes necessary to address the issue/feature - [x] My code follows the project's coding standards and style guidelines - [x] I have added tests that prove my fix is effective or that my feature works - [x] I have added necessary documentation (if applicable) - [x] All new and existing tests pass - [x] I have searched existing PRs to ensure this change hasn't been submitted already - [x] I have linked any relevant issues in the description - [x] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > - Dependencies: Adds `cbor2>=5.8.0` to `pyproject.toml`; updates `uv.lock` (including version bump and wheels) to reflect new dependency. 
> - CI/Docs: Refines `.github/pull_request_template.md` (simplified change types; renamed `Screenshots` section to request proof of local tests passing). > - Code cleanup: Minor formatting changes in `LiteLLMEmbeddingEngine.py` and `get_api_auth_backend.py` with no functional impact. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `aa4ab1ed8a`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Chores * Added the cbor2 serialization library to project dependencies. * Documentation * Updated the pull request template: simplified change-type options, tightened acceptance criteria, expanded the pre-submission checklist with additional verification items, and renamed/clarified the screenshots section to request local test evidence. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-01-08 21:28:21 +01:00
Vasilije	ada0a2be4f	Merge branch 'dev' into fix_security_issue	2026-01-08 21:28:11 +01:00
Vasilije	b1ff473a38	COG-3395: Chore: pre-commit, pre-commit action, contribution guide update (#1979 ) ## Description Revisited the `CONTRIBUTING.md`: * Added the `Required tools` * Pre-commit requirement. It replaces `ruff` and other linting guides * Fixed `test_library.py` paths. Made sure that the testing guide is complete and works * Added a `pre-commit` step to `Pre-Test` workflow. It will fail if `pre-commit` has issues and no other tests will be triggered * Added a sufficient LLM configuration example for tests. Moved `cognee/.env.example` to the project root for convenience >>> Requires: https://github.com/topoteretes/cognee/pull/1980 <<< ## Acceptance Criteria `pre-commit` action works Tested pre-commit locally. If a commit violates the rules - it rejects it and fixes the issues. Then we need to `git commit ...` again. ## Type of Change <!-- Please check the relevant option --> - [ ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [x] Other (please specify): CI and DevExp improvement ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [ ] I have tested my changes thoroughly before submitting this PR - [ ] This PR contains minimal changes necessary to address the issue/feature - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [ ] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My 
commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Documentation * Expanded contributor guide with setup, required tools, testing instructions, examples, and updated PR submission guidance. * Updated pull-request checklist to reference contributing instructions. * Chores * Added three new local environment variables for LLM configuration and updated example env file. * Added a pre-commit validation step to CI. * Updated ignore list to exclude a local environment file. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-01-08 19:59:20 +01:00
Pavel Zorin	3e602fdad7	Renamed the pre_test workflow	2026-01-08 19:19:11 +01:00
Pavel Zorin	15a88accac	Chore: use pre-commit action	2026-01-08 19:19:11 +01:00
Pavel Zorin	b0fe1a8439	CI: Speed up pre-test workflow	2026-01-08 19:19:11 +01:00
Pavel Zorin	962ddf4257	Chore: pre-commit, pre-commit action, contribution guide update	2026-01-08 19:19:07 +01:00
Vasilije	fde921ca3e	chore: Remove trailing whitespaces in the project, fix YAMLs (#1980 ) <!-- .github/pull_request_template.md --> ## Description Removes trailing whitespaces from all files in the project. Needed by https://github.com/topoteretes/cognee/pull/1979 ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. --> ## Type of Change <!-- Please check the relevant option --> - [ ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [ ] I have tested my changes thoroughly before submitting this PR - [ ] This PR contains minimal changes necessary to address the issue/feature - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [ ] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. 
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Added `topK` parameter support in search functionality to control result count (1-100). * Added Python tool configuration via mise.toml. * Documentation * Enhanced issue templates with improved UI metadata, labels, and clearer guidance for bug reports, feature requests, and documentation issues. * Expanded CONTRIBUTING.md with comprehensive contribution guidelines and community information. * Chores * Removed unused modules: `cognee.modules.retrieval` and `cognee.tasks.temporal_graph`. * Applied consistent formatting and whitespace normalization across configuration files, workflows, and documentation. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-01-08 19:16:09 +01:00
Pavel Zorin	7a48e22b13	chore: Remove trailing whitespaces in the project, fix YAMLs	2026-01-08 17:15:53 +01:00
vasilije	1eb4197f1a	add uv lock	2026-01-08 16:05:36 +01:00
Vasilije	c50b5fa139	Merge branch 'dev' into fix_security_issue	2026-01-08 16:00:21 +01:00
Vasilije	42dc9351f2	Merge branch 'dev' into ffix_sec	2026-01-08 15:53:44 +01:00
Vasilije	5cf63617a1	Fix dev branch ci (#1978 ) <!-- .github/pull_request_template.md --> ## Description Resolve issues with CI for dev branch with slight contributor PR refactors ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. --> ## Type of Change <!-- Please check the relevant option --> - [ ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [ ] I have tested my changes thoroughly before submitting this PR - [ ] This PR contains minimal changes necessary to address the issue/feature - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [ ] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.	2026-01-08 15:49:59 +01:00
Igor Ilic	7de3356b1f	fix: Resolve issue with migration order	2026-01-08 14:28:39 +01:00
Igor Ilic	00697c4491	chore: Update poetry lock	2026-01-08 14:21:05 +01:00
Vasilije	1772439ea5	Update aiohttp version in pyproject.toml	2026-01-08 13:49:39 +01:00
Igor Ilic	fd6a77deec	refactor: Add TODO for missing llm config parameters	2026-01-08 13:31:25 +01:00
Igor Ilic	f3215e16f9	refactor: Remove silent handling of lifetime assignment	2026-01-08 12:51:11 +01:00
Igor Ilic	07b91f3a5f	refactor: Remove comment from Dockerfile	2026-01-08 12:45:03 +01:00
vasilije	af72dd2fc2	fixes to ruff format	2026-01-07 16:26:36 +01:00
vasilije	aa4ab1ed8a	reformat	2026-01-06 18:05:34 +01:00
vasilije	f1f955b76a	fix	2026-01-06 18:03:43 +01:00
vasilije	555eef69e3	added update to pr template	2026-01-06 17:53:42 +01:00
vasilije	5c365abf66	added update to pr template	2026-01-06 17:53:38 +01:00
vasilije	295f623db3	fix security issue	2026-01-06 17:47:54 +01:00
Vasilije	34c6652939	add configurable JWT expiration, cookie domain, CORS origins, and service restart policies (#1956 ) <!-- .github/pull_request_template.md --> ## Description This PR introduces several configuration improvements to enhance the application's flexibility and reliability. The changes make JWT token expiration and cookie domain configurable via environment variables, improve CORS configuration, and add container restart policies for better uptime. JWT Token Expiration Configuration: - Added `JWT_LIFETIME_SECONDS` environment variable to configure JWT token expiration time - Set default expiration to 3600 seconds (1 hour) for both API and client authentication backends - Removed hardcoded expiration values in favor of environment-based configuration - Added documentation comments explaining the JWT strategy configuration Cookie Domain Configuration: - Added `AUTH_TOKEN_COOKIE_DOMAIN` environment variable to configure cookie domain - When not set or empty, cookie domain defaults to `None` allowing cross-domain usage - Added documentation explaining cookie expiration is handled by JWT strategy - Updated default_transport to use environment-based cookie domain CORS Configuration Enhancement: - Added `CORS_ALLOWED_ORIGINS` environment variable with default value of `'*'` - Configured frontend to use `NEXT_PUBLIC_BACKEND_API_URL` environment variable - Set default backend API URL to `http://localhost:8000` Docker Service Reliability: - Added `restart: always` policy to all services (cognee, frontend, neo4j, chromadb, and postgres) - This ensures services automatically restart on failure or system reboot - Improves container reliability and uptime in production and development environments ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. 
--> ## Type of Change <!-- Please check the relevant option --> - [x] Bug fix (non-breaking change that fixes an issue) - [x] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [x] I have tested my changes thoroughly before submitting this PR - [x] This PR contains minimal changes necessary to address the issue/feature - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [ ] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Services now automatically restart on failure for improved reliability. * Configuration * Cookie domain for authentication is now configurable via environment variable, defaulting to None if not set. * JWT token lifetime is now configurable via environment variable, with a 3600-second default. * CORS allowed origins are now configurable with a default of all origins (`*`). * Frontend backend API URL is now configurable, defaulting to http://localhost:8000. 
<sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-01-04 10:37:13 +01:00
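As a rough illustration of the configuration described in #1956, the sketch below reads the two authentication variables from the environment. The helper name `read_auth_settings` and its return shape are hypothetical and not taken from the cognee codebase; only the variable names and the defaults (3600 seconds, empty cookie domain treated as `None`) come from the PR description.

```python
import os
from typing import Mapping, Optional, Tuple


def read_auth_settings(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[int, Optional[str]]:
    """Hypothetical helper: return (jwt_lifetime_seconds, cookie_domain)."""
    # Fall back to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env

    # JWT_LIFETIME_SECONDS defaults to 3600 (1 hour), as stated in the PR.
    # int() raises ValueError on a non-numeric value; this sketch does not
    # guess a fallback for that case.
    lifetime_seconds = int(env.get("JWT_LIFETIME_SECONDS", "3600"))

    # An unset or empty AUTH_TOKEN_COOKIE_DOMAIN becomes None.
    cookie_domain = env.get("AUTH_TOKEN_COOKIE_DOMAIN") or None

    return lifetime_seconds, cookie_domain
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise in tests without touching the process environment.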
Vasilije	4e881cd00f	Fix: Handle empty API key in LiteLLMEmbeddingEngine (#1959 ) fix(embeddings): handle empty API key in LiteLLMEmbeddingEngine - Add conditional check for empty API key to prevent authentication errors - Set default API key to "EMPTY" when no valid key is provided - This ensures proper fallback behavior when API key is not configured ``` <!-- .github/pull_request_template.md --> ## Description This PR fixes an issue where the `LiteLLMEmbeddingEngine` throws an authentication error when the `EMBEDDING_API_KEY` environment variable is empty or not set. The error message indicated `"api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable"`. Log Error: 2025-12-23T11:36:58.220908 [error ] Error embedding text: litellm.AuthenticationError: AuthenticationError: OpenAIException - The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable [LiteLLMEmbeddingEngine] Root Cause: When initializing the embedding engine, if the `api_key` parameter is an empty string, the underlying LiteLLM client doesn't treat it as "no key provided" but instead uses this empty string to make API requests, triggering authentication failure. Solution: Added a conditional check in the code that creates the `LiteLLMEmbeddingEngine` instance. If the `EMBEDDING_API_KEY` read from configuration is empty (`None` or empty string), we explicitly set the `api_key` parameter passed to the engine constructor to a non-empty placeholder string `"EMPTY"`. 
This aligns with LiteLLM's handling of optional authentication and prevents exceptions in scenarios where keys are not required or need to be obtained from other sources How to Reproduce: Configure the application with the following settings (as shown in the error log): EMBEDDING_PROVIDER="custom" EMBEDDING_MODEL="openai/Qwen/Qwen3-Embedding-xxx" EMBEDDING_ENDPOINT="xxxxx" EMBEDDING_API_VERSION="" EMBEDDING_DIMENSIONS=1024 EMBEDDING_MAX_TOKENS=16384 EMBEDDING_BATCH_SIZE=10 # If embedding key is not provided same key set for LLM_API_KEY will be used EMBEDDING_API_KEY="" ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. --> ## Type of Change <!-- Please check the relevant option --> - [x] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [x] I have tested my changes thoroughly before submitting this PR - [x] This PR contains minimal changes necessary to address the issue/feature - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [x] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My commits have clear and 
descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Bug Fixes * Improved API key validation for the embedding service to properly handle blank or missing API keys, ensuring more reliable embedding generation and preventing potential service errors. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-01-04 10:36:17 +01:00
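The fallback described in #1959 can be sketched as below. The function name `resolve_embedding_api_key` is hypothetical and is not the code from the PR; only the rule (a `None` or empty key is replaced by the placeholder `"EMPTY"`) comes from the description above.

```python
from typing import Optional

# Placeholder value named in the PR description.
EMPTY_KEY_PLACEHOLDER = "EMPTY"


def resolve_embedding_api_key(configured_key: Optional[str]) -> str:
    """Hypothetical helper: return a non-empty key to pass to the engine."""
    # Both None and "" are falsy, so one check covers the two cases
    # the PR description lists.
    if not configured_key:
        return EMPTY_KEY_PLACEHOLDER
    return configured_key
```

A caller would apply this to the configured `EMBEDDING_API_KEY` value before constructing the embedding engine, so that an endpoint which needs no authentication never receives an empty string.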
Vasilije	90805dac36	add ChromaDB dependency to fix missing installation error (#1960 ) <!-- .github/pull_request_template.md --> ## Description This PR addresses a runtime error where the application fails because ChromaDB is not installed. The error message `"ChromaDB is not installed. Please install it with 'pip install chromadb'"` occurs when attempting to use features that depend on ChromaDB. ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. --> ## Type of Change <!-- Please check the relevant option --> - [ ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [ ] I have tested my changes thoroughly before submitting this PR - [ ] This PR contains minimal changes necessary to address the issue/feature - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [ ] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. 
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Chores * Updated dependency management to include chromadb in the build configuration. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-01-04 10:35:45 +01:00
maozhen	570de517c5	``` feat(Dockerfile): add chromadb support and China mirror option - Add chromadb extra dependency to uv sync commands in Dockerfile - Include optional aliyun mirror configuration for users in China - Update dependency installation to include chromadb extra ```	2026-01-04 15:22:21 +08:00
maozhen	2c79d693fd	``` fix(embeddings): handle empty API key in LiteLLMEmbeddingEngine - Add conditional check for empty API key to prevent authentication errors - Set default API key to "EMPTY" when no valid key is provided - This ensures proper fallback behavior when API key is not configured ```	2026-01-04 15:18:43 +08:00
maozhen	e47fda4872	``` fix(auth): add error handling for JWT lifetime configuration - Add try-catch block to handle invalid JWT_LIFETIME_SECONDS environment variable - Default to 360 seconds when environment variable is not a valid integer - Apply same fix to both API and client authentication backends docs(docker): add security warning for CORS configuration - Add comment warning about default CORS_ALLOWED_ORIGINS setting - Emphasize need to override wildcard with specific domains in production ```	2026-01-04 11:08:42 +08:00
maozhen	5a77c36a95	``` refactor(auth): remove redundant comments from JWT strategy configuration Remove duplicate comments that were explaining the JWT lifetime configuration in both API and client authentication backends. The code remains functionally unchanged but comments are cleaned up for better maintainability. ```	2026-01-04 11:08:32 +08:00
maozhen	a7b114725a	``` feat(auth): make JWT token expiration configurable via environment variable - Add JWT_LIFETIME_SECONDS environment variable to configure token expiration - Set default expiration to 3600 seconds (1 hour) for both API and client auth backends - Remove hardcoded expiration values in favor of environment-based configuration - Add documentation comments explaining the JWT strategy configuration feat(auth): make cookie domain configurable via environment variable - Add AUTH_TOKEN_COOKIE_DOMAIN environment variable to configure cookie domain - When not set or empty, cookie domain defaults to None allowing cross-domain usage - Add documentation explaining cookie expiration is handled by JWT strategy - Update default_transport to use environment-based cookie domain feat(docker): add CORS_ALLOWED_ORIGINS environment variable - Add CORS_ALLOWED_ORIGINS environment variable with default value of '*' - Configure frontend to use NEXT_PUBLIC_BACKEND_API_URL environment variable - Set default backend API URL to http://localhost:8000 feat(docker): add restart policy to all services - Add restart: always policy to cognee, frontend, neo4j, chromadb, and postgres services - This ensures services automatically restart on failure or system reboot - Improves container reliability and uptime ```	2026-01-04 11:08:28 +08:00
Vasilije	a0f25f4f50	feat: redo notebook tutorials (#1922 ) <!-- .github/pull_request_template.md --> ## Description <!-- Please provide a clear, human-generated description of the changes in this PR. DO NOT use AI-generated descriptions. We want to understand your thought process and reasoning. --> ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. --> ## Type of Change <!-- Please check the relevant option --> - [ ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [ ] I have tested my changes thoroughly before submitting this PR - [ ] This PR contains minimal changes necessary to address the issue/feature - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [ ] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. 
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Two interactive tutorial notebooks added (Cognee Basics, Python Development) with runnable code and rich markdown; MarkdownPreview for rendered markdown; instance-aware notebook support and cloud proxy with API key handling; notebook CRUD (create, save, run, delete). * Bug Fixes * Improved authentication handling to treat 401/403 consistently. * Improvements * Auto-expanding text areas; better error propagation from dataset operations; migration to allow toggling deletability for legacy tutorial notebooks. * Tests * Expanded tests for tutorial creation and loading. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-01-01 14:44:04 +01:00
vasilije	8965e31a58	reformat	2025-12-31 13:57:48 +01:00
Vasilije	e5341c5f49	Support Structured Outputs with Llama CPP using LiteLLM & Instructor (#1949 ) <!-- .github/pull_request_template.md --> ## Description This PR adds support for structured outputs with llama cpp using litellm and instructor. It returns a Pydantic instance. Based on the GitHub issue described [here](https://github.com/topoteretes/cognee/issues/1947). It features the following: - works for both local and server modes (OpenAI api compatible) - defaults to `JSON` mode (not JSON schema mode, which is too rigid) - uses existing patterns around logging & tenacity decorator consistent with other adapters - Respects max_completion_tokens / max_tokens ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. --> I used the script below to test it with the [Phi-3-mini-4k-instruct model](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf). This tests a basic structured data extraction and a more complex one locally, then verifies that data extraction works in server mode. There are instructions in the script on how to set up the models. If you are testing this on a Mac, run `brew install llama.cpp` to get llama cpp working locally. If you don't have Apple silicon chips, you will need to alter the script or the configs to run this on GPU.
```
"""
Comprehensive test script for LlamaCppAPIAdapter - Tests LOCAL and SERVER modes

SETUP INSTRUCTIONS:
===================
1. Download a small model (pick ONE):
   # Phi-3-mini (2.3GB, recommended - best balance)
   wget https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf

   # OR TinyLlama (1.1GB, smallest but lower quality)
   wget https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf

2. For SERVER mode tests, start a server:
   python -m llama_cpp.server --model ./Phi-3-mini-4k-instruct-q4.gguf --port 8080 --n_gpu_layers -1
"""

import asyncio
import os
from typing import Optional

from pydantic import BaseModel

from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.llama_cpp.adapter import (
    LlamaCppAPIAdapter,
)


class Person(BaseModel):
    """Simple test model for person extraction"""

    name: str
    age: int


class EntityExtraction(BaseModel):
    """Test model for entity extraction"""

    entities: list[str]
    summary: str


# Configuration - UPDATE THESE PATHS
MODEL_PATHS = [
    "./Phi-3-mini-4k-instruct-q4.gguf",
    "./tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf",
]


def find_model() -> Optional[str]:
    """Find the first available model file, or None if none exists"""
    for path in MODEL_PATHS:
        if os.path.exists(path):
            return path
    return None


async def test_local_mode():
    """Test LOCAL mode (in-process, no server needed)"""
    print("=" * 70)
    print("Test 1: LOCAL MODE (In-Process)")
    print("=" * 70)

    model_path = find_model()
    if not model_path:
        print("❌ No model found! Download a model first:")
        print()
        return False

    print(f"Using model: {model_path}")

    try:
        adapter = LlamaCppAPIAdapter(
            name="LlamaCpp-Local",
            model_path=model_path,  # Local mode parameter
            max_completion_tokens=4096,
            n_ctx=2048,
            n_gpu_layers=-1,  # 0 for CPU, -1 for all GPU layers
        )
        print(f"✓ Adapter initialized in {adapter.mode_type.upper()} mode")
        print(" Sending request...")

        result = await adapter.acreate_structured_output(
            text_input="John Smith is 30 years old",
            system_prompt="Extract the person's name and age.",
            response_model=Person,
        )

        print(f"✅ Success!")
        print(f" Name: {result.name}")
        print(f" Age: {result.age}")
        print()
        return True
    except ImportError as e:
        print(f"❌ ImportError: {e}")
        print(" Install llama-cpp-python: pip install llama-cpp-python")
        print()
        return False
    except Exception as e:
        print(f"❌ Failed: {e}")
        print()
        return False


async def test_server_mode():
    """Test SERVER mode (localhost HTTP endpoint)"""
    print("=" * 70)
    print("Test 3: SERVER MODE (Localhost HTTP)")
    print("=" * 70)

    try:
        adapter = LlamaCppAPIAdapter(
            name="LlamaCpp-Server",
            endpoint="http://localhost:8080/v1",  # Server mode parameter
            api_key="dummy",
            model="Phi-3-mini-4k-instruct-q4.gguf",
            max_completion_tokens=1024,
            chat_format="phi-3",
        )
        print(f"✓ Adapter initialized in {adapter.mode_type.upper()} mode")
        print(f" Endpoint: {adapter.endpoint}")
        print(" Sending request...")

        result = await adapter.acreate_structured_output(
            text_input="Sarah Johnson is 25 years old",
            system_prompt="Extract the person's name and age.",
            response_model=Person,
        )

        print(f"✅ Success!")
        print(f" Name: {result.name}")
        print(f" Age: {result.age}")
        print()
        return True
    except Exception as e:
        print(f"❌ Failed: {e}")
        print(" Make sure llama-cpp-python server is running on port 8080:")
        print(" python -m llama_cpp.server --model your-model.gguf --port 8080")
        print()
        return False


async def test_entity_extraction_local():
    """Test more complex extraction with local mode"""
    print("=" * 70)
    print("Test 2: Complex Entity 
Extraction (Local Mode)") print("=" * 70) model_path = find_model() if not model_path: print("❌ No model found!") print() return False try: adapter = LlamaCppAPIAdapter( name="LlamaCpp-Local", model_path=model_path, max_completion_tokens=1024, n_ctx=2048, n_gpu_layers=-1, ) print(f"✓ Adapter initialized") print(" Sending complex extraction request...") result = await adapter.acreate_structured_output( text_input="Natural language processing (NLP) is a subfield of artificial intelligence (AI) and computer science.", system_prompt="Extract all technical entities mentioned and provide a brief summary.", response_model=EntityExtraction, ) print(f"✅ Success!") print(f" Entities: {', '.join(result.entities)}") print(f" Summary: {result.summary}") print() return True except Exception as e: print(f"❌ Failed: {e}") print() return False async def main(): """Run all tests""" print("\n" + "🦙" * 35) print("Llama CPP Adapter - Comprehensive Test Suite") print("Testing LOCAL and SERVER modes") print("🦙" * 35 + "\n") results = {} # Test 1: Local mode (no server needed) print("=" * 70) print("PHASE 1: Testing LOCAL mode (in-process)") print("=" * 70) print() results["local_basic"] = await test_local_mode() results["local_complex"] = await test_entity_extraction_local() # Test 2: Server mode (requires server on 8080) print("\n" + "=" * 70) print("PHASE 2: Testing SERVER mode (requires server running)") print("=" * 70) print() results["server"] = await test_server_mode() # Summary print("\n" + "=" * 70) print("TEST SUMMARY") print("=" * 70) for test_name, passed in results.items(): status = "✅ PASSED" if passed else "❌ FAILED" print(f" {test_name:20s}: {status}") passed_count = sum(results.values()) total_count = len(results) print() print(f"Total: {passed_count}/{total_count} tests passed") if passed_count == total_count: print("\n🎉 All tests passed! The adapter is working correctly.") elif results.get("local_basic"): print("\n✓ Local mode works! 
Server/cloud tests need llama-cpp-python server running.") else: print("\n⚠️ Please check setup instructions at the top of this file.") if __name__ == "__main__": asyncio.run(main()) ``` The following screenshots show the tests passing <img width="622" height="149" alt="image" src="https://github.com/user-attachments/assets/9df02f66-39a9-488a-96a6-dc79b47e3001" /> Test 1 <img width="939" height="750" alt="image" src="https://github.com/user-attachments/assets/87759189-8fd2-450f-af7f-0364101a5690" /> Test 2 <img width="938" height="746" alt="image" src="https://github.com/user-attachments/assets/61e423c0-3d41-4fde-acaf-ae77c3463d66" /> Test 3 <img width="944" height="232" alt="image" src="https://github.com/user-attachments/assets/f7302777-2004-447c-a2fe-b12762241ba9" /> note I also tried to test it with the `TinyLlama-1.1B-Chat` model but such a small model is bad at producing structured JSON consistently. ## Type of Change <!-- Please check the relevant option --> - [ ] Bug fix (non-breaking change that fixes an issue) - [ X] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) see above ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [X] I have tested my changes thoroughly before submitting this PR - [X] This PR contains minimal changes necessary to address the issue/feature - [X] My code follows the project's coding standards and style guidelines - [X] I have added tests that prove my fix is effective or that my feature works - [X] I have added necessary documentation (if applicable) - [X] All new and existing tests pass - [X] I have searched existing PRs to ensure this change hasn't been submitted already - [X] I have linked any relevant issues in the description - [X] My 
commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Llama CPP integration supporting local (in-process) and server (OpenAI‑compatible) modes. * Selectable provider with configurable model path, context size, GPU layers, and chat format. * Asynchronous structured-output generation with rate limiting, retries/backoff, and debug logging. * Chores * Added llama-cpp-python dependency and bumped project version. * Documentation * CONTRIBUTING updated with a “Running Simple Example” walkthrough for local/server usage. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-12-31 12:53:55 +01:00
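The description of #1949 above mentions JSON mode, retries, and returning a validated model instance. A minimal standard-library sketch of that parse, validate, and retry loop follows. It is not the adapter's code: the PR states the adapter uses instructor, tenacity, and Pydantic, whereas here a dataclass stands in for the Pydantic model, and `generate`, `parse_person`, and `structured_output` are hypothetical names introduced only for illustration.

```python
import json
from dataclasses import dataclass, fields
from typing import Callable


@dataclass
class Person:
    """Stand-in for the Pydantic `Person` model in the test script."""
    name: str
    age: int


def parse_person(raw: str) -> Person:
    """Parse a JSON object and check that each field is present with the right type."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    kwargs = {}
    for field in fields(Person):
        value = data.get(field.name)
        if not isinstance(value, field.type):
            raise ValueError(f"field {field.name!r} is missing or has the wrong type")
        kwargs[field.name] = value
    return Person(**kwargs)


def structured_output(
    generate: Callable[[str], str],  # hypothetical stand-in for the model call
    prompt: str,
    max_attempts: int = 3,
) -> Person:
    """Call the model until its reply validates, up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_person(generate(prompt))
        except ValueError as error:
            last_error = error
    raise RuntimeError(f"no valid reply after {max_attempts} attempts") from last_error
```

In this sketch a reply that is not valid JSON, or that lacks a field, simply triggers another attempt; the real adapter may well handle retries and backoff differently.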
dgarnitz	dd639fa967	update lock file	2025-12-30 16:59:59 -08:00
dgarnitz	d578971b60	add support for structured outputs with llama cpp via instructor and litellm	2025-12-30 16:37:31 -08:00
vasilije	27f2aa03b3	added fixes to litellm	2025-12-28 21:48:01 +01:00
Vasilije	310e9e97ae	feat: list vector distance in cogneegraph (#1926 ) <!-- .github/pull_request_template.md --> ## Description <!-- Please provide a clear, human-generated description of the changes in this PR. DO NOT use AI-generated descriptions. We want to understand your thought process and reasoning. --> - `map_vector_distances_to_graph_nodes` and `map_vector_distances_to_graph_edges` accept both single-query (flat list) and multi-query (nested list) inputs. - `query_list_length` controls the mode: omit it for single-query behavior, or provide it to enable multi-query mode with strict length validation and per-query results. - `vector_distance` on `Node` and `Edge` is now a list (one distance per query). Constructors set it to `None`, and `reset_distances` initializes it at the start of each search. - `Node.update_distance_for_query` and `Edge.update_distance_for_query` are the only methods that write to `vector_distance`. They ensure the list has enough elements and keep unmatched queries at the penalty value. - `triplet_distance_penalty` is the default distance value used everywhere. Unmatched nodes/edges and missing scores all use this same penalty for consistency. - `edges_by_distance_key` is an index mapping edge labels to matching edges. This lets us update all edges with the same label at once, instead of scanning the full edge list repeatedly. - `calculate_top_triplet_importances` returns `List[Edge]` for single-query mode and `List[List[Edge]]` for multi-query mode. ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. 
--> ## Type of Change <!-- Please check the relevant option --> - [ ] Bug fix (non-breaking change that fixes an issue) - [x] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [x] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [x] I have tested my changes thoroughly before submitting this PR - [x] This PR contains minimal changes necessary to address the issue/feature - [x] My code follows the project's coding standards and style guidelines - [x] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [x] All new and existing tests pass - [x] I have searched existing PRs to ensure this change hasn't been submitted already - [x] I have linked any relevant issues in the description - [x] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Multi-query support for mapping/scoring node and edge distances and a configurable triplet distance penalty. * Distance-keyed edge indexing for more accurate distance-to-edge matching. * Refactor * Vector distance metadata changed from scalars to per-query lists; added reset/normalization and per-query update flows. * Node/edge distance initialization now supports deferred/listed distances. * Tests * Updated and expanded tests for multi-query flows, list-based distances, edge-key handling, and related error cases. 
<sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-12-23 14:47:27 +01:00
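The per-query distance handling described in #1926 above can be pictured with a small sketch. It mirrors only the behaviour listed in the description (a list-valued `vector_distance`, `reset_distances`, `update_distance_for_query`, a shared penalty, and `query_list_length` switching between single-query and multi-query input). `SketchNode`, `map_distances_sketch`, the `(node_id, score)` input shape, and the penalty value 3.5 are assumptions made for illustration, not cognee's actual classes or defaults.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

TRIPLET_DISTANCE_PENALTY = 3.5  # hypothetical value; the real default is not stated above


@dataclass
class SketchNode:
    """Stand-in for a graph node; not cognee's Node class."""
    node_id: str
    vector_distance: Optional[List[float]] = None  # the constructor leaves it unset

    def reset_distances(self, query_count: int, penalty: float) -> None:
        # Start each search with one penalty slot per query.
        self.vector_distance = [penalty] * query_count

    def update_distance_for_query(
        self, query_index: int, score: float, query_count: int, penalty: float
    ) -> None:
        # Make sure the list has enough slots; unmatched queries keep the penalty.
        if self.vector_distance is None:
            self.vector_distance = []
        missing = query_count - len(self.vector_distance)
        if missing > 0:
            self.vector_distance.extend([penalty] * missing)
        self.vector_distance[query_index] = score


def map_distances_sketch(
    nodes: Dict[str, SketchNode],
    scored,  # flat list of (node_id, score), or one such list per query
    query_list_length: Optional[int] = None,
    penalty: float = TRIPLET_DISTANCE_PENALTY,
) -> None:
    """Write scores onto nodes; omit query_list_length for single-query input."""
    if query_list_length is None:
        per_query = [scored]  # single-query mode: wrap the flat list
    else:
        if len(scored) != query_list_length:
            raise ValueError("number of result lists must equal query_list_length")
        per_query = scored
    query_count = len(per_query)
    for node in nodes.values():
        node.reset_distances(query_count, penalty)
    for query_index, results in enumerate(per_query):
        for node_id, score in results:
            node = nodes.get(node_id)
            if node is not None:
                node.update_distance_for_query(query_index, score, query_count, penalty)
```

With two queries where only the second one matches node `b`, `b` ends up with the penalty in slot 0 and its score in slot 1, which is the "unmatched queries stay at the penalty value" rule from the description.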
Hande	5f8a3e24bd	refactor: restructure examples and starter kit into new-examples (#1862 ) <!-- .github/pull_request_template.md --> ## Description <!-- Please provide a clear, human-generated description of the changes in this PR. DO NOT use AI-generated descriptions. We want to understand your thought process and reasoning. --> ## Type of Change <!-- Please check the relevant option --> - [ ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [x] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [ ] I have tested my changes thoroughly before submitting this PR - [ ] This PR contains minimal changes necessary to address the issue/feature - [ ] My code follows the project's coding standards and style guidelines - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I have added necessary documentation (if applicable) - [ ] All new and existing tests pass - [ ] I have searched existing PRs to ensure this change hasn't been submitted already - [ ] I have linked any relevant issues in the description - [ ] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin. 
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Documentation * Deprecated legacy examples and added a migration guide mapping old paths to new locations * Added a comprehensive new-examples README detailing configurations, pipelines, demos, and migration notes * New Features * Added many runnable examples and demos: database configs, embedding/LLM setups, permissions and access-control, custom pipelines (organizational, product recommendation, code analysis, procurement), multimedia, visualization, temporal/ontology demos, and a local UI starter * Chores * Updated CI/test entrypoints to use the new-examples layout <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>	2025-12-20 02:07:28 +01:00
lxobr	f6c76ce19e	chore: remove duplicate import	2025-12-19 16:24:49 +01:00
lxobr	c3cec818d7	fix: update tests	2025-12-19 16:22:47 +01:00
lxobr	9808077b4c	nit: update variable names	2025-12-19 15:35:34 +01:00
Vasilije	9b2b1a9c13	chore: covering higher level search logic with tests (#1910 ) <!-- .github/pull_request_template.md --> ## Description This PR covers the higher level search.py logic with unit tests. As a part of the implementation we fully cover the following core logic: - search.py - get_search_type_tools (with all the core search types) - search - prepare_search_results contract (testing behavior from search.py interface) ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. --> ## Type of Change <!-- Please check the relevant option --> - [ ] Bug fix (non-breaking change that fixes an issue) - [x] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [x] I have tested my changes thoroughly before submitting this PR - [x] This PR contains minimal changes necessary to address the issue/feature - [x] My code follows the project's coding standards and style guidelines - [x] I have added tests that prove my fix is effective or that my feature works - [x] I have added necessary documentation (if applicable) - [x] All new and existing tests pass - [x] I have searched existing PRs to ensure this change hasn't been submitted already - [x] I have linked any relevant issues in the description - [x] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes 
Developer Certificate of Origin. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Tests * Added comprehensive unit test coverage for search functionality, including search type tool selection, search operations, and result preparation workflows across multiple scenarios and edge cases. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-12-19 14:22:54 +01:00
Vasilije	16cf955497	feat: adds multitenant tests via pytest (#1923 ) <!-- .github/pull_request_template.md --> ## Description This PR changes the permission test in e2e tests to use pytest. Introduces: - fixtures for the environment setup - one eventloop for all pytest tests - mocking for acreate_structured_output answer generation (for search) - Asserts in permission test (before we use the example only) ## Acceptance Criteria <!-- * Key requirements to the new feature or modification; * Proof that the changes work and meet the requirements; * Include instructions on how to verify the changes. Describe how to test it locally; * Proof that it's sufficiently tested. --> ## Type of Change <!-- Please check the relevant option --> - [ ] Bug fix (non-breaking change that fixes an issue) - [ ] New feature (non-breaking change that adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [x] Code refactoring - [ ] Performance improvement - [ ] Other (please specify): ## Screenshots/Videos (if applicable) <!-- Add screenshots or videos to help explain your changes --> ## Pre-submission Checklist <!-- Please check all boxes that apply before submitting your PR --> - [x] I have tested my changes thoroughly before submitting this PR - [x] This PR contains minimal changes necessary to address the issue/feature - [x] My code follows the project's coding standards and style guidelines - [x] I have added tests that prove my fix is effective or that my feature works - [x] I have added necessary documentation (if applicable) - [x] All new and existing tests pass - [x] I have searched existing PRs to ensure this change hasn't been submitted already - [x] I have linked any relevant issues in the description - [x] My commits have clear and descriptive messages ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of 
Origin. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Entity model now includes description and metadata fields for richer entity information and indexing. * Tests * Expanded and restructured permission tests covering multi-tenant and role-based access flows; improved test scaffolding and stability. * E2E test workflow now runs pytest with verbose output and INFO logs. * Bug Fixes * Access-tracking updates now commit transactions so access timestamps persist. * Chores * General formatting, cleanup, and refactoring across modules and maintenance scripts. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-12-19 14:16:01 +01:00
Igor Ilic	2c4f9b07ac	fix: Resolve migration issue	2025-12-19 13:35:14 +01:00
lxobr	a85df53c74	chore: tweak mapping and scoring	2025-12-19 13:14:50 +01:00

4771 commits