From 6f60ac76fd94582bb69b049271d0f9a011e2401f Mon Sep 17 00:00:00 2001
From: Daulet Amirkhanov
Date: Thu, 18 Sep 2025 10:47:34 +0100
Subject: [PATCH] fix: Add S3 URL handling in ensure_absolute_path function (#1438)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

## Summary
The `root_dir.py/ensure_absolute_path` validation in the `GraphConfig` model currently enforces that all paths start with `/`. This works for local file system paths but breaks when using S3 storage, since S3 paths do not begin with `/` and fail validation.

## Fix
This PR updates the `ensure_absolute_path` method to recognize and treat S3 paths as valid.

## Logs before
```
(.venv) daulet@Mac cognee-claude % cognee-cli -ui
2025-09-18T01:30:39.768877 [info ] Deleted old log file: /Users/daulet/Desktop/dev/cognee-claude/logs/2025-09-18_02-15-14.log [cognee.shared.logging_utils]
2025-09-18T01:30:40.391407 [error ] Exception [cognee.shared.logging_utils] exception_message="1 validation error for GraphConfig\n Value error, Path must be absolute. Got relative path: s3://daulet-personal-dev/cognee/data [type=value_error, input_value={'data_root_directory': '...45daea63bc5392d85746fb'}, input_type=dict]\n For further information visit https://errors.pydantic.dev/2.11/v/value_error" traceback=True
```

## Logs after
```
(.venv) daulet@Mac cognee-claude % cognee-cli -ui
2025-09-18T01:34:34.404642 [info ] Deleted old log file: /Users/daulet/Desktop/dev/cognee-claude/logs/2025-09-18_02-17-55.log [cognee.shared.logging_utils]
2025-09-18T01:34:35.026078 [info ] Logging initialized [cognee.shared.logging_utils] cognee_version=0.3.4.dev1-local database_path=s3://daulet-personal-dev/cognee/system/databases graph_database_name= os_info='Darwin 24.5.0 (Darwin Kernel Version 24.5.0: Tue Apr 22 19:54:43 PDT 2025; root:xnu-11417.121.6~2/RELEASE_ARM64_T8132)' python_version=3.10.11 relational_config=cognee_db structlog_version=25.4.0 vector_config=lancedb
2025-09-18T01:34:35.026223 [info ] Database storage: s3://daulet-personal-dev/cognee/system/databases [cognee.shared.logging_utils]
Starting cognee UI...
2025-09-18T01:34:36.105617 [info ] Starting cognee UI... [cognee.shared.logging_utils]
2025-09-18T01:34:36.105756 [info ] Starting cognee backend API server... [cognee.shared.logging_utils]
2025-09-18T01:34:37.522194 [info ] Logging initialized [cognee.shared.logging_utils] cognee_version=0.3.4.dev1-local database_path=s3://daulet-personal-dev/cognee/system/databases graph_database_name= os_info='Darwin 24.5.0 (Darwin Kernel Version 24.5.0: Tue Apr 22 19:54:43 PDT 2025; root:xnu-11417.121.6~2/RELEASE_ARM64_T8132)' python_version=3.10.11 relational_config=cognee_db structlog_version=25.4.0 vector_config=lancedb
2025-09-18T01:34:37.522376 [info ] Database storage: s3://daulet-personal-dev/cognee/system/databases [cognee.shared.logging_utils]
2025-09-18T01:34:38.115247 [info ] ✓ Backend API started at http://localhost:8000 [cognee.shared.logging_utils]
2025-09-18T01:34:38.198637 [info ] Starting frontend server at http://localhost:3000 [cognee.shared.logging_utils]
2025-09-18T01:34:38.198879 [info ] This may take a moment to compile and start... [cognee.shared.logging_utils]
INFO: Started server process [83608]
INFO: Waiting for application startup.
2025-09-18T01:34:39.781430 [warning ] Kuzu S3 storage file not found: s3://daulet-personal-dev/cognee/system/databases/cognee_graph_kuzu [cognee.shared.logging_utils]
2025-09-18T01:34:39.802516 [info ] Loaded JSON extension [cognee.shared.logging_utils]
2025-09-18T01:34:40.197857 [info ] Deleted Kuzu database files at s3://daulet-personal-dev/cognee/system/databases/cognee_graph_kuzu [cognee.shared.logging_utils]
2025-09-18T01:34:41.211523 [info ] ✓ Cognee UI is starting up... [cognee.shared.logging_utils]
2025-09-18T01:34:41.212561 [info ] ✓ Open your browser to: http://localhost:3000 [cognee.shared.logging_utils]
2025-09-18T01:34:41.212814 [info ] ✓ The UI will be available once Next.js finishes compiling [cognee.shared.logging_utils]
Success: UI server started successfully!
The interface is available at: http://localhost:3000
The API backend is available at: http://localhost:8000
Note: Press Ctrl+C to stop the server...
```

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.
---
 cognee/root_dir.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/cognee/root_dir.py b/cognee/root_dir.py
index 46d8fcb69..b10f7507c 100644
--- a/cognee/root_dir.py
+++ b/cognee/root_dir.py
@@ -20,6 +20,11 @@ def ensure_absolute_path(path: str) -> str:
     """
     if path is None:
         raise ValueError("Path cannot be None")
+
+    # Check if it's an S3 URL - S3 URLs are absolute by definition
+    if path.startswith("s3://"):
+        return path
+
     path_obj = Path(path).expanduser()
     if path_obj.is_absolute():
         return str(path_obj.resolve())
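
## Standalone sketch of the patched check
The patched behaviour can be tried outside the repository with a minimal sketch like the one below. It mirrors only the lines visible in the hunk. The final `raise` for relative paths is an assumption reconstructed from the error message in "Logs before", not copied from `cognee/root_dir.py`, and the bucket name `example-bucket` is hypothetical.

```python
from pathlib import Path


def ensure_absolute_path(path: str) -> str:
    """Return an absolute local path, or pass an S3 URL through unchanged."""
    if path is None:
        raise ValueError("Path cannot be None")

    # S3 URLs name a bucket and key explicitly, so they are treated as
    # absolute and returned without any local-filesystem processing.
    if path.startswith("s3://"):
        return path

    path_obj = Path(path).expanduser()
    if path_obj.is_absolute():
        return str(path_obj.resolve())

    # Assumption: the wording follows the message seen in "Logs before";
    # the real function body past this point is not shown in the diff.
    raise ValueError(f"Path must be absolute. Got relative path: {path}")


if __name__ == "__main__":
    # Hypothetical bucket name, used only for illustration.
    print(ensure_absolute_path("s3://example-bucket/cognee/data"))
```

The check matches the literal prefix `s3://` only, as in the patch. Other remote schemes or an upper-case `S3://` would still go through the local-path branch and be rejected as relative.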