cognee/cognee/api/v1
hajdul88 4f07adee66
chore: fixes get_raw_data endpoint and adds s3 support (#1916)

## Description
This PR fixes the `get_raw_data` endpoint in `get_dataset_router`:

- Fixes local path access
- Adds s3 access
- Covers new fixed functionality with unit tests
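
As a rough illustration of the two access paths listed above, the sketch below classifies a stored raw-data location as local or S3 and reads a binary stream in fixed-size chunks. It is a minimal sketch, not the code from this PR: `RawDataLocation`, `resolve_raw_data_location`, and `iter_chunks` are hypothetical names, and the assumption that locations are stored as plain paths, `file://` URIs, or `s3://bucket/key` URIs is ours.

```python
from dataclasses import dataclass
from typing import BinaryIO, Iterator, Optional
from urllib.parse import unquote, urlparse


@dataclass(frozen=True)
class RawDataLocation:
    """Hypothetical container describing where a raw data file lives."""

    kind: str  # "local" or "s3"
    bucket: Optional[str]  # S3 bucket name, None for local files
    path: str  # filesystem path or S3 object key


def resolve_raw_data_location(location: str) -> RawDataLocation:
    """Classify a stored location string (assumed formats, see lead-in)."""
    parsed = urlparse(location)
    if parsed.scheme == "s3":
        # s3://bucket/key -> bucket is the netloc, key is the path without the leading slash
        return RawDataLocation("s3", parsed.netloc, parsed.path.lstrip("/"))
    if parsed.scheme == "file":
        # file:///tmp/a.txt -> /tmp/a.txt (percent-escapes decoded)
        return RawDataLocation("local", None, unquote(parsed.path))
    # Anything else is treated as a plain filesystem path and passed through unchanged
    return RawDataLocation("local", None, location)


def iter_chunks(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield a binary stream in chunks so a large file is never loaded whole."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk
```

In a FastAPI router, a generator like `iter_chunks` could be handed to `StreamingResponse` for the S3 case, while a local path could be served with `FileResponse`. Whether the PR does exactly this is not stated here.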

## Type of Change
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


## Summary by CodeRabbit

* **New Features**
  * Streaming support for remote S3 data locations so large dataset files can be retrieved efficiently.
  * Improved handling of local and remote file paths for downloads.

* **Improvements**
  * Standardized error responses for missing datasets or data files.

* **Tests**
  * Added unit tests covering local file downloads and S3 streaming, including content and attachment header verification.

2025-12-18 16:10:05 +01:00
| Name | Last commit | Date |
| --- | --- | --- |
| add | fix: Resolve issues with data label PR, add tests and upgrade migration | 2025-12-16 20:59:17 +01:00 |
| cloud/routers | fix: resolve formatting issue (#1486) | 2025-09-30 18:12:57 +02:00 |
| cognify | feat: Add database deletion on dataset delete (#1893) | 2025-12-15 18:15:48 +01:00 |
| config | feat: api error handling restruct | 2025-08-13 13:59:12 +02:00 |
| datasets | chore: fixes get_raw_data endpoint and adds s3 support (#1916) | 2025-12-18 16:10:05 +01:00 |
| delete | ruff fix | 2025-10-24 15:37:31 +02:00 |
| exceptions | ruff formatting | 2025-08-13 15:15:39 +02:00 |
| memify | fix(api): pass run_in_background parameter to memify function | 2025-11-28 12:40:53 -05:00 |
| notebooks/routers | fix: UI (#1397) | 2025-09-12 20:06:44 +02:00 |
| ontologies | chore: introduces 1 file upload in ontology endpoint (#1899) | 2025-12-15 18:30:35 +01:00 |
| permissions/routers | refactor: update docstring message | 2025-11-10 16:23:34 +01:00 |
| prune | feat: add welcome tutorial notebook for new users (#1425) | 2025-09-18 18:07:05 +02:00 |
| responses | Remove all references to SearchType.INSIGHTS across the codebase, meaningfully replacing it with SearchType.GRAPH_COMPLETION where applicable. | 2025-10-08 12:13:59 +01:00 |
| search | feature: Introduces wide subgraph search in graph completion and improves QA speed (#1736) | 2025-11-26 15:18:53 +01:00 |
| settings/routers | Added Mistral support as LLM provider using litellm | 2025-10-12 11:44:33 +02:00 |
| sync | ruff fix | 2025-10-24 15:37:31 +02:00 |
| ui | fix: install latest nvm version | 2025-12-02 10:48:28 +01:00 |
| update | ruff fix | 2025-10-24 15:37:31 +02:00 |
| users | ruff fix | 2025-10-24 15:37:31 +02:00 |
| visualize | Jspv structlog auto config fix (#907) | 2025-06-11 09:26:23 -04:00 |
| \_\_init\_\_.py | chore: rename package | 2024-03-13 16:08:11 +01:00 |