cognee/cognee
Vasilije 559d5009f7
feat: Batch document handling (#1469)

## Description
Add a batch system for document processing that limits the number of
documents processed in parallel in Cognee.
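
The idea of capping parallel document processing can be sketched as follows. This is a minimal illustration, not the code from this PR: `make_batches`, `process_in_batches`, `process_document`, and the default `batch_size` are hypothetical names and values chosen for the example, and it assumes each document is handled by an async callable.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def make_batches(items: Iterable[T], batch_size: int) -> List[List[T]]:
    """Split items into consecutive lists of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    materialized = list(items)
    return [
        materialized[i : i + batch_size]
        for i in range(0, len(materialized), batch_size)
    ]


async def process_in_batches(
    documents: Iterable[T],
    process_document: Callable[[T], Awaitable[R]],  # hypothetical per-document worker
    batch_size: int = 10,  # hypothetical default, not taken from the PR
) -> List[R]:
    """Process documents one batch at a time, preserving input order."""
    results: List[R] = []
    for batch in make_batches(documents, batch_size):
        # Only the documents of the current batch run concurrently;
        # the next batch starts after every document in this one finishes.
        batch_results = await asyncio.gather(
            *(process_document(document) for document in batch)
        )
        results.extend(batch_results)
    return results
```

With this shape, at most `batch_size` documents are in flight at once. A semaphore-based limiter would keep the pipeline fuller, since a new document could start as soon as any slot frees up, but fixed batches are simpler to reason about and to log.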

## Type of Change
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [x] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-10-18 09:48:52 +02:00
api feat: Batch document handling (#1469) 2025-10-18 09:48:52 +02:00
cli Merge branch 'dev' into feat/mcp-add-support-for-non-standalone-mode 2025-10-12 13:57:55 +02:00
eval_framework refactor: Add proper pip install command for optional extras 2025-09-25 13:55:01 +02:00
exceptions
infrastructure Merge branch 'dev' into fix/search-without-prior-cognify 2025-10-17 12:09:47 +01:00
modules refactor emptiness check to be boolean, and optimize query 2025-10-17 12:01:06 +01:00
shared Merge branch 'main' into main-merge-vol7 2025-10-07 19:56:38 +02:00
tasks Change error logging to warning for missing playwright and protego imports in bs4_crawler.py 2025-10-14 12:47:41 +01:00
tests tests: update tests after last refactoring 2025-10-17 14:18:47 +01:00
__init__.py refactor: Add test for updating of docs and visualization 2025-09-30 18:12:22 +02:00
__main__.py
base_config.py fix: ruff formatting error 2025-09-19 17:55:08 +02:00
context_global_variables.py Added global context for bs4crawler and tavily config 2025-10-04 19:40:37 +05:30
get_token.py
low_level.py
pipelines.py
root_dir.py fix: Add S3 URL handling in ensure_absolute_path function (#1438) 2025-09-18 11:47:34 +02:00
version.py