cognee/cognee/tasks/chunks
alekszievr c1f7b667d1
feat: Eliminate the use of max_chunk_tokens and use a unified max_chunk_size instead [cog-1381] (#626)

## Description

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **Refactor**
  - Simplified text processing by unifying multiple size-related parameters into a single metric across chunking and extraction functionalities.
  - Streamlined logic for text segmentation by removing redundant calculations and checks, resulting in a more consistent chunk management process.
- **Chores**
  - Removed the `modal` package as a dependency.
- **Documentation**
  - Updated the README.md to include a new demo video link and clarified default environment variable settings.
  - Enhanced the CONTRIBUTING.md to improve clarity and engagement for potential contributors.
- **Bug Fixes**
  - Improved handling of sentence-ending punctuation in text processing to include additional characters.
- **Version Update**
  - Updated project version to 0.1.33 in the pyproject.toml file.
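
The refactor summarized above replaces separate token and word limits with a single `max_chunk_size`. The sketch below only illustrates that idea and is not the code from this commit: `split_sentences`, `chunk_text`, the `size_of` callback, and the set of sentence-ending characters are all hypothetical names and choices made for the example. The size is measured by one pluggable function, so the chunker checks a single limit.

```python
import re
from typing import Callable, Iterator

# Assumed set of sentence terminators for this example only; the characters
# actually handled by cognee are not listed in the commit summary.
SENTENCE_END = re.compile(r"(?<=[.!?…])\s+")


def split_sentences(text: str) -> list[str]:
    """Split text on whitespace that follows a sentence-ending character."""
    return [s for s in SENTENCE_END.split(text.strip()) if s]


def chunk_text(
    text: str,
    max_chunk_size: int,
    size_of: Callable[[str], int] = lambda s: len(s.split()),
) -> Iterator[str]:
    """Greedily pack sentences into chunks bounded by one size limit.

    size_of defaults to a word count; a token counter could be passed
    instead without changing the chunking logic. A single sentence that
    exceeds max_chunk_size is emitted on its own as an oversized chunk.
    """
    current: list[str] = []
    current_size = 0
    for sentence in split_sentences(text):
        sentence_size = size_of(sentence)
        # Close the current chunk if adding this sentence would exceed the limit.
        if current and current_size + sentence_size > max_chunk_size:
            yield " ".join(current)
            current, current_size = [], 0
        current.append(sentence)
        current_size += sentence_size
    if current:
        yield " ".join(current)


sample = "One two three. Four five! Six seven eight nine? Ten… Eleven twelve."
chunks = list(chunk_text(sample, max_chunk_size=5))
```

Passing a different `size_of` (for example, a tokenizer-based counter) changes how size is measured while the limit itself stays a single parameter, which is the design point the summary describes.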
2025-03-12 14:03:41 +01:00
__init__.py Transition to new retrievers, update searches (#585) 2025-02-27 15:25:24 +01:00
chunk_by_paragraph.py feat: Eliminate the use of max_chunk_tokens and use a unified max_chunk_size instead [cog-1381] (#626) 2025-03-12 14:03:41 +01:00
chunk_by_sentence.py feat: Eliminate the use of max_chunk_tokens and use a unified max_chunk_size instead [cog-1381] (#626) 2025-03-12 14:03:41 +01:00
chunk_by_word.py feat: Eliminate the use of max_chunk_tokens and use a unified max_chunk_size instead [cog-1381] (#626) 2025-03-12 14:03:41 +01:00
remove_disconnected_chunks.py add docstrings any typing to cognee tasks 2025-01-17 10:30:34 +01:00