Commit graph

3 commits

Author SHA1 Message Date
yangdx
fec7c67f45 Add comprehensive chunking tests with multi-token tokenizer edge cases
• Add MultiTokenCharacterTokenizer for testing
• Test token vs character counting accuracy
• Verify delimiter splitting precision
• Test overlap with distinctive content
• Add decode content preservation tests
2025-11-19 19:31:36 +08:00
yangdx
5733292557 Add comprehensive tests for chunking with recursive splitting
- Test recursive split mode
- Add edge case coverage
- Test parameter combinations
- Verify chunk order indexing
- Add integration test scenarios
2025-11-19 19:08:50 +08:00
yangdx
f988a22652 Add token limit validation for character-only chunking
- Add ChunkTokenLimitExceededError exception
- Validate chunks against token limits
- Include chunk preview in error messages
- Add comprehensive test coverage
- Log warnings for oversized chunks
2025-11-19 18:32:43 +08:00
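
The token limit validation described in f988a22652, together with the multi-token tokenizer idea from fec7c67f45, might look roughly like the sketch below. Only the names `ChunkTokenLimitExceededError` and `MultiTokenCharacterTokenizer` come from the commit messages. The function `split_by_character`, its parameters, the exception's fields, and the rule that uppercase letters cost two tokens are assumptions made for illustration, not the repository's actual code.

```python
class ChunkTokenLimitExceededError(ValueError):
    """Raised when a chunk exceeds the token limit.

    The class name is taken from the commit message; the fields are assumed.
    """

    def __init__(self, chunk_tokens, chunk_token_limit, chunk_preview):
        self.chunk_tokens = chunk_tokens
        self.chunk_token_limit = chunk_token_limit
        self.chunk_preview = chunk_preview
        super().__init__(
            f"Chunk has {chunk_tokens} tokens, limit is {chunk_token_limit}. "
            f"Preview: {chunk_preview!r}"
        )


class MultiTokenCharacterTokenizer:
    """Toy tokenizer for tests (assumed rule): uppercase letters cost
    two tokens, every other character costs one."""

    def encode(self, text):
        tokens = []
        for ch in text:
            if ch.isupper():
                tokens.extend([ord(ch), 0])  # two tokens for one character
            else:
                tokens.append(ord(ch))
        return tokens


def split_by_character(text, delimiter, tokenizer, chunk_token_limit,
                       preview_len=20):
    """Hypothetical character-only chunker: split on `delimiter` and
    reject any chunk whose token count exceeds `chunk_token_limit`."""
    chunks = []
    for piece in text.split(delimiter):
        piece = piece.strip()
        if not piece:
            continue  # skip empty pieces produced by repeated delimiters
        n_tokens = len(tokenizer.encode(piece))
        if n_tokens > chunk_token_limit:
            # Include a short preview so the offending chunk is easy to find.
            raise ChunkTokenLimitExceededError(
                n_tokens, chunk_token_limit, piece[:preview_len]
            )
        chunks.append(
            {
                "chunk_order_index": len(chunks),
                "tokens": n_tokens,
                "content": piece,
            }
        )
    return chunks


if __name__ == "__main__":
    tokenizer = MultiTokenCharacterTokenizer()
    # "ABC" is 3 characters but 6 tokens under the assumed rule.
    print(split_by_character("abc\n\nABC", "\n\n", tokenizer, 6))
```

A tokenizer whose token count differs from the character count makes it visible whether a chunker measures tokens or characters: with a limit of 5, the chunk `ABC` passes a character check but fails a token check.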