• Replace truthy checks with `is not None`
• Handle empty dict edge case properly
• Prevent data reload failures
• Add comprehensive test coverage
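
The difference between the two checks can be sketched as follows. The function names `load_truthy` and `load_explicit` are hypothetical and exist only for illustration; they are not taken from the storage classes.

```python
from typing import Any, Optional


def load_truthy(loaded: Optional[dict[str, Any]]) -> dict[str, Any]:
    # Old pattern: an empty dict is falsy, so valid-but-empty data is
    # treated the same as "nothing was loaded".
    if loaded:
        return loaded
    raise ValueError("reload failed")


def load_explicit(loaded: Optional[dict[str, Any]]) -> dict[str, Any]:
    # New pattern: only None means that the load produced nothing.
    if loaded is not None:
        return loaded
    raise ValueError("reload failed")
```

With an empty dict as input, `load_truthy` raises while `load_explicit` returns the dict unchanged, which is the edge case the commit targets.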
• Fix JsonKVStorage and DocStatusStorage
• Reload cleaned data after sanitization
• Update shared memory with clean data
• Add specific surrogate char tests
• Test migration sanitization flow
• Prevent dirty data in memory
- Fast path for clean data (no sanitization)
- Slow path sanitizes during encoding
- Reload shared memory after sanitization
- Custom encoder avoids deep copies
- Comprehensive test coverage
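
A minimal sketch of the fast-path / slow-path idea, under stated assumptions: the helper name `dumps_utf8` is hypothetical, and the slow path here strips lone surrogates from the serialized text with a regex, which is a simplified stand-in for the custom encoder mentioned above rather than a copy of it.

```python
import json
import re
from typing import Any

# Lone UTF-16 surrogates (U+D800..U+DFFF) cannot be encoded as UTF-8.
_SURROGATES = re.compile("[\ud800-\udfff]")


def dumps_utf8(data: Any) -> tuple[bytes, bool]:
    """Return (encoded JSON, True if sanitization was needed)."""
    text = json.dumps(data, ensure_ascii=False)
    try:
        # Fast path: clean data encodes directly, no extra work.
        return text.encode("utf-8"), False
    except UnicodeEncodeError:
        # Slow path: drop the offending code points from the serialized
        # text, so the input object is never deep-copied.
        return _SURROGATES.sub("", text).encode("utf-8"), True
```

When the second return value is `True`, the caller would reload the written data into memory so that the in-memory copy matches what was persisted.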
- Add entity_chunks & relation_chunks storage
- Implement KEEP/FIFO limit strategies
- Update env.example with new settings
- Add migration for chunk tracking data
- Support all KV storage implementations
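
One way the two limit strategies could behave is sketched below. The semantics are an assumption, since the commit message does not spell them out: KEEP is taken to mean "keep the oldest entries and ignore new ones once the limit is reached", and FIFO to mean "drop the oldest entries to make room". The function name `apply_limit` is hypothetical.

```python
def apply_limit(
    existing: list[str], new: list[str], limit: int, method: str = "KEEP"
) -> list[str]:
    """Merge chunk ids and cap the result at `limit` entries."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # De-duplicate while preserving insertion order (oldest first).
    merged = list(dict.fromkeys(existing + new))
    if len(merged) <= limit:
        return merged
    if method == "KEEP":
        # Assumed semantics: the oldest ids win, newer ones are ignored.
        return merged[:limit]
    if method == "FIFO":
        # Assumed semantics: the oldest ids are evicted first.
        return merged[-limit:]
    raise ValueError(f"unknown limit method: {method!r}")
```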
- Add StorageNotInitializedError and PipelineNotInitializedError exceptions
- Update JsonDocStatusStorage to raise clear errors when not initialized
- Update JsonKVStorage to raise clear errors when not initialized
- Error messages now include complete initialization instructions
- Helps users understand and fix initialization issues quickly
Addresses feedback from issue #1933 about improving error clarity
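
The pattern can be illustrated with a small sketch. Only the exception name comes from this commit; the message text, the `DemoKVStorage` class, and its methods are hypothetical stand-ins, not the actual storage code.

```python
from typing import Any, Optional


class StorageNotInitializedError(RuntimeError):
    """Raised when a storage is used before it has been initialized."""

    def __init__(self, storage_type: str = "Storage") -> None:
        # Hypothetical message; the real one carries fuller instructions.
        super().__init__(
            f"{storage_type} not initialized. "
            "Call initialize() on the storage before reading or writing."
        )


class DemoKVStorage:
    """Hypothetical storage used only to show the guard."""

    def __init__(self) -> None:
        self._data: Optional[dict[str, Any]] = None

    def initialize(self) -> None:
        if self._data is None:
            self._data = {}

    def get_by_id(self, key: str) -> Any:
        # Fail with a clear, actionable error instead of an AttributeError.
        if self._data is None:
            raise StorageNotInitializedError("DemoKVStorage")
        return self._data.get(key)
```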
Refactored the LLM cache into a flat key-value (KV) structure, replacing the previous nested format. The old structure used the 'mode' as the top-level key and stored the cache entries as JSON nested under it. With the flat layout, each entry is addressed by a single key, which improves cache recall efficiency.
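
A sketch of what such a migration might look like. The flat key format `mode:hash` and the function name `flatten_cache` are assumptions made for illustration; the commit message does not state the actual key layout.

```python
from typing import Any


def flatten_cache(nested: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Turn {mode: {hash: entry}} into {"mode:hash": entry}."""
    return {
        f"{mode}:{entry_hash}": entry  # assumed flat key format
        for mode, entries in nested.items()
        for entry_hash, entry in entries.items()
    }
```

In the nested layout a lookup needs the mode's sub-dictionary first; in the flat layout a single key access is enough.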
- Change the delete implementation in JsonKVStorage so that it no longer persists immediately
- Update documentation for drop method to clarify persistence behavior
- Add abstract delete method to BaseKVStorage
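
The deferred-persistence behavior can be sketched as below. The class names `BaseKV` and `InMemoryKV`, the `flush` method, and the `persisted` attribute (a stand-in for the file on disk) are all hypothetical and do not mirror the real classes.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any


class BaseKV(ABC):
    @abstractmethod
    async def delete(self, ids: list[str]) -> None:
        """Remove the given ids; persistence happens separately."""


class InMemoryKV(BaseKV):
    def __init__(self, data: dict[str, Any]) -> None:
        self.data = dict(data)
        self.persisted = dict(data)  # stand-in for the on-disk copy
        self.dirty = False

    async def delete(self, ids: list[str]) -> None:
        # Only mutate memory and mark the store as changed.
        for key in ids:
            if key in self.data:
                del self.data[key]
                self.dirty = True

    async def flush(self) -> None:
        # Persistence is a separate, explicit step.
        if self.dirty:
            self.persisted = dict(self.data)
            self.dirty = False
```

Separating the two steps lets several deletes be batched into one write.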
In single-process mode, data updates and persistence were not working properly because the update flags were not being correctly handled between different objects.
This commit adds multiprocessing shared memory support to file-based storage implementations:
- JsonDocStatusStorage
- JsonKVStorage
- NanoVectorDBStorage
- NetworkXStorage
Each storage module now uses module-level global variables with multiprocessing.Manager() to ensure data consistency across multiple uvicorn workers. All processes see updates immediately when data is modified through the `ainsert` function.