- Split long function calls across multiple lines
- Split long function definitions across multiple lines
- Add blank line after docstring in test function
These changes are purely formatting to comply with the project's
linting standards (black/ruff). No functional changes.
Why this change is needed:
The current locking system uses global locks shared across all users
and workspaces, causing blocking issues in multi-tenant scenarios.
When one tenant performs document indexing, all other tenants are
blocked waiting for the same global lock. This severely limits
the system's ability to serve multiple users concurrently.
How it solves it:
- Add optional `workspace` parameter to 5 lock functions
- Implement lazy creation of workspace-specific locks with proper synchronization
- Store workspace locks in new `_sync_locks` dictionary
- Support both multi-process (RLock) and single-process (asyncio.Lock) modes
- Empty workspace parameter uses global lock for backward compatibility
- Extract common logic into `_get_workspace_lock()` to eliminate duplication
Impact:
- Enables concurrent operations across different workspaces
- Foundation for PR2 (pipeline status isolation)
- Zero impact on existing code (all parameters optional with defaults)
- Each workspace now has independent lock instances
- Thread-safe lazy creation using _registry_guard in multiprocess mode
- Automatic creation of async_locks for workspace locks in multiprocess mode
Code Quality Improvements (Linus review feedback):
- Fixed race condition: lazy creation protected by _registry_guard
- Eliminated code duplication: common logic extracted to _get_workspace_lock()
- Added async_lock support: workspace locks now have companion async_locks
- Handles None workspace parameter gracefully
- Clear separation of concerns: one function handles all workspace logic
Testing:
- 17 new test cases covering:
- Basic functionality and naming
- Workspace isolation and independence
- Backward compatibility with empty workspace
- Concurrent operations (3 workspaces in parallel)
- Performance (1000 workspace lock creation <2s)
- Edge cases (special characters, unicode, long names)
- All existing tests pass (21/21 excluding env issues)
- Verified lock serialization within workspace
- Verified lock independence across workspaces
Files modified:
- lightrag/kg/shared_storage.py: refactored lock functions + synchronization
- tests/test_workspace_locks.py: comprehensive test suite
The stream and timeout parameters were moved from **kwargs to explicit
parameters in a previous commit, but were not being passed to the OpenAI
API, causing streaming responses to fail and fall back to non-streaming
mode.Fixes the issue where stream=True was being silently ignored, resulting
in unexpected non-streaming behavior.
- Snapshot JSON data before yielding batches
- Release lock during batch processing
- Exclude source type from target selection
- Add detailed docstring for lock behavior
- Filter available storage types properly
- Add streaming migration for memory efficiency
- Implement graceful exit with Enter/0
- Add progress indicators for counting
- Optimize batch processing by storage type
- Update docs with new progress displays