Why this change is needed:
The current locking system uses global locks shared across all users
and workspaces, causing blocking issues in multi-tenant scenarios.
When one tenant performs document indexing, all other tenants are
blocked waiting for the same global lock. This severely limits
the system's ability to serve multiple users concurrently.
How it solves it:
- Add optional `workspace` parameter to 5 lock functions
- Implement lazy creation of workspace-specific locks with proper synchronization
- Store workspace locks in new `_sync_locks` dictionary
- Support both multi-process (RLock) and single-process (asyncio.Lock) modes
- Empty workspace parameter uses global lock for backward compatibility
- Extract common logic into `_get_workspace_lock()` to eliminate duplication
Impact:
- Enables concurrent operations across different workspaces
- Foundation for PR2 (pipeline status isolation)
- Zero impact on existing code (all parameters optional with defaults)
- Each workspace now has independent lock instances
- Thread-safe lazy creation using _registry_guard in multiprocess mode
- Automatic creation of async_locks for workspace locks in multiprocess mode
Code Quality Improvements (Linus review feedback):
- Fixed race condition: lazy creation protected by _registry_guard
- Eliminated code duplication: common logic extracted to _get_workspace_lock()
- Added async_lock support: workspace locks now have companion async_locks
- Handles None workspace parameter gracefully
- Clear separation of concerns: one function handles all workspace logic
Testing:
- 17 new test cases covering:
- Basic functionality and naming
- Workspace isolation and independence
- Backward compatibility with empty workspace
- Concurrent operations (3 workspaces in parallel)
- Performance (1000 workspace lock creation <2s)
- Edge cases (special characters, unicode, long names)
- All existing tests pass (21/21 excluding env issues)
- Verified lock serialization within workspace
- Verified lock independence across workspaces
Files modified:
- lightrag/kg/shared_storage.py: refactored lock functions + synchronization
- tests/test_workspace_locks.py: comprehensive test suite
- Add allow_create parameter to get_namespace_data() to permit internal initialization
- initialize_pipeline_status() now uses allow_create=True to create the namespace
- External calls still get the error if pipeline_status is not initialized
- This maintains the improved error messages while allowing proper server startup
Fixes server startup failure reported in PR #1978
- Add PipelineNotInitializedError import to shared_storage.py
- Raise PipelineNotInitializedError when accessing uninitialized pipeline_status namespace
- This provides clear error messages to users about initialization requirements
- Other namespaces continue to be created dynamically as before
Addresses review feedback from PR #1978 about unused exception class
- Change cleanup condition from count == 1 to count == 0 to properly
remove reused locks from cleanup list
- Fix RuntimeError: Attempting to release lock for xxxx more times than it was acquired
Refactored the `KeyedUnifiedLock` to be generic and support dynamic namespaces. This decouples the locking mechanism from a specific "GraphDB" implementation, allowing it to be reused across different components and workspaces safely.
Key changes:
- `KeyedUnifiedLock` now takes a `namespace` parameter on lock acquisition.
- Renamed `_graph_db_lock_keyed` to a more generic _storage_keyed_lock`
- Replaced `get_graph_db_lock_keyed` with get_storage_keyed_lock` to support namespaces
- Add CLEANUP_THRESHOLD constant (100) to control cleanup frequency
- Modify _release_shared_raw_mp_lock to only scan when cleanup list exceeds threshold
- Modify _release_async_lock to only scan when cleanup list exceeds threshold
In single-process mode, data updates and persistence were not working properly because the update flags were not being correctly handled between different objects.
• Add async support for get_namespace_data
• Rename get_update_flags to get_update_flag
• Rename set_update_flag to set_all_update_flags
• Update docstrings for clarity
• Fix typos in log messages