Commit graph

70 commits

Author SHA1 Message Date
chengjie
5d31412bd7 feat: add workspace isolation support to unified lock functions
Why this change is needed:
The current locking system uses global locks shared across all users
and workspaces, causing blocking issues in multi-tenant scenarios.
When one tenant performs document indexing, all other tenants are
blocked waiting for the same global lock. This severely limits
the system's ability to serve multiple users concurrently.

How it solves it:
- Add optional `workspace` parameter to 5 lock functions
- Implement lazy creation of workspace-specific locks with proper synchronization
- Store workspace locks in new `_sync_locks` dictionary
- Support both multi-process (RLock) and single-process (asyncio.Lock) modes
- Empty workspace parameter uses global lock for backward compatibility
- Extract common logic into `_get_workspace_lock()` to eliminate duplication

Impact:
- Enables concurrent operations across different workspaces
- Foundation for PR2 (pipeline status isolation)
- Zero impact on existing code (all parameters optional with defaults)
- Each workspace now has independent lock instances
- Thread-safe lazy creation using _registry_guard in multiprocess mode
- Automatic creation of async_locks for workspace locks in multiprocess mode

Code Quality Improvements (Linus review feedback):
- Fixed race condition: lazy creation protected by _registry_guard
- Eliminated code duplication: common logic extracted to _get_workspace_lock()
- Added async_lock support: workspace locks now have companion async_locks
- Handles None workspace parameter gracefully
- Clear separation of concerns: one function handles all workspace logic

Testing:
- 17 new test cases covering:
  - Basic functionality and naming
  - Workspace isolation and independence
  - Backward compatibility with empty workspace
  - Concurrent operations (3 workspaces in parallel)
  - Performance (1000 workspace lock creation <2s)
  - Edge cases (special characters, unicode, long names)
- All existing tests pass (21/21 excluding env issues)
- Verified lock serialization within workspace
- Verified lock independence across workspaces

Files modified:
- lightrag/kg/shared_storage.py: refactored lock functions + synchronization
- tests/test_workspace_locks.py: comprehensive test suite
2025-11-10 22:51:49 +08:00
yangdx
d5bcd14c6f Refactor service deployment to use direct process execution
- Remove bash wrapper script
- Update systemd service configuration
- Improve process management for gunicorn
- Simplify shared storage cleanup logic
- Update documentation for deployment
2025-10-29 18:55:47 +08:00
yangdx
6489aaa7f0 Remove worker_exit hook and improve cleanup logging
• Remove unreliable worker_exit function
• Add debug logs for cleanup modes
• Move DEBUG_LOCKS to top of file
2025-10-29 15:15:13 +08:00
yangdx
72b29659c9 Fix worker process cleanup to prevent shared resource conflicts
• Add worker_exit hook in gunicorn config
• Add shutdown_manager parameter in finalize_share_data of share_storage
• Prevent Manager shutdown in workers
• Remove custom signal handlers
2025-10-29 13:33:21 +08:00
yangdx
083b163c1f Improve lock logging with consistent messaging and debug levels 2025-10-25 11:04:21 +08:00
yangdx
a9ec15e669 Resolve lock leakage issue during user cancellation handling
• Change default log level to INFO
• Force enable error logging output
• Add lock cleanup rollback protection
• Handle LLM cache persistence errors
• Fix async task exception handling
2025-10-25 03:06:45 +08:00
yangdx
059003c906 Rename allow_create to first_initialization for clarity 2025-08-23 02:34:39 +08:00
Albert Gil López
3fca3be09b fix: Fix server startup issue with PipelineNotInitializedError
- Add allow_create parameter to get_namespace_data() to permit internal initialization
- initialize_pipeline_status() now uses allow_create=True to create the namespace
- External calls still get the error if pipeline_status is not initialized
- This maintains the improved error messages while allowing proper server startup
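
A simplified, synchronous sketch of the behavior described above. The exception's base class, the `_shared_dicts` registry, and the `busy` field are assumptions for illustration; the real function's signature and storage may differ.

```python
class PipelineNotInitializedError(Exception):
    """Raised when pipeline_status is read before it has been initialized."""


# Hypothetical in-process registry of namespace dictionaries.
_shared_dicts = {}


def get_namespace_data(namespace, allow_create=False):
    """Return the dict for a namespace, creating it on first access where allowed."""
    if namespace not in _shared_dicts:
        # Only pipeline_status is guarded; other namespaces are created dynamically.
        if namespace == "pipeline_status" and not allow_create:
            raise PipelineNotInitializedError(
                "pipeline_status has not been initialized"
            )
        _shared_dicts[namespace] = {}
    return _shared_dicts[namespace]


def initialize_pipeline_status():
    """Internal initializer: the only caller that passes allow_create=True."""
    status = get_namespace_data("pipeline_status", allow_create=True)
    status.setdefault("busy", False)  # illustrative field, not from the source
```

External callers that skip initialization still get the explicit error, while server startup can create the namespace through the internal initializer.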

Fixes server startup failure reported in PR #1978
2025-08-22 10:55:56 +00:00
Albert Gil López
c66fc3483a fix: Implement PipelineNotInitializedError usage in get_namespace_data
- Add PipelineNotInitializedError import to shared_storage.py
- Raise PipelineNotInitializedError when accessing uninitialized pipeline_status namespace
- This provides clear error messages to users about initialization requirements
- Other namespaces continue to be created dynamically as before

Addresses review feedback from PR #1978 about unused exception class
2025-08-22 02:52:51 +00:00
zrguo
91d0f65476 Update QueryParam 2025-07-15 14:21:58 +08:00
yangdx
3da9f8aab4 Fix logging output condition in shared_storage.py. Early return if logging disabled 2025-07-15 13:38:05 +08:00
yangdx
85cd1178a1 fix: prevent premature lock cleanup in multiprocess mode
- Change cleanup condition from count == 1 to count == 0 to properly
remove reused locks from cleanup list
- Fix RuntimeError: Attempting to release lock for xxxx more times than it was acquired
2025-07-13 13:51:48 +08:00
yangdx
a2eeae9661 Fixes incorrect cleanup count 2025-07-13 02:38:36 +08:00
yangdx
582e952020 Disable direct logging by default for shared storage module 2025-07-13 01:58:50 +08:00
yangdx
cbf544b3c1 Remove redundant log message 2025-07-13 01:51:30 +08:00
yangdx
0e3aaa318f Feat: Add keyed lock cleanup and status monitoring 2025-07-13 00:09:00 +08:00
yangdx
2ade3067f8 Refac: Generalize keyed lock with namespace support
Refactored the `KeyedUnifiedLock` to be generic and support dynamic namespaces. This decouples the locking mechanism from a specific "GraphDB" implementation, allowing it to be reused across different components and workspaces safely.

Key changes:
- `KeyedUnifiedLock` now takes a `namespace` parameter on lock acquisition.
- Renamed `_graph_db_lock_keyed` to the more generic `_storage_keyed_lock`
- Replaced `get_graph_db_lock_keyed` with `get_storage_keyed_lock` to support namespaces
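
The idea of passing a namespace at acquisition time can be illustrated with a small single-process sketch. `KeyedLockSketch` and its `hold` method are hypothetical names and do not reproduce the `KeyedUnifiedLock` interface.

```python
import asyncio
from collections import defaultdict
from contextlib import asynccontextmanager


class KeyedLockSketch:
    """Hypothetical namespaced keyed lock for a single event loop."""

    def __init__(self):
        # Locks are created lazily, one per (namespace, key) pair.
        self._locks = defaultdict(asyncio.Lock)

    @asynccontextmanager
    async def hold(self, namespace, keys):
        acquired = []
        try:
            # Sorted, de-duplicated order keeps acquisition consistent across callers.
            for key in sorted(set(keys)):
                lock = self._locks[(namespace, key)]
                await lock.acquire()
                acquired.append(lock)
            yield
        finally:
            # Release only what was actually acquired, in reverse order.
            for lock in reversed(acquired):
                lock.release()


async def demo():
    keyed = KeyedLockSketch()
    async with keyed.hold("graph", ["node-1"]):
        # Same key, different namespace: does not block on the outer lock.
        async with keyed.hold("kv", ["node-1"]):
            return keyed._locks[("graph", "node-1")].locked()
```

Because the namespace is part of the lock identity, two components can use the same key string without contending with each other.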
2025-07-12 12:10:12 +08:00
yangdx
f2d875f8ab Update comments 2025-07-12 11:05:25 +08:00
yangdx
5ee509e671 Fix linting 2025-07-12 05:17:44 +08:00
yangdx
964293f21b Optimize lock cleanup with time tracking and intervals
- Add cleanup time tracking variables
- Implement minimum cleanup intervals
- Track earliest cleanup times
- Handle time rollback cases
- Improve cleanup logging
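
A sketch of interval-based throttling with a guard against the clock moving backwards. The class name, the 30-second default, and the reset-on-rollback policy are assumptions; the commit does not state how rollback is handled or what interval is used.

```python
import time

MIN_CLEANUP_INTERVAL = 30.0  # seconds; hypothetical value


class CleanupThrottle:
    """Decide whether enough time has passed to run another cleanup pass."""

    def __init__(self, min_interval=MIN_CLEANUP_INTERVAL, clock=time.time):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the logic can be tested
        self.last_cleanup = None    # timestamp of the last cleanup, if any

    def should_cleanup(self):
        now = self.clock()
        if self.last_cleanup is not None:
            elapsed = now - self.last_cleanup
            if elapsed < 0:
                # Clock went backwards: re-baseline instead of stalling cleanup
                # until the clock catches up with the old timestamp.
                self.last_cleanup = now
                return False
            if elapsed < self.min_interval:
                return False
        self.last_cleanup = now
        return True
```

Injecting the clock keeps the rollback branch easy to exercise without touching the system time.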
2025-07-12 04:34:26 +08:00
yangdx
39965d7ded Move merging stage back under control of the max parallel insert semaphore 2025-07-12 03:32:08 +08:00
yangdx
7490a18481 Optimize lock cleanup parameters 2025-07-12 03:10:03 +08:00
yangdx
3d8e6924bc Show lock clean up message 2025-07-12 02:58:05 +08:00
yangdx
22c36f2fd2 Optimize log messages 2025-07-12 02:41:31 +08:00
yangdx
a64c767298 optimize: improve lock cleanup performance with threshold-based strategy
- Add CLEANUP_THRESHOLD constant (100) to control cleanup frequency
- Modify _release_shared_raw_mp_lock to only scan when cleanup list exceeds threshold
- Modify _release_async_lock to only scan when cleanup list exceeds threshold
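
The threshold strategy can be sketched with a reference-counted registry. `LockRegistrySketch` and its fields are hypothetical; only the constant name and its value of 100 come from the commit message, and the real release functions track more state than this.

```python
CLEANUP_THRESHOLD = 100  # value taken from the commit message


class LockRegistrySketch:
    """Hypothetical registry that defers removal of unused lock entries."""

    def __init__(self, threshold=CLEANUP_THRESHOLD):
        self.threshold = threshold
        self.refcount = {}    # key -> number of current holders
        self.pending = set()  # keys with zero holders, awaiting removal

    def acquire(self, key):
        self.refcount[key] = self.refcount.get(key, 0) + 1
        # A reused key must leave the cleanup list, or it could be removed while held.
        self.pending.discard(key)

    def release(self, key):
        """Drop one holder; return how many entries were cleaned up (often 0)."""
        self.refcount[key] -= 1
        if self.refcount[key] == 0:
            self.pending.add(key)
        # Scan only once the cleanup list exceeds the threshold.
        if len(self.pending) > self.threshold:
            removed = len(self.pending)
            for stale in self.pending:
                del self.refcount[stale]
            self.pending.clear()
            return removed
        return 0
```

Most releases therefore cost a counter update and a set insert; the full scan runs only occasionally.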
2025-07-11 23:43:40 +08:00
yangdx
ad99d9ba5a Improve code organization and comments 2025-07-11 22:13:02 +08:00
yangdx
c52c451cf7 Fix linting 2025-07-11 20:40:50 +08:00
yangdx
3afdd1b67c Fix initial count error for multi-process lock with key 2025-07-11 20:39:08 +08:00
Arjun Rao
f8149790e4 Initial commit with keyed graph lock 2025-05-08 12:29:49 +10:00
yangdx
9f33ff2ecd Optimize log messages 2025-04-29 13:45:06 +08:00
yangdx
5f3e210246 Optimize log messages 2025-04-29 13:32:05 +08:00
yangdx
3aef63cc65 Optimize log info 2025-04-28 23:17:09 +08:00
yangdx
922fc914be Change empty pipeline job name 2025-03-26 17:48:00 +08:00
yangdx
15e060f854 Fix shared storage update status handling for in-memory storage 2025-03-25 10:48:15 +08:00
yangdx
53396e4d82 Fix linting 2025-03-21 16:56:47 +08:00
yangdx
20d65ae554 feat(shared_storage): prevent event loop blocking in multiprocess mode
Add auxiliary async locks in multiprocess mode to prevent event loop blocking
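
One way to picture the auxiliary lock: coroutines in the same process first queue on an `asyncio.Lock`, so only one of them ever attempts the blocking cross-process acquire. The class below is a hypothetical sketch, and a `threading.Lock` stands in for the multiprocess lock so the example runs in one process. The cross-process acquire itself is still a blocking call in this sketch.

```python
import asyncio
import threading


class GuardedBlockingLock:
    """Hypothetical wrapper: an asyncio.Lock in front of a blocking lock."""

    def __init__(self, blocking_lock, async_lock):
        self._blocking_lock = blocking_lock  # stand-in for a multiprocess lock
        self._async_lock = async_lock        # auxiliary per-process async lock

    async def __aenter__(self):
        # Coroutines in this process wait here cooperatively.
        await self._async_lock.acquire()
        try:
            # Only one coroutine per process reaches the blocking acquire.
            self._blocking_lock.acquire()
        except BaseException:
            self._async_lock.release()
            raise
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self._blocking_lock.release()
        self._async_lock.release()


async def demo():
    guard = GuardedBlockingLock(threading.Lock(), asyncio.Lock())
    order = []

    async def worker(name):
        async with guard:
            order.append(f"{name} in")
            await asyncio.sleep(0)  # yield while holding the lock
            order.append(f"{name} out")

    await asyncio.gather(worker("a"), worker("b"))
    return order
```

Without the auxiliary lock, the second worker would call the blocking `acquire()` while the first is suspended at its `await`, and the event loop would never get the chance to resume the first worker and release the lock.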
2025-03-21 16:08:23 +08:00
yangdx
5d64f3b0a0 Improved auto-scan task initialization and status tracking.
- Added autoscan status tracking in pipeline
- Ensured auto-scan runs only once per startup
2025-03-10 17:14:14 +08:00
yangdx
57a41eedb8 Fix linting 2025-03-10 15:41:46 +08:00
yangdx
46610682ce Fix data persistence issue in single-process mode
In single-process mode, data updates and persistence were not working properly because the update flags were not being correctly handled between different objects.
2025-03-10 15:41:00 +08:00
yangdx
4065a7df92 Fix linting 2025-03-10 02:07:19 +08:00
yangdx
d2708b966d Added update flag to avoid persistence if no data is changed for KV storage 2025-03-10 01:17:25 +08:00
yangdx
e47883d872 Add atomic data initialization lock to prevent race conditions 2025-03-09 17:33:15 +08:00
yangdx
90527875fd Fix async issues in namespace init 2025-03-09 15:22:06 +08:00
yangdx
c5d0962872 Fix linting 2025-03-09 01:00:42 +08:00
yangdx
95c06f1bde Add graph DB lock to shared storage system
• Introduced new graph_db_lock
• Added detailed lock debugging output
2025-03-08 22:36:41 +08:00
yangdx
7cd25fe5ab Improve shared storage cleanup and clarify initialization in multi-worker setup 2025-03-02 01:00:27 +08:00
yangdx
e3a40c2fdb Fix linting 2025-03-01 16:23:34 +08:00
yangdx
40e9e26edb feat: add update flags status to API health endpoint 2025-03-01 14:58:26 +08:00
yangdx
c07a5039b7 Refactor shared storage locks into separate pipeline, storage, and internal locks to prevent deadlocks 2025-03-01 10:48:55 +08:00
yangdx
d704512139 Refactor shared storage module to improve async handling and naming consistency
• Add async support for get_namespace_data
• Rename get_update_flags to get_update_flag
• Rename set_update_flag to set_all_update_flags
• Update docstrings for clarity
• Fix typos in log messages
2025-03-01 05:01:26 +08:00