LightRAG/lightrag
BukeLy 16fff353d9 fix: prevent data loss in PostgreSQL migration and add doc_status table creation
This commit fixes two critical issues in PostgreSQL storage:

BUG 1: Legacy table cleanup causing data loss across workspaces
---------------------------------------------------------------
PROBLEM:
- After workspace_a's data was migrated out of the legacy table, the ENTIRE
  legacy table was dropped
- workspace_b's data, which was still in the legacy table, was lost as a result
- This violated multi-tenant data isolation

FIX:
- Implement workspace-aware cleanup: delete only the migrated workspace's rows
- Check whether other workspaces still have data before dropping the table
- Drop the legacy table only when it is completely empty
- If other workspaces still have data, keep the legacy table and its remaining records

Location: postgres_impl.py PGVectorStorage.setup_table() lines 2510-2567

Test verification:
- test_workspace_migration_isolation_e2e_postgres validates this fix
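
The workspace-aware cleanup described in the FIX for BUG 1 can be sketched as
follows. This is a minimal illustration, not the code in postgres_impl.py: it
uses an in-memory SQLite database as a stand-in for PostgreSQL, and the names
cleanup_legacy_table, count_rows, and legacy_vectors are hypothetical.

```python
import sqlite3


def count_rows(conn, table):
    # fetchall() exhausts the cursor so no open statement blocks a later DROP.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchall()[0][0]


def cleanup_legacy_table(conn, legacy_table, workspace):
    """Delete only the migrated workspace's rows; drop the table only if empty.

    Returns True if the legacy table was dropped, False if it was preserved.
    The table name is interpolated into SQL, so it must come from trusted code.
    """
    conn.execute(f"DELETE FROM {legacy_table} WHERE workspace = ?", (workspace,))
    if count_rows(conn, legacy_table) == 0:
        conn.execute(f"DROP TABLE {legacy_table}")
        return True
    # Other workspaces still have data here: keep the table.
    return False


# Hypothetical legacy table shared by two workspaces.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE legacy_vectors (id TEXT, workspace TEXT)")
conn.executemany(
    "INSERT INTO legacy_vectors VALUES (?, ?)",
    [("1", "workspace_a"), ("2", "workspace_b")],
)

dropped_a = cleanup_legacy_table(conn, "legacy_vectors", "workspace_a")
remaining_b = count_rows(conn, "legacy_vectors")
dropped_b = cleanup_legacy_table(conn, "legacy_vectors", "workspace_b")
table_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE name = 'legacy_vectors'"
).fetchall()
```

Cleaning up workspace_a leaves workspace_b's row in place, and the table is
dropped only once the last workspace has been migrated.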

BUG 2: PGDocStatusStorage missing table initialization
-------------------------------------------------------
PROBLEM:
- PGDocStatusStorage.initialize() only set the workspace and never created the table
- This caused "relation 'lightrag_doc_status' does not exist" errors
- Document insertion (ainsert) failed immediately

FIX:
- Add table creation to initialize() method using _pg_create_table()
- Consistent with other storage implementations:
  * MongoDocStatusStorage creates collections
  * JsonDocStatusStorage creates directories
  * PGDocStatusStorage now creates tables ✓

Location: postgres_impl.py PGDocStatusStorage.initialize() lines 2965-2971
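
The pattern behind the BUG 2 fix, creating the table inside initialize() so
that the first write cannot hit a missing relation, might look roughly like
this. It is a synchronous sketch against in-memory SQLite, not the async
PostgreSQL implementation; the DocStatusStore class, its upsert() method, and
the doc_status schema are assumptions made for illustration only.

```python
import sqlite3


class DocStatusStore:
    """Hypothetical stand-in for a doc-status storage backend."""

    def __init__(self, conn, workspace):
        self.conn = conn
        self.workspace = workspace

    def initialize(self):
        # IF NOT EXISTS makes this idempotent, so it is safe on every startup.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS doc_status ("
            "workspace TEXT NOT NULL, id TEXT NOT NULL, status TEXT, "
            "PRIMARY KEY (workspace, id))"
        )

    def upsert(self, doc_id, status):
        # Would fail with "no such table" if initialize() had not run.
        self.conn.execute(
            "INSERT OR REPLACE INTO doc_status (workspace, id, status) "
            "VALUES (?, ?, ?)",
            (self.workspace, doc_id, status),
        )


conn = sqlite3.connect(":memory:")
store = DocStatusStore(conn, "workspace_a")
store.initialize()
store.initialize()  # second call is a no-op
store.upsert("doc-1", "pending")
rows = conn.execute("SELECT workspace, id, status FROM doc_status").fetchall()
```

Making table creation idempotent keeps initialize() safe to call from every
process that opens the storage.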

Test Results:
- Unit tests: 13/13 passed (test_unified_lock_safety,
  test_workspace_migration_isolation, test_dimension_mismatch)
- E2E tests require PostgreSQL server

Related: PR #2391 (Vector Storage Model Isolation)
2025-11-23 16:43:49 +08:00
api Bump API version to 0256 2025-11-18 23:15:31 +08:00
evaluation Update LLM cache migration docs and improve UX prompts 2025-11-08 23:48:19 +08:00
kg fix: prevent data loss in PostgreSQL migration and add doc_status table creation 2025-11-23 16:43:49 +08:00
llm Improve Bedrock error handling with retry logic and custom exceptions 2025-11-17 12:54:32 +08:00
tools Improve LightRAG initialization checker tool with better usage docs 2025-11-17 15:42:54 +08:00
__init__.py Bump core version to 1.4.9.9 and API to 0252 2025-11-08 11:27:26 +08:00
base.py style: fix lint issues (trailing whitespace and formatting) 2025-11-20 01:28:39 +08:00
constants.py Refactor entity merging with unified attribute merge function 2025-10-27 00:04:17 +08:00
exceptions.py Auto-initialize pipeline status in LightRAG.initialize_storages() 2025-11-17 12:54:33 +08:00
lightrag.py style: fix lint issues (trailing whitespace and formatting) 2025-11-20 01:28:39 +08:00
namespace.py Add entity/relation chunk tracking with configurable source ID limits 2025-10-20 15:24:15 +08:00
operate.py Adjust chunking parameters to match the default environment variable settings 2025-11-18 23:14:50 +08:00
prompt.py Fix typo in 'equipment' in prompt.py 2025-10-22 11:13:22 +08:00
rerank.py fix: Resolve default rerank config problem when env var missing 2025-08-23 01:07:59 +08:00
types.py
utils.py style: fix lint issues (trailing whitespace and formatting) 2025-11-20 01:28:39 +08:00
utils_graph.py Improve entity merge logging by removing redundant message and fixing typo 2025-10-31 17:16:59 +08:00