LightRAG/tests
BukeLy f69cf9bcd6 fix: prevent vector dimension mismatch crashes and data loss on no-suffix restarts
Why this change is needed:
Two critical issues were identified in Codex review of PR #2391:
1. Migration fails when legacy collections/tables use different embedding dimensions
   (e.g., upgrading from 1536d to 3072d models causes initialization failures)
2. When model_suffix is empty (no model_name provided), table_name equals legacy_table_name,
   causing Case 1 logic to delete the only table/collection on second startup
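
The second issue can be illustrated with a small sketch. The helper `build_table_name` and the underscore-joined suffix format are hypothetical and are not taken from the LightRAG source; the only point is that an empty suffix makes the new name identical to the legacy name, so any logic that treats the legacy table as obsolete would target the only table there is.

```python
def build_table_name(legacy_name: str, model_suffix: str) -> str:
    """Hypothetical naming helper: append the model suffix when one exists."""
    return f"{legacy_name}_{model_suffix}" if model_suffix else legacy_name


def names_collide(legacy_name: str, model_suffix: str) -> bool:
    """Return True when the new name would be the same as the legacy name."""
    return build_table_name(legacy_name, model_suffix) == legacy_name


# With no model name there is no suffix, so both names refer to one table.
print(names_collide("vdb_chunks", ""))
# With a suffix the names differ and the legacy table is a distinct object.
print(names_collide("vdb_chunks", "some_model_3072d"))
```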

How it solves it:
- Added dimension compatibility checks before migration in both Qdrant and PostgreSQL
- PostgreSQL uses two-method detection: pg_attribute metadata query + vector sampling fallback
- When dimensions mismatch, skip migration and create new empty table/collection, preserving legacy data
- Added safety check to detect when new and legacy names are identical, preventing deletion
- Both backends log clear warnings about dimension mismatches and skipped migrations
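
The decision flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in qdrant_impl.py or postgres_impl.py: `MigrationAction`, `detect_dimension`, and `plan_migration` are hypothetical names, the two detection callables stand in for the pg_attribute query and the vector-sampling fallback, and treating an undetectable dimension as compatible is an assumption of this sketch rather than documented behavior.

```python
from enum import Enum
from typing import Callable, Optional


class MigrationAction(Enum):
    USE_EXISTING = "use_existing"  # names are identical: touch nothing
    MIGRATE = "migrate"            # copy legacy data into the new table
    CREATE_EMPTY = "create_empty"  # new empty table, legacy data left in place


def detect_dimension(
    from_metadata: Callable[[], Optional[int]],
    from_sample: Callable[[], Optional[int]],
) -> Optional[int]:
    """Try the metadata lookup first, then fall back to sampling a vector."""
    for probe in (from_metadata, from_sample):
        try:
            dim = probe()
        except Exception:
            dim = None  # a failed probe is treated as "unknown"
        if dim is not None:
            return dim
    return None


def plan_migration(
    new_name: str,
    legacy_name: str,
    legacy_exists: bool,
    legacy_dim: Optional[int],
    new_dim: int,
) -> MigrationAction:
    # Safety check: identical names mean there is nothing to migrate or delete.
    if new_name == legacy_name:
        return MigrationAction.USE_EXISTING
    if not legacy_exists:
        return MigrationAction.CREATE_EMPTY
    # Mismatched dimensions: skip migration and preserve the legacy data.
    if legacy_dim is not None and legacy_dim != new_dim:
        return MigrationAction.CREATE_EMPTY
    # Assumption of this sketch: an unknown dimension does not block migration.
    return MigrationAction.MIGRATE
```

A caller would first obtain `legacy_dim` from `detect_dimension` and log a warning whenever the plan is `CREATE_EMPTY` because of a mismatch, so that the skipped migration is visible to the operator.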

Impact:
- lightrag/kg/qdrant_impl.py: Added dimension check (lines 254-297) and no-suffix safety (lines 163-169)
- lightrag/kg/postgres_impl.py: Added dimension check with fallback (lines 2347-2410) and no-suffix safety (lines 2281-2287)
- tests/test_no_model_suffix_safety.py: New test file with 4 test cases covering edge scenarios
- Backward compatible: All existing scenarios continue working unchanged

Testing:
- All 20 tests pass (16 existing migration tests + 4 new safety tests)
- E2E tests enhanced with explicit verification points for dimension mismatch scenarios
- Verified graceful degradation when dimension detection fails
- Code style verified with ruff and pre-commit hooks
2025-11-23 15:44:07 +08:00
conftest.py Add GitHub CI workflow and test markers for offline/integration tests 2025-11-18 11:36:10 +08:00
README_WORKSPACE_ISOLATION_TESTS.md Fix linting 2025-11-18 08:07:54 +08:00
test_aquery_data_endpoint.py Add GitHub CI workflow and test markers for offline/integration tests 2025-11-18 11:36:10 +08:00
test_base_storage_integrity.py style: fix lint errors (trailing whitespace and formatting) 2025-11-20 01:41:23 +08:00
test_curl_aquery_data.sh Fix linting 2025-10-06 04:57:11 +08:00
test_dimension_mismatch.py style: fix lint errors in test files 2025-11-20 12:24:53 +08:00
test_e2e_multi_instance.py style: apply ruff formatting fixes to test_e2e_multi_instance.py 2025-11-20 12:31:08 +08:00
test_embedding_func.py style: fix lint errors (trailing whitespace and formatting) 2025-11-20 01:41:23 +08:00
test_graph_storage.py Add GitHub CI workflow and test markers for offline/integration tests 2025-11-18 11:36:10 +08:00
test_lightrag_ollama_chat.py Rename test classes to prevent warning from pytest 2025-11-18 13:33:05 +08:00
test_no_model_suffix_safety.py fix: prevent vector dimension mismatch crashes and data loss on no-suffix restarts 2025-11-23 15:44:07 +08:00
test_postgres_migration.py refactor: unify PostgreSQL and Qdrant migration logic for consistency 2025-11-20 11:37:59 +08:00
test_postgres_retry_integration.py Add GitHub CI workflow and test markers for offline/integration tests 2025-11-18 11:36:10 +08:00
test_qdrant_migration.py feat: implement dimension compatibility checks for PostgreSQL and Qdrant migrations 2025-11-20 12:22:13 +08:00
test_workspace_isolation.py Fix test to use default workspace parameter behavior 2025-11-18 11:51:17 +08:00
test_write_json_optimization.py Add GitHub CI workflow and test markers for offline/integration tests 2025-11-18 11:36:10 +08:00