LightRAG

Author	SHA1	Message	Date
BukeLy	e24b2ed4fa	fix: Prioritize workspace-specific legacy collections in Qdrant migration Why this change is needed: The E2E test test_backward_compat_old_workspace_naming_qdrant was failing because _find_legacy_collection() searched for generic "lightrag_vdb_{namespace}" before workspace-specific "{workspace}_{namespace}" collections. When both existed, it would always find the generic one first (which might be empty), ignoring the workspace collection that actually contained the data to migrate. How it solves it: Reordered the candidates list in _find_legacy_collection() to prioritize more specific naming patterns over generic ones: 1. {workspace}_{namespace} (most specific, old workspace format) 2. lightrag_vdb_{namespace} (generic legacy format) 3. {namespace} (most generic, oldest format) This ensures the migration finds the correct source collection with actual data. Impact: - Fixes test_backward_compat_old_workspace_naming_qdrant which creates a "prod_chunks" collection with 10 points - Migration will now correctly find and migrate from workspace-specific legacy collections before falling back to generic collections - Maintains backward compatibility with all legacy naming patterns Testing: Run: pytest tests/test_e2e_multi_instance.py::test_backward_compat_old_workspace_naming_qdrant -v	2025-11-20 02:34:55 +08:00
BukeLy	42df825d30	fix: handle empty model_suffix in Qdrant collection naming This change ensures that when the model_suffix is empty, the final_namespace falls back to the legacy_namespace, preventing potential naming issues. A warning is logged to inform users about the missing model suffix and the fallback to the legacy naming scheme. Additionally, comprehensive tests have been added to verify the behavior of both PostgreSQL and Qdrant storage when model_suffix is empty, ensuring that the naming conventions are correctly applied and that no trailing underscores are present. Impact: - Prevents crashes due to empty model_suffix - Provides clear feedback to users regarding configuration issues - Maintains backward compatibility with existing setups Testing: All new tests pass, validating the handling of empty model_suffix scenarios.	2025-11-20 01:55:20 +08:00
BukeLy	6bef40766d	style: fix lint errors (trailing whitespace and formatting)	2025-11-20 01:41:23 +08:00
BukeLy	088b986ac6	style: fix lint issues (trailing whitespace and formatting)	2025-11-20 01:28:39 +08:00
BukeLy	5d9547344a	fix: correct Qdrant legacy_namespace for data migration Why this change is needed: The legacy_namespace logic was incorrectly including workspace in the collection name, causing migration to fail in E2E tests. When workspace was set (e.g., to a temp directory path), legacy_namespace became "/tmp/xxx_chunks" instead of "lightrag_vdb_chunks", so the migration logic couldn't find the legacy collection. How it solves it: Changed legacy_namespace to always use the old naming scheme without workspace prefix: "lightrag_vdb_{namespace}". This matches the actual collection names from pre-migration code and aligns with PostgreSQL's approach where legacy_table_name = base_table (without workspace). Impact: - Qdrant legacy data migration now works correctly in E2E tests - All unit tests pass (6/6 for both Qdrant and PostgreSQL) - E2E test_legacy_migration_qdrant should now pass Testing: - Unit tests: pytest tests/test_qdrant_migration.py -v (6/6 passed) - Unit tests: pytest tests/test_postgres_migration.py -v (6/6 passed) - Updated test_qdrant_collection_naming to verify new legacy_namespace	2025-11-20 01:08:15 +08:00
BukeLy	df5aacb545	feat: Qdrant model isolation and auto-migration Why this change is needed: To implement vector storage model isolation for Qdrant, allowing different workspaces to use different embedding models without conflict, and automatically migrating existing data. How it solves it: - Modified QdrantVectorDBStorage to use model-specific collection suffixes - Implemented automated migration logic from legacy collections to new schema - Fixed Shared-Data lock re-entrancy issue in multiprocess mode - Added comprehensive tests for collection naming and migration triggers Impact: - Existing users will have data automatically migrated on next startup - New workspaces will use isolated collections based on embedding model - Fixes potential lock-related bugs in shared storage Testing: - Added tests/test_qdrant_migration.py passing - Verified migration logic covers all 4 states (New/Legacy existence combinations)	2025-11-19 18:47:38 +08:00
yangdx	926960e957	Refactor workspace handling to use default workspace and namespace locks - Remove DB-specific workspace configs - Add default workspace auto-setting - Replace global locks with namespace locks - Simplify pipeline status management - Remove redundant graph DB locking	2025-11-17 12:54:33 +08:00
yangdx	5f4a280458	Add Qdrant legacy collection migration with workspace support - Add QdrantMigrationError exception - Implement automatic data migration - Support workspace-based partitioning - Add migration verification logic - Update collection naming scheme	2025-10-30 19:16:33 +08:00
Anush008	8584980e3a	refactor: Qdrant Multi-tenancy (Include staged) Signed-off-by: Anush008 <anushshetty90@gmail.com>	2025-10-26 09:58:24 +05:30
yangdx	9be22dd666	Preserve ordering in get_by_ids methods across all storage implementations - Fix result ordering in vector stores - Update KV storage get_by_ids methods - Maintain order in doc status storage - Return None for missing IDs	2025-10-11 12:37:59 +08:00
yangdx	43f6fcea6c	Fix linting	2025-09-12 17:00:53 +08:00
luxiang	fb4166ba2a	chore: compatible with qdrant v1.7.3	2025-09-10 20:07:49 +08:00
yangdx	03d0fa3014	perf: add optional query_embedding parameter to avoid redundant embedding calls	2025-08-29 18:15:45 +08:00
yangdx	a923d378dd	Remove deprecated ID-based filtering from vector storage queries - Remove ids param from QueryParam - Simplify BaseVectorStorage.query signature - Update all vector storage implementations - Streamline PostgreSQL query templates - Remove ID filtering from operate.py calls	2025-08-29 17:06:48 +08:00
yangdx	8f7031b882	Add get_vectors_by_ids method to QdrantVectorDBStorage	2025-08-15 16:46:52 +08:00
yangdx	0b22ffb252	Refac: uniformly protected with the get_data_init_lock for all storage initializations	2025-08-14 03:46:19 +08:00
yangdx	5d1bc8b49d	Relocate client creation to the initialize method to prevent race conditions in multi-process mode.	2025-08-12 18:20:56 +08:00
yangdx	74783d7781	Remove redundant debug logging for Qdrant operations	2025-08-12 17:29:05 +08:00
yangdx	fc8ca1a706	Fix: add multi-process lock for initialize and drop method for all storage	2025-08-12 04:25:09 +08:00
yangdx	095e0cbfa2	Refac: Add workspace information to all logger output for all storage type	2025-08-12 01:19:09 +08:00
yangdx	033098c1bc	Feat: Add WORKSPACE support to all storage types	2025-07-07 00:57:21 +08:00
yangdx	271722405f	feat: Flatten LLM cache structure for improved recall efficiency Refactored the LLM cache to a flat Key-Value (KV) structure, replacing the previous nested format. The old structure used the 'mode' as a key and stored specific cache content as JSON nested under it. This change significantly enhances cache recall efficiency.	2025-07-02 16:11:53 +08:00
yangdx	045993f7d2	Remove deprecated search_by_prefix	2025-05-03 11:17:49 +08:00
yangdx	08e8a7ead1	Fix linting	2025-05-03 00:46:28 +08:00
yangdx	6021796a61	Fix created_at problem for Qdrant vector db	2025-05-02 16:38:35 +08:00
yangdx	ca63386546	Increase embedding priority for query request	2025-04-28 20:10:39 +08:00
yangdx	95a8ee27ed	Fix linting	2025-03-31 23:22:27 +08:00
yangdx	1df4b777d7	Add drop functions to storage implementations	2025-03-30 15:17:57 +08:00
Roy	8aa9d0e6ca	Add optional ids filter to vector database query methods - Updated query method signatures across multiple vector database implementations - Added optional `ids` parameter to filter search results - Consistent implementation across ChromaDB, Faiss, Milvus, MongoDB, NanoVectorDB, Oracle, Qdrant, and TiDB vector storage classes	2025-03-11 15:22:17 +00:00
zrguo	da59cc89d8	fix linting	2025-03-09 00:51:14 +08:00
dixyes	458eafd714	Fix qdrant payload id Qdrant now is using PointStruct.payload["id"], not PointStruct.id UUID. This will fix id overwrite	2025-03-08 16:40:40 +08:00
zrguo	e822f35c89	Fix edit entity and relation bugs	2025-03-07 14:39:06 +08:00
zrguo	81568f3bad	fix linting	2025-03-04 15:53:20 +08:00
zrguo	3a2a636862	Implement the missing methods.	2025-03-04 15:50:53 +08:00
Yannick Stephan	48a1ad9b3b	Merge pull request #883 from YanSte/fix-return-none Optimised returns	2025-02-19 22:24:50 +01:00
Yannick Stephan	9277fe8c29	fixed return	2025-02-19 22:22:41 +01:00
Saifeddine ALOUI	473e52a095	Update qdrant_impl.py	2025-02-19 19:51:39 +01:00
Yannick Stephan	2524e02428	remove tqdm and cleaned readme and ollama	2025-02-18 19:58:03 +01:00
Yannick Stephan	2b2c81a722	added some comments	2025-02-16 16:04:07 +01:00
Yannick Stephan	0e7aff96bb	back to not making breaks	2025-02-16 15:08:50 +01:00
Yannick Stephan	a0844bca28	cleaned import	2025-02-16 14:45:45 +01:00
Yannick Stephan	3fef8201c6	added final, required methods and cleaned import	2025-02-16 14:38:09 +01:00
Yannick Stephan	931c31fa8c	cleaned code	2025-02-16 13:55:30 +01:00
Yannick Stephan	3eba41aab6	updated clean of what implemented on BaseVectorStorage	2025-02-16 13:24:42 +01:00
ArnoChen	cac1c993a9	remove redundant cosine similarity filter in Qdrant query fix	2025-02-14 03:16:01 +08:00
ArnoChen	9a91b68e62	fix configuration errors of mongodb, neo4j, and qdrant backends.	2025-02-14 02:48:15 +08:00
yangdx	7017f114e1	Merge branch 'main' into select-datastore-in-api-server	2025-02-13 11:25:52 +08:00
yangdx	ed73ea4076	Fix linting	2025-02-13 04:12:00 +08:00
yangdx	f01f57d0da	refactor: make cosine similarity threshold a required config parameter • Remove default threshold from env var • Add validation for missing threshold • Move default to lightrag.py config init • Update all vector DB implementations • Improve threshold validation consistency	2025-02-13 03:25:48 +08:00
ArnoChen	9daab4340c	add MongoDocStatusStorage remove unnecessary logging format	2025-02-12 04:13:48 +08:00
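
Note on commits e24b2ed4fa and 42df825d30: the naming rules those messages describe can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in qdrant_impl.py. The helper names (legacy_candidates, find_legacy_collection, final_namespace), the `exists` callback, the "{legacy_namespace}_{model_suffix}" join format, and the sample suffix are all hypothetical. Only the candidate order and the empty-suffix fallback come from the commit messages.

```python
from typing import Callable, List, Optional


def legacy_candidates(workspace: Optional[str], namespace: str) -> List[str]:
    """Hypothetical helper: legacy collection names, most specific first."""
    candidates: List[str] = []
    if workspace:
        # 1. Old workspace format, e.g. "prod_chunks"
        candidates.append(f"{workspace}_{namespace}")
    # 2. Generic legacy format
    candidates.append(f"lightrag_vdb_{namespace}")
    # 3. Oldest, most generic format
    candidates.append(namespace)
    return candidates


def find_legacy_collection(
    exists: Callable[[str], bool], workspace: Optional[str], namespace: str
) -> Optional[str]:
    """Return the first candidate that exists, or None if none does."""
    for name in legacy_candidates(workspace, namespace):
        if exists(name):
            return name
    return None


def final_namespace(legacy_namespace: str, model_suffix: str) -> str:
    """Fall back to the legacy name when the model suffix is empty.

    The underscore join is an assumption; the commit only states that the
    fallback avoids a trailing underscore.
    """
    if not model_suffix:
        return legacy_namespace
    return f"{legacy_namespace}_{model_suffix}"
```

With both "prod_chunks" and "lightrag_vdb_chunks" present, find_legacy_collection(existing.__contains__, "prod", "chunks") picks "prod_chunks", which matches the scenario the E2E test in e24b2ed4fa sets up.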


53 commits