LightRAG

Author	SHA1	Message	Date
BukeLy	df5aacb545	feat: Qdrant model isolation and auto-migration Why this change is needed: To implement vector storage model isolation for Qdrant, allowing different workspaces to use different embedding models without conflict, and automatically migrating existing data. How it solves it: - Modified QdrantVectorDBStorage to use model-specific collection suffixes - Implemented automated migration logic from legacy collections to new schema - Fixed Shared-Data lock re-entrancy issue in multiprocess mode - Added comprehensive tests for collection naming and migration triggers Impact: - Existing users will have data automatically migrated on next startup - New workspaces will use isolated collections based on embedding model - Fixes potential lock-related bugs in shared storage Testing: - Added tests/test_qdrant_migration.py passing - Verified migration logic covers all 4 states (New/Legacy existence combinations)	2025-11-19 18:47:38 +08:00
BukeLy	13f2440bbf	feat: enhance BaseVectorStorage for model isolation Why this change is needed: To enforce consistent naming and migration strategy across all vector storages. How it solves it: - Added _generate_collection_suffix() helper - Added _get_legacy_collection_name() and _get_new_collection_name() interfaces Impact: Prepares storage implementations for multi-model support. Testing: Added tests/test_base_storage_integrity.py passing.	2025-11-19 02:15:22 +08:00
BukeLy	5c10d3d58e	feat: enhance EmbeddingFunc with model_name support Why this change is needed: To support vector storage model isolation, we need to track which model is used for embeddings and generate unique identifiers for collections/tables. How it solves it: - Added model_name field to EmbeddingFunc - Added get_model_identifier() method to generate sanitized suffix - Added unit tests to verify behavior Impact: Enables subsequent changes in storage backends to isolate data by model. Testing: Added tests/test_embedding_func.py passing.	2025-11-19 02:11:39 +08:00
yangdx	d16c7840ab	Bump API version to 0256	2025-11-18 23:15:31 +08:00
yangdx	e77340d4a1	Adjust chunking parameters to match the default environment variable settings	2025-11-18 23:14:50 +08:00
yangdx	1bfa1f81cb	Merge branch 'main' into fix_chunk_comment	2025-11-18 22:38:50 +08:00
yangdx	9c10c87554	Fix linting	2025-11-18 22:38:43 +08:00
yangdx	dbae327a17	Merge branch 'main' into dev-postgres-vchordrq	2025-11-18 22:13:27 +08:00
yangdx	3096f844fb	fix(postgres): allow vchordrq.epsilon config when probes is empty Previously, configure_vchordrq would fail silently when probes was empty (the default), preventing epsilon from being configured. Now each parameter is handled independently with conditional execution, and configuration errors fail-fast instead of being swallowed. This fixes the documented epsilon setting being impossible to use in the default configuration.	2025-11-18 21:58:36 +08:00
EightyOliveira	dacca334e0	refactor(chunking): rename params and improve docstring for chunking_by_token_size	2025-11-18 15:46:28 +08:00
yangdx	702cfd2981	Fix document deletion concurrency control and validation logic • Clarify job naming for single vs batch deletion • Update job name validation in busy pipeline check	2025-11-18 13:59:24 +08:00
yangdx	4048fc4b89	Fix: auto-acquire pipeline when idle in document deletion • Track if we acquired the pipeline lock • Auto-acquire pipeline when idle • Only release if we acquired it • Prevent concurrent deletion conflicts • Improve deletion job validation	2025-11-18 13:25:13 +08:00
yangdx	1745b30a5f	Fix missing workspace parameter in update flags status call	2025-11-18 12:55:48 +08:00
yangdx	f8dd2e0724	Fix namespace parsing when workspace contains colons • Use rsplit instead of split • Handle colons in workspace names	2025-11-18 12:23:05 +08:00
wmsnp	d07023c962	feat(postgres_impl): add vchordrq vector index support and unify vector index creation logic	2025-11-18 11:45:16 +08:00
yangdx	6cef8df159	Reduce log level and improve workspace mismatch message clarity • Change warning to info level • Simplify workspace mismatch wording	2025-11-18 08:25:21 +08:00
yangdx	ddc76f0c80	Merge branch 'main' into workspace-isolation	2025-11-17 17:08:07 +08:00
yangdx	9262f66d13	Bump API version to 0255	2025-11-17 17:07:18 +08:00
yangdx	393f880311	Improve LightRAG initialization checker tool with better usage docs • Add workspace param to get_namespace_data • Update docstring with proper usage example • Simplify demo to show correct workflow • Remove confusing before/after comparison • Clarify tool should run after init	2025-11-17 15:42:54 +08:00
yangdx	9d7b7981ce	Add pipeline status validation before document deletion	2025-11-17 14:58:10 +08:00
yangdx	98e964dfc4	Fix initialization instructions in check_lightrag_setup function	2025-11-17 14:27:26 +08:00
yangdx	6d6716e9f8	Add _default_workspace to shared storage finalization - Add _default_workspace to global vars - Set _default_workspace to None on cleanup - Ensure complete resource cleanup - Fix missing workspace finalization	2025-11-17 13:46:46 +08:00
yangdx	f1d8f18c80	Merge branch 'main' into workspace-isolation	2025-11-17 13:01:33 +08:00
yangdx	cdd53ee875	Remove manual initialize_pipeline_status() calls across codebase - Auto-init pipeline status in storages - Remove redundant import statements - Simplify initialization pattern - Update docs and examples	2025-11-17 12:54:33 +08:00
yangdx	e22ac52ebc	Auto-initialize pipeline status in LightRAG.initialize_storages() • Remove manual initialize_pipeline_status calls • Auto-init in initialize_storages method • Update error messages for clarity • Warn on workspace conflicts	2025-11-17 12:54:33 +08:00
yangdx	e8383df3b8	Fix NamespaceLock context variable timing to prevent lock bricking * Acquire lock before setting ContextVar * Prevent state corruption on cancellation * Fix permanent lock brick scenario * Store context only after success * Handle acquisition failure properly	2025-11-17 12:54:33 +08:00
yangdx	95e1fb1612	Remove final_namespace attribute for in-memory storage and use namespace in clean_llm_query_cache.py	2025-11-17 12:54:33 +08:00
yangdx	7ed0eac4c9	Fix workspace filtering logic in get_all_update_flags_status • Handle namespaces with/without prefixes • Fix workspace matching logic	2025-11-17 12:54:33 +08:00
yangdx	78689e8837	Fix pipeline status namespace check to handle root case - Add check for bare "pipeline_status" - Handle namespace without prefix	2025-11-17 12:54:33 +08:00
yangdx	d54d0d55d9	Standardize empty workspace handling from "_" to "" across storage * Unify empty workspace behavior by changing workspace from "_" to "" * Fixed incorrect empty workspace detection in get_all_update_flags_status()	2025-11-17 12:54:33 +08:00
yangdx	b6a5a90eaf	Fix NamespaceLock concurrent coroutine safety with ContextVar - Use ContextVar for per-coroutine storage - Prevent state interference between coroutines - Add re-entrance protection check	2025-11-17 12:54:33 +08:00
yangdx	fd486bc922	Refactor storage classes to use namespace instead of final_namespace	2025-11-17 12:54:33 +08:00
yangdx	01814bfc7a	Fix missing function call parentheses in get_all_update_flags_status	2025-11-17 12:54:33 +08:00
yangdx	7deb9a64b9	Refactor namespace lock to support reusable async context manager • Add NamespaceLock class wrapper • Fix lock re-entrance issues • Enable concurrent lock usage • Fresh context per async with block • Update get_namespace_lock API	2025-11-17 12:54:33 +08:00
yangdx	52c812b9a0	Fix workspace isolation for pipeline status across all operations - Fix final_namespace error in get_namespace_data() - Fix get_workspace_from_request return type - Add workspace param to pipeline status calls	2025-11-17 12:54:33 +08:00
yangdx	926960e957	Refactor workspace handling to use default workspace and namespace locks - Remove DB-specific workspace configs - Add default workspace auto-setting - Replace global locks with namespace locks - Simplify pipeline status management - Remove redundant graph DB locking	2025-11-17 12:54:33 +08:00
yangdx	ec05d89c2a	Add macOS fork safety check for Gunicorn multi-worker mode • Check OBJC_DISABLE_INITIALIZE_FORK_SAFETY • Prevent NumPy/Accelerate crashes • Show detailed error message • Provide multiple fix options • Exit early if misconfigured	2025-11-17 12:54:33 +08:00
yangdx	e5addf4d94	Improve embedding config priority and add debug logging • Fix embedding_dim priority logic • Add final config logging	2025-11-17 12:54:32 +08:00
yangdx	2fb57e767d	Fix embedding token limit initialization order * Capture max_token_size before decorator * Apply wrapper after capturing attribute * Prevent decorator from stripping dataclass * Ensure token limit is properly set	2025-11-17 12:54:32 +08:00
yangdx	6b2af2b579	Refactor embedding function creation with proper attribute inheritance - Extract max_token_size from providers - Avoid double-wrapping EmbeddingFunc - Improve configuration priority logic - Add comprehensive debug logging - Return complete EmbeddingFunc instance	2025-11-17 12:54:32 +08:00
yangdx	f0254773c6	Convert embedding_token_limit from property to field with __post_init__ • Remove property decorator • Add field with init=False • Set value in __post_init__ method • embedding_token_limit is now in config dictionary	2025-11-17 12:54:32 +08:00
yangdx	14a6c24ed7	Add configurable embedding token limit with validation - Add EMBEDDING_TOKEN_LIMIT env var - Set max_token_size on embedding func - Add token limit property to LightRAG - Validate summary length vs limit - Log warning when limit exceeded	2025-11-17 12:54:32 +08:00
yangdx	f5b48587ed	Improve Bedrock error handling with retry logic and custom exceptions • Add specific exception types • Implement proper retry mechanism • Better error classification • Enhanced logging and validation • Enable embedding retry decorator	2025-11-17 12:54:32 +08:00
yangdx	77221564b0	Add max_token_size parameter to embedding function decorators - Add max_token_size=8192 to all embed funcs - Move siliconcloud to deprecated folder - Import wrap_embedding_func_with_attrs - Update EmbeddingFunc docstring - Fix langfuse import type annotation	2025-11-17 12:54:32 +08:00
yangdx	8283c86bce	Refactor exception handling in MemgraphStorage label methods	2025-11-17 12:54:32 +08:00
yangdx	423e4e927a	Fix null reference errors in graph database error handling - Initialize result vars to None - Add null checks before consume calls - Prevent crashes in except blocks - Apply fix to both Neo4J and Memgraph	2025-11-17 12:54:32 +08:00
yangdx	2f2f35b883	Add macOS compatibility check for DOCLING with multi-worker Gunicorn	2025-11-17 12:54:32 +08:00
yangdx	c246eff725	Improve docling integration with macOS compatibility and CLI flag - Add --docling CLI flag for easier setup - Add numpy version constraints - Exclude docling on macOS (fork-safety)	2025-11-17 12:54:32 +08:00
yangdx	63510478e5	Improve error handling and logging in cloud model detection	2025-11-17 12:54:32 +08:00
LacombeLouis	67dfd85679	Add a better regex	2025-11-17 12:54:32 +08:00

1 2 3 4 5 ...

3686 commits