LightRAG

Author	SHA1	Message	Date
BukeLy	eb52ec94d7	feat: Add workspace isolation support for pipeline status Problem: In multi-tenant scenarios, different workspaces share a single global pipeline_status namespace, causing pipelines from different tenants to block each other, severely impacting concurrent processing performance. Solution: - Extended get_namespace_data() to recognize workspace-specific pipeline namespaces with pattern "{workspace}:pipeline" (following GraphDB pattern) - Added workspace parameter to initialize_pipeline_status() for per-tenant isolated pipeline namespaces - Updated all 7 call sites to use workspace-aware locks: * lightrag.py: process_document_queue(), aremove_document() * document_routes.py: background_delete_documents(), clear_documents(), cancel_pipeline(), get_pipeline_status(), delete_documents() Impact: - Different workspaces can process documents concurrently without blocking - Backward compatible: empty workspace defaults to "pipeline_status" - Maintains fail-fast: uninitialized pipeline raises clear error - Expected N× performance improvement for N concurrent tenants Bug fixes: - Fixed AttributeError by using self.workspace instead of self.global_config - Fixed pipeline status endpoint to show workspace-specific status - Fixed delete endpoint to check workspace-specific busy flag Code changes: 4 files, 141 insertions(+), 28 deletions(-) Testing: All syntax checks passed, comprehensive workspace isolation tests completed	2025-11-17 12:53:44 +08:00
Daniel.y	8bb54833a7	Merge pull request #2368 from danielaskdd/milvus-vector-batching Refact: Add Embedding Dimension Validation in EmbeddingFunc	2025-11-17 12:38:22 +08:00
yangdx	90f52acf0c	Fix linting	2025-11-17 12:28:53 +08:00
yangdx	c13f9116d9	Add embedding dimension validation to EmbeddingFunc wrapper • Validate total elements divisibility • Check vector count matches input count • Raise clear error messages on mismatch • Ensure embedding output correctness • Add docstring for EmbeddingFunc class	2025-11-17 12:26:54 +08:00
Daniel.y	3b76eea20b	Merge pull request #2359 from danielaskdd/embedding-limit Refact: Add Embedding Token Limit Configuration and Improve Error Handling	2025-11-15 01:27:26 +08:00
yangdx	8722103550	Update env.example • Comment out Ollama config • Set OpenAI as active default • Add EMBEDDING_TOKEN_LIMIT option • Add Gemini embedding configuration	2025-11-15 01:25:56 +08:00
yangdx	b5589ce4d5	Merge branch 'main' into embedding-limit	2025-11-15 01:10:34 +08:00
Daniel.y	9a2ddcee31	Merge pull request #2360 from danielaskdd/macos-gunicorn-numpy Add macOS fork safety check for Gunicorn multi-worker mode	2025-11-15 01:02:41 +08:00
yangdx	4343db753a	Add macOS fork safety check for Gunicorn multi-worker mode • Check OBJC_DISABLE_INITIALIZE_FORK_SAFETY • Prevent NumPy/Accelerate crashes • Show detailed error message • Provide multiple fix options • Exit early if misconfigured	2025-11-15 00:58:23 +08:00
Daniel.y	c6850ac5ac	Merge pull request #2358 from sleeepyin/main Update the value corresponding to the extracted entity relationship keywords	2025-11-14 23:47:58 +08:00
yangdx	5dec4deac7	Improve embedding config priority and add debug logging • Fix embedding_dim priority logic • Add final config logging	2025-11-14 23:22:44 +08:00
yangdx	de4412dd40	Fix embedding token limit initialization order * Capture max_token_size before decorator * Apply wrapper after capturing attribute * Prevent decorator from stripping dataclass * Ensure token limit is properly set	2025-11-14 22:56:03 +08:00
yangdx	963a0a5db1	Refactor embedding function creation with proper attribute inheritance - Extract max_token_size from providers - Avoid double-wrapping EmbeddingFunc - Improve configuration priority logic - Add comprehensive debug logging - Return complete EmbeddingFunc instance	2025-11-14 22:29:08 +08:00
yangdx	39b49e92ff	Convert embedding_token_limit from property to field with __post_init__ • Remove property decorator • Add field with init=False • Set value in __post_init__ method • embedding_token_limit is now in config dictionary	2025-11-14 20:58:41 +08:00
yangdx	ab4d7ac2b0	Add configurable embedding token limit with validation - Add EMBEDDING_TOKEN_LIMIT env var - Set max_token_size on embedding func - Add token limit property to LightRAG - Validate summary length vs limit - Log warning when limit exceeded	2025-11-14 19:28:36 +08:00
yangdx	680e36c6eb	Improve Bedrock error handling with retry logic and custom exceptions • Add specific exception types • Implement proper retry mechanism • Better error classification • Enhanced logging and validation • Enable embedding retry decorator	2025-11-14 18:51:41 +08:00
yangdx	05852e1ab2	Add max_token_size parameter to embedding function decorators - Add max_token_size=8192 to all embed funcs - Move siliconcloud to deprecated folder - Import wrap_embedding_func_with_attrs - Update EmbeddingFunc docstring - Fix langfuse import type annotation	2025-11-14 18:41:43 +08:00
Sleeep	b88d785469	Merge branch 'HKUDS:main' into main	2025-11-14 16:49:30 +08:00
Daniel.y	399a23c3a6	Merge pull request #2356 from danielaskdd/improve-error-handling Fix: Robust error handling for async database operations in graph storage	2025-11-14 11:16:14 +08:00
yangdx	4401f86f07	Refactor exception handling in MemgraphStorage label methods	2025-11-14 11:01:26 +08:00
yangdx	1ccef2b932	Fix null reference errors in graph database error handling - Initialize result vars to None - Add null checks before consume calls - Prevent crashes in except blocks - Apply fix to both Neo4J and Memgraph	2025-11-14 10:39:04 +08:00
yangdx	c164c8f631	Merge branch 'main' of github.com:HKUDS/LightRAG	2025-11-13 20:42:47 +08:00
yangdx	1889301597	Merge branch 'feat/add_cloud_ollama_support'	2025-11-13 20:41:58 +08:00
yangdx	77ad906d3a	Improve error handling and logging in cloud model detection	2025-11-13 20:41:44 +08:00
Daniel.y	28fba19b11	Merge pull request #2352 from danielaskdd/docling-gunicorn-multi-worker Refact: Enhance DOCLING integration with lazy loading and macOS safeguards	2025-11-13 20:37:48 +08:00
yangdx	cc031a3db9	Add macOS compatibility check for DOCLING with multi-worker Gunicorn	2025-11-13 19:18:04 +08:00
LacombeLouis	844537e378	Add a better regex	2025-11-13 12:17:51 +01:00
yangdx	a24d8181c2	Improve docling integration with macOS compatibility and CLI flag - Add --docling CLI flag for easier setup - Add numpy version constraints - Exclude docling on macOS (fork-safety)	2025-11-13 18:58:09 +08:00
Daniel.y	76adde3858	Merge pull request #2351 from danielaskdd/lazy-config-loading Refact: Implement Lazy Configuration Initialization for API Server	2025-11-13 15:55:35 +08:00
Sleeep	89e63aa49b	Update edge keywords extraction in graph visualization: when building the Neo4j graph, the keywords value defaulted to d7; it should be the updated d9	2025-11-13 15:52:14 +08:00
yangdx	e6588f9119	Update uv.lock	2025-11-13 15:31:51 +08:00
yangdx	746c069ab0	Implement lazy configuration initialization for API server • Add lazy config initialization • Maintain backward compatibility • Support programmatic usage • Add gunicorn dependency • Explicit config in entry points	2025-11-13 15:28:05 +08:00
Daniel.y	470e2fd1f9	Merge pull request #2350 from danielaskdd/reduce-dynamic-import Refactor: Remove blocking dependency installation from document upload handlers	2025-11-13 15:06:05 +08:00
yangdx	4b31942e2a	refactor: move document deps to api group, remove dynamic imports - Merge offline-docs into api extras - Remove pipmaster dynamic installs - Add async document processing - Pre-check docling availability - Update offline deployment docs	2025-11-13 13:34:09 +08:00
yangdx	8765974467	Merge branch 'tongda/main'	2025-11-13 12:56:28 +08:00
yangdx	c230d1a28d	Replace asyncio.iscoroutine with inspect.isawaitable for better detection	2025-11-13 12:56:01 +08:00
yangdx	297e460740	Merge branch 'main' into tongda/main	2025-11-13 12:37:37 +08:00
yangdx	940bec0b31	Support async chunking functions in LightRAG processing pipeline - Add Awaitable and Union type imports - Update chunking_func type annotation - Handle coroutine results with await - Add return type validation - Update docstring for async support	2025-11-13 12:37:15 +08:00
yangdx	343d30727a	Update env.example	2025-11-13 11:40:56 +08:00
Louis Lacombe	f7432a260e	Add support for environment variable fallback for API key and default host for cloud models	2025-11-12 16:11:05 +00:00
Daniel.y	075399ffc5	Merge pull request #2346 from danielaskdd/optimize-json-sanitization Refactor: Optimize write_json for Memory Efficiency and Performance	2025-11-12 16:50:28 +08:00
yangdx	70cc2419f2	Fix empty dict handling after JSON sanitization • Replace truthy checks with `is not None` • Handle empty dict edge case properly • Prevent data reload failures • Add comprehensive test coverage • Fix JsonKVStorage and DocStatusStorage	2025-11-12 16:40:57 +08:00
yangdx	dcf1d28681	Fix migration to reload sanitized data and prevent memory corruption • Reload cleaned data after sanitization • Update shared memory with clean data • Add specific surrogate char tests • Test migration sanitization flow • Prevent dirty data in memory	2025-11-12 16:16:28 +08:00
yangdx	6de4123f74	Optimize JSON string sanitization with precompiled regex and zero-copy - Precompile regex pattern at module level - Zero-copy path for clean strings - Use C-level regex for performance - Remove deprecated _sanitize_json_data - Fast detection for common case	2025-11-12 15:42:07 +08:00
yangdx	777c987371	Optimize JSON write with fast/slow path to reduce memory usage - Fast path for clean data (no sanitization) - Slow path sanitizes during encoding - Reload shared memory after sanitization - Custom encoder avoids deep copies - Comprehensive test coverage	2025-11-12 13:48:56 +08:00
Daniel.y	477c3f54fb	Merge pull request #2345 from danielaskdd/remove-response-type Remove deprecated response_type parameter from query settings UI	2025-11-12 12:32:59 +08:00
yangdx	8c07c91833	Remove deprecated response_type parameter from query settings - Bump API version to 0254 - Remove response format UI controls - Hard-code response_type in query params - Add migration for version 19 - Clean up settings store structure	2025-11-12 12:19:30 +08:00
Daniel.y	69ca366242	Merge pull request #2344 from danielaskdd/fix-josn-serialization-error Fix: Prevent UnicodeEncodeError in JSON storage operations	2025-11-12 00:58:59 +08:00
yangdx	f28a0c25b1	Improve JSON data sanitization to handle tuples and dict keys - Sanitize dictionary keys - Preserve tuple types - Handle nested structures better	2025-11-12 00:50:18 +08:00
yangdx	6918a88f92	Add specialized JSON string sanitizer to prevent UTF-8 encoding errors • Remove surrogate characters (U+D800-DFFF) • Filter Unicode non-characters • Direct char-by-char filtering	2025-11-12 00:38:47 +08:00
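
The JSON sanitization commits above (6918a88f92, 6de4123f74) describe removing surrogate characters and Unicode non-characters, using a precompiled regex and a no-copy path for clean strings. The following is a minimal sketch of that idea, not the repository's actual code: the names `sanitize_text` and `_UNSAFE_CHARS` are hypothetical, and the set of non-characters filtered here (only U+FFFE and U+FFFF) is an assumption.

```python
import re

# Hypothetical names; this is an illustrative sketch, not LightRAG's implementation.
# Matches lone surrogates (U+D800-U+DFFF), which cannot be encoded as UTF-8,
# plus the non-characters U+FFFE and U+FFFF. Compiled once at import time.
_UNSAFE_CHARS = re.compile(r"[\ud800-\udfff\ufffe\uffff]")


def sanitize_text(text: str) -> str:
    """Return text with lone surrogates and selected non-characters removed."""
    # Fast path: a clean string is returned as the very same object (no copy).
    if _UNSAFE_CHARS.search(text) is None:
        return text
    # Slow path: build a new string without the offending code points.
    return _UNSAFE_CHARS.sub("", text)


if __name__ == "__main__":
    dirty = "ok\ud83dtext\uffff"
    cleaned = sanitize_text(dirty)
    # Encoding the cleaned string no longer raises UnicodeEncodeError.
    print(cleaned.encode("utf-8"))
```

Checking with `search` before calling `sub` means the common case, a string with nothing to strip, allocates no new string, which seems to be the point of the "zero-copy path" mentioned in 6de4123f74.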

5812 commits