LightRAG

Author	SHA1	Message	Date
yangdx	52c812b9a0	Fix workspace isolation for pipeline status across all operations - Fix final_namespace error in get_namespace_data() - Fix get_workspace_from_request return type - Add workspace param to pipeline status calls	2025-11-17 12:54:33 +08:00
yangdx	926960e957	Refactor workspace handling to use default workspace and namespace locks - Remove DB-specific workspace configs - Add default workspace auto-setting - Replace global locks with namespace locks - Simplify pipeline status management - Remove redundant graph DB locking	2025-11-17 12:54:33 +08:00
yangdx	ec05d89c2a	Add macOS fork safety check for Gunicorn multi-worker mode • Check OBJC_DISABLE_INITIALIZE_FORK_SAFETY • Prevent NumPy/Accelerate crashes • Show detailed error message • Provide multiple fix options • Exit early if misconfigured	2025-11-17 12:54:33 +08:00
yangdx	e5addf4d94	Improve embedding config priority and add debug logging • Fix embedding_dim priority logic • Add final config logging	2025-11-17 12:54:32 +08:00
yangdx	6b2af2b579	Refactor embedding function creation with proper attribute inheritance - Extract max_token_size from providers - Avoid double-wrapping EmbeddingFunc - Improve configuration priority logic - Add comprehensive debug logging - Return complete EmbeddingFunc instance	2025-11-17 12:54:32 +08:00
yangdx	14a6c24ed7	Add configurable embedding token limit with validation - Add EMBEDDING_TOKEN_LIMIT env var - Set max_token_size on embedding func - Add token limit property to LightRAG - Validate summary length vs limit - Log warning when limit exceeded	2025-11-17 12:54:32 +08:00
yangdx	2f2f35b883	Add macOS compatibility check for DOCLING with multi-worker Gunicorn	2025-11-17 12:54:32 +08:00
yangdx	c246eff725	Improve docling integration with macOS compatibility and CLI flag - Add --docling CLI flag for easier setup - Add numpy version constraints - Exclude docling on macOS (fork-safety)	2025-11-17 12:54:32 +08:00
yangdx	7b7f93d77c	Implement lazy configuration initialization for API server • Add lazy config initialization • Maintain backward compatibility • Support programmatic usage • Add gunicorn dependency • Explicit config in entry points	2025-11-17 12:54:32 +08:00
yangdx	69a0b74ce7	refactor: move document deps to api group, remove dynamic imports - Merge offline-docs into api extras - Remove pipmaster dynamic installs - Add async document processing - Pre-check docling availability - Update offline deployment docs	2025-11-17 12:54:32 +08:00
yangdx	93a3e47134	Remove deprecated response_type parameter from query settings - Bump API version to 0254 - Remove response format UI controls - Hard-code response_type in query params - Add migration for version 19 - Clean up settings store structure	2025-11-17 12:54:32 +08:00
yangdx	c434879c7a	Replace PyPDF2 with pypdf for PDF processing - Update import from PyPDF2 to pypdf - Change dependency to pypdf>=6.1.0 - Update all requirements files - Remove PyPDF2 from lock file - Use modern pypdf library	2025-11-17 12:54:32 +08:00
BukeLy	18a4870229	fix: Add default workspace support for backward compatibility Fixes two compatibility issues in workspace isolation: 1. Problem: lightrag_server.py calls initialize_pipeline_status() without workspace parameter, causing pipeline to initialize in global namespace instead of rag's workspace. Solution: Add set_default_workspace() mechanism in shared_storage. LightRAG.initialize_storages() now sets default workspace, which initialize_pipeline_status() uses when called without parameters. 2. Problem: /health endpoint hardcoded to use "pipeline_status", cannot return workspace-specific status or support frontend workspace selection. Solution: Add LIGHTRAG-WORKSPACE header support. Endpoint now extracts workspace from header or falls back to server default, returning correct workspace-specific pipeline status. Changes: - lightrag/kg/shared_storage.py: Add set/get_default_workspace() - lightrag/lightrag.py: Call set_default_workspace() in initialize_storages() - lightrag/api/lightrag_server.py: Add get_workspace_from_request() helper, update /health endpoint to support LIGHTRAG-WORKSPACE header Testing: - Backward compatibility: Old code works without modification - Multi-instance safety: Explicit workspace passing preserved - /health endpoint: Supports both default and header-specified workspaces Related: #2353	2025-11-17 12:54:20 +08:00
BukeLy	eb52ec94d7	feat: Add workspace isolation support for pipeline status Problem: In multi-tenant scenarios, different workspaces share a single global pipeline_status namespace, causing pipelines from different tenants to block each other, severely impacting concurrent processing performance. Solution: - Extended get_namespace_data() to recognize workspace-specific pipeline namespaces with pattern "{workspace}:pipeline" (following GraphDB pattern) - Added workspace parameter to initialize_pipeline_status() for per-tenant isolated pipeline namespaces - Updated all 7 call sites to use workspace-aware locks: * lightrag.py: process_document_queue(), aremove_document() * document_routes.py: background_delete_documents(), clear_documents(), cancel_pipeline(), get_pipeline_status(), delete_documents() Impact: - Different workspaces can process documents concurrently without blocking - Backward compatible: empty workspace defaults to "pipeline_status" - Maintains fail-fast: uninitialized pipeline raises clear error - Expected N× performance improvement for N concurrent tenants Bug fixes: - Fixed AttributeError by using self.workspace instead of self.global_config - Fixed pipeline status endpoint to show workspace-specific status - Fixed delete endpoint to check workspace-specific busy flag Code changes: 4 files, 141 insertions(+), 28 deletions(-) Testing: All syntax checks passed, comprehensive workspace isolation tests completed	2025-11-17 12:53:44 +08:00
yangdx	1f9d0735c3	Bump API version to 0253	2025-11-09 14:42:22 +08:00
yangdx	7bc6ccea19	Add uv package manager support to installation docs	2025-11-09 04:31:07 +08:00
yangdx	754d2ad297	Add documentation for LLM cache migration between storage types	2025-11-09 00:41:07 +08:00
yangdx	cf732dbfc6	Bump core version to 1.4.9.9 and API to 0252	2025-11-08 11:27:26 +08:00
yangdx	a624a9508a	Add Gemini to APIs requiring embedding dimension parameter	2025-11-08 03:54:50 +08:00
yangdx	de4ed73652	Add Gemini embedding support - Implement gemini_embed function - Add gemini to embedding binding choices - Add L2 normalization for dims < 3072	2025-11-08 03:34:30 +08:00
yangdx	0b2a15c452	Centralize embedding_send_dim config through args instead of env var	2025-11-08 01:52:23 +08:00
yangdx	03cc6262c4	Prohibit direct access to internal functions of EmbeddingFunc. • Fix similarity search error in query stage • Remove redundant null checks • Improve log readability	2025-11-08 01:43:36 +08:00
yangdx	d95efcb9ad	Fix linting	2025-11-07 21:27:54 +08:00
yangdx	ce28f30ca6	Add embedding_dim parameter support to embedding functions • Pass embedding_dim to jina_embed call • Pass embedding_dim to openai_embed call	2025-11-07 21:23:59 +08:00
yangdx	c14f25b7f8	Add mandatory dimension parameter handling for Jina API compliance	2025-11-07 21:08:34 +08:00
yangdx	d8a6355e41	Merge branch 'main' into apply-dim-to-embedding-call	2025-11-07 20:48:22 +08:00
yangdx	33a1482f7f	Add optional embedding dimension parameter control via env var * Add EMBEDDING_SEND_DIM environment variable * Update Jina/OpenAI embed functions * Add send_dimensions to EmbeddingFunc * Auto-inject embedding_dim when enabled * Add parameter validation warnings	2025-11-07 20:46:40 +08:00
yangdx	fc40a36968	Add timeout support to Gemini LLM and improve parameter handling • Add timeout parameter to Gemini client • Convert timeout seconds to milliseconds • Update function signatures consistently • Add Gemini thinking config example • Clean up parameter documentation	2025-11-07 15:50:14 +08:00
yangdx	5bcd2926ca	Bump API version to 0251	2025-11-06 21:45:47 +08:00
yangdx	831e658ed8	Update readme	2025-11-06 16:26:07 +08:00
yangdx	6e36ff41e1	Fix linting	2025-11-06 16:01:24 +08:00
yangdx	5f49cee20f	Merge branch 'main' into VOXWAVE-FOUNDRY/main	2025-11-06 15:37:35 +08:00
anouarbm	c9e1c6c1c2	fix(api): change content field to list in query responses BREAKING CHANGE: content field is now List[str] instead of str - Add ReferenceItem Pydantic model for type safety - Update /query and /query/stream to return content as list - Update OpenAPI schema and examples - Add migration guide to API README - Fix RAGAS evaluation to handle list format Addresses PR #2297 feedback. Tested with RAGAS: 97.37% score.	2025-11-03 04:57:08 +01:00
anouarbm	9d69e8d776	fix(api): Change content field from string to list in query responses BREAKING CHANGE: The `content` field in query response references is now an array of strings instead of a concatenated string. This preserves individual chunk boundaries when a single file has multiple chunks. Changes: - Update QueryResponse Pydantic model to accept List[str] for content - Modify query_text endpoint to return content as list (query_routes.py:425) - Modify query_text_stream endpoint to support chunk content enrichment - Update OpenAPI schema and examples to reflect array structure - Update API README with breaking change notice and migration guide - Fix RAGAS evaluation to flatten chunk content lists	2025-11-03 04:37:09 +01:00
anouarbm	0b5e3f9dc4	Use logger in RAG evaluation and optimize reference content joins	2025-11-02 18:43:53 +01:00
anouarbm	963ad4c637	docs: Add documentation and examples for include_chunk_content parameter Added comprehensive documentation for the new include_chunk_content parameter that enables retrieval of actual chunk text content in API responses. Documentation Updates: - Added "Include Chunk Content in References" section to API README - Explained use cases: RAG evaluation, debugging, citations, transparency - Provided JSON request/response examples - Clarified parameter interaction with include_references OpenAPI/Swagger Examples: - Added "Response with chunk content" example to /query endpoint - Shows complete reference structure with content field - Demonstrates realistic chunk text content This makes the feature discoverable through: 1. API documentation (README.md) 2. Interactive Swagger UI (http://localhost:9621/docs) 3. Code examples for developers	2025-11-02 17:53:27 +01:00
anouarbm	0bbef9814e	Optimize RAGAS evaluation with parallel execution and chunk content enrichment Added efficient RAG evaluation system with optimized API calls and comprehensive benchmarking. Key Features: - Single API call per evaluation (2x faster than before) - Parallel evaluation based on MAX_ASYNC environment variable - Chunk content enrichment in /query endpoint responses - Comprehensive benchmark statistics (averages) - NaN-safe metric calculations API Changes: - Added include_chunk_content parameter to QueryRequest (backward compatible) - /query endpoint enriches references with actual chunk content when requested - No breaking changes - default behavior unchanged Evaluation Improvements: - Parallel execution using asyncio.Semaphore (respects MAX_ASYNC) - Shared HTTP client with connection pooling - Proper timeout handling (3min connect, 5min read) - Debug output for context retrieval verification - Benchmark statistics with averages, min/max scores Results: - Average RAGAS Score: 0.9772 - Perfect Faithfulness: 1.0000 - Perfect Context Recall: 1.0000 - Perfect Context Precision: 1.0000 - Excellent Answer Relevance: 0.9087	2025-11-02 17:39:43 +01:00
yangdx	61b57cbb5d	Add PDF decryption support for password-protected files • Add PDF_DECRYPT_PASSWORD env variable • Check encryption status before reading • Handle decrypt errors gracefully • Log detailed error messages • Support both encrypted/plain PDFs	2025-11-01 15:01:17 +08:00
yangdx	728721b14f	Remove redundant separator lines in gunicorn shutdown handler	2025-11-01 12:53:54 +08:00
yangdx	6d4a55100e	Remove redundant shutdown message from gunicorn	2025-11-01 12:52:22 +08:00
yangdx	7ccc1fdd27	Add frontend rebuild warning indicator to version display - Return bool from check_frontend_build() - Add ⚠️ symbol to outdated versions - Show tooltip with rebuild message - Add translations for warning text - Fix tailwind config filename typo	2025-10-31 06:09:46 +08:00
yangdx	e5414c61ef	Bump core version to 1.4.9.8 and API version to 0250	2025-10-31 05:23:48 +08:00
yangdx	c46c1b26a9	Add pycryptodome dependency for PDF encryption support	2025-10-31 01:49:42 +08:00
yangdx	c9e73bb450	Bump core version to 1.4.9.7 and API version to 0249	2025-10-30 19:43:35 +08:00
yangdx	f610fdaf9b	Merge branch 'main' into Anush008/main	2025-10-30 11:07:39 +08:00
yangdx	3a7f753560	Bump core version to 1.4.9.6 and API version to 0248	2025-10-29 19:08:32 +08:00
yangdx	d5bcd14c6f	Refactor service deployment to use direct process execution - Remove bash wrapper script - Update systemd service configuration - Improve process management for gunicorn - Simplify shared storage cleanup logic - Update documentation for deployment	2025-10-29 18:55:47 +08:00
yangdx	6489aaa7f0	Remove worker_exit hook and improve cleanup logging • Remove unreliable worker_exit function • Add debug logs for cleanup modes • Move DEBUG_LOCKS to top of file	2025-10-29 15:15:13 +08:00
yangdx	4a46d39c93	Replace GUNICORN_CMD_ARGS with custom LIGHTRAG_GUNICORN_MODE flag • Use custom env var for mode detection • Improve Gunicorn mode reliability	2025-10-29 14:06:03 +08:00
yangdx	816feefd84	Fix cleanup coordination between Gunicorn and UvicornWorker lifecycles • Document UvicornWorker hook limitations • Add GUNICORN_CMD_ARGS cleanup guard • Prevent double cleanup in workers	2025-10-29 13:53:46 +08:00


1257 commits