LightRAG

Author	SHA1	Message	Date
Clément THOMAS	62b2a71dda	feat(api): add multi-workspace server support for multi-tenant deployments Enable a single LightRAG server instance to serve multiple isolated workspaces via HTTP header-based routing. This allows multi-tenant SaaS deployments where each tenant's data is completely isolated. Key features: - Header-based workspace routing (LIGHTRAG-WORKSPACE, X-Workspace-ID fallback) - Process-local pool of LightRAG instances with LRU eviction - FastAPI dependency (get_rag) for workspace resolution per request - Full backward compatibility - existing deployments work unchanged - Strict multi-tenant mode option (LIGHTRAG_ALLOW_DEFAULT_WORKSPACE=false) - Configurable pool size (LIGHTRAG_MAX_WORKSPACES_IN_POOL) - Graceful shutdown with workspace finalization Configuration: - LIGHTRAG_DEFAULT_WORKSPACE: Default workspace (falls back to WORKSPACE) - LIGHTRAG_ALLOW_DEFAULT_WORKSPACE: Require explicit header when false - LIGHTRAG_MAX_WORKSPACES_IN_POOL: Max concurrent workspace instances (default: 50) Files: - New: lightrag/api/workspace_manager.py (core multi-workspace module) - New: tests/test_multi_workspace_server.py (17 unit tests) - New: render.yaml (Render deployment blueprint) - Modified: All route files to use get_rag dependency - Updated: README.md, env.example with documentation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-12-01 12:07:22 +01:00
yangdx	112ed234c4	Bump API version to 0258	2025-12-01 12:20:27 +08:00
yangdx	ea8d55ab42	Add documentation for embedding provider configuration rules	2025-11-28 17:49:30 +08:00
yangdx	4ab4a7ac94	Allow embedding models to use provider defaults when unspecified - Set EMBEDDING_MODEL default to None - Pass model param only when provided - Let providers use their own defaults - Fix lollms embed function params - Add ollama embed_model default param	2025-11-28 16:57:33 +08:00
yangdx	881b8d3a50	Bump API version to 0257	2025-11-28 15:39:55 +08:00
yangdx	56e0365cf0	Add configurable model parameter to jina_embed function - Add model parameter to jina_embed - Pass model from API server - Default to jina-embeddings-v4 - Update function documentation - Make model selection flexible	2025-11-28 15:38:29 +08:00
yangdx	6e2946e78a	Add max_token_size parameter to azure_openai_embed wrapper	2025-11-28 13:41:01 +08:00
yangdx	4f12fe121d	Change entity extraction logging from warning to info level • Reduce log noise for empty entities	2025-11-27 11:00:34 +08:00
yangdx	93d445dfdd	Add pipeline status lock function for legacy compatibility - Add get_pipeline_status_lock function - Return NamespaceLock for consistency - Support workspace parameter - Enable logging option - Legacy code compatibility	2025-11-25 18:24:39 +08:00
EightyOliveira	8994c70f2f	fix: exception handling order error	2025-11-25 16:36:41 +08:00
yangdx	48b67d3077	Handle missing WebUI assets gracefully without blocking server startup - Change build check from error to warning - Redirect to /docs when WebUI unavailable - Add webui_available to health endpoint - Only mount /webui if assets exist - Return status tuple from build check	2025-11-25 02:51:55 +08:00
yangdx	8c4d7a00ad	Refactor: Extract retry decorator to reduce code duplication in Neo4J storage • Define READ_RETRY_EXCEPTIONS constant • Create reusable READ_RETRY decorator • Replace 11 duplicate retry decorators • Improve code maintainability • Add missing retry to edge_degrees_batch	2025-11-25 01:35:21 +08:00
yangdx	7aaa51cda9	Add retry decorators to Neo4j read operations for resilience	2025-11-24 22:28:15 +08:00
yangdx	7b76211066	Add fallback to AZURE_OPENAI_API_VERSION for embedding API version	2025-11-22 00:14:35 +08:00
yangdx	ffd8da512e	Improve Azure OpenAI compatibility and error handling • Reduce log noise for Azure content filters • Add default API version fallback • Change warning to debug log level • Handle empty choices in streaming • Better Azure OpenAI integration	2025-11-21 23:51:18 +08:00
yangdx	fafa1791f4	Fix Azure OpenAI model parameter to use deployment name consistently - Use deployment name for Azure API calls - Fix model param in embed function - Consistent api_model logic - Prevent Azure model name conflicts	2025-11-21 23:41:52 +08:00
yangdx	ac9f2574a5	Improve Azure OpenAI wrapper functions with full parameter support • Add missing parameters to wrappers • Update docstrings for clarity • Ensure API consistency • Fix parameter forwarding • Maintain backward compatibility	2025-11-21 19:24:32 +08:00
yangdx	45f4f82392	Refactor Azure OpenAI client creation to support client_configs merging - Handle None client_configs case - Merge configs with explicit params - Override client_configs with params - Use dict unpacking for client init - Maintain parameter precedence	2025-11-21 19:14:16 +08:00
yangdx	0c4cba3860	Fix double decoration in azure_openai_embed and document decorator usage • Remove redundant @retry decorator • Call openai_embed.func directly • Add detailed decorator documentation • Prevent double parameter injection • Fix EmbeddingFunc wrapping issues	2025-11-21 18:03:53 +08:00
yangdx	b46c152306	Fix linting	2025-11-21 17:16:44 +08:00
yangdx	b709f8f869	Consolidate Azure OpenAI implementation into main OpenAI module • Unified OpenAI/Azure client creation • Azure module now re-exports functions • Backward compatibility maintained • Reduced code duplication	2025-11-21 17:12:33 +08:00
yangdx	66d6c7dd6f	Refactor main function to provide sync CLI entry point	2025-11-21 13:11:55 +08:00
yangdx	02fdceb959	Update OpenAI client to use stable API and bump minimum version to 2.0.0 - Remove beta prefix from completions.parse - Update OpenAI dependency to >=2.0.0 - Fix whitespace formatting - Update all requirement files - Clean up pyproject.toml dependencies	2025-11-21 12:55:44 +08:00
yangdx	9f69c5bf85	feat: Support structured output `parsed` from OpenAI Added support for structured output (JSON mode) from the OpenAI API in `openai.py` and `azure_openai.py`. When `response_format` is used to request structured data, the new logic checks for the `message.parsed` attribute. If it exists, it's serialized into a JSON string as the final content. If not, the code falls back to the existing `message.content` handling, ensuring backward compatibility.	2025-11-21 12:46:31 +08:00
yangdx	c9e1c86e81	Refactor keyword extraction handling to centralize response format logic • Move response format to core function • Remove duplicate format assignments • Standardize keyword extraction flow • Clean up redundant parameter handling • Improve Azure OpenAI compatibility	2025-11-21 12:10:04 +08:00
yangdx	46ce6d9a13	Fix Azure OpenAI embedding model parameter fallback - Use model param if provided - Fall back to deployment name - Fix embedding API call - Improve parameter handling	2025-11-20 18:20:22 +08:00
Amritpal Singh	30e86fa331	Use the deployment variable, whose value is read from the .env file or falls back to a default	2025-11-20 09:00:27 +00:00
yangdx	b7de694f48	Add comprehensive error logging across API routes - Add error logs to Ollama API endpoints - Replace logging with unified logger - Log streaming query errors - Add data query error logging - Include stack traces for debugging	2025-11-19 22:50:06 +08:00
yangdx	0fb2925c6a	Remove ascii_colors dependency and fix stream handling errors • Remove ascii_colors.trace_exception calls • Add SafeStreamHandler for closed streams • Patch ascii_colors console handler • Prevent ValueError on stream close • Improve logging error handling	2025-11-19 21:38:17 +08:00
yangdx	6fea68bff9	Fix ChunkTokenLimitExceededError message formatting - Prevent passing two separate string objects to __init__ - Maintain same error output	2025-11-19 18:50:45 +08:00
yangdx	f988a22652	Add token limit validation for character-only chunking - Add ChunkTokenLimitExceededError exception - Validate chunks against token limits - Include chunk preview in error messages - Add comprehensive test coverage - Log warnings for oversized chunks	2025-11-19 18:32:43 +08:00
yangdx	95cd0ece74	Fix DOCX table extraction by escaping special characters in cells - Add escape_cell() function - Escape backslashes first - Handle tabs and newlines - Preserve tab-delimited format - Prevent double-escaping issues	2025-11-19 09:54:35 +08:00
yangdx	87de2b3e9e	Update XLSX extraction documentation to reflect current implementation	2025-11-19 04:26:41 +08:00
yangdx	0244699d81	Optimize XLSX extraction by using sheet.max_column instead of two-pass scan • Remove two-pass row scanning approach • Use built-in sheet.max_column property • Simplify column width detection logic • Improve memory efficiency • Maintain column alignment preservation	2025-11-19 04:02:39 +08:00
yangdx	2b16016312	Optimize XLSX extraction to avoid storing all rows in memory • Remove intermediate row storage • Use iterator twice instead of list() • Preserve column alignment logic • Reduce memory footprint • Maintain same output format	2025-11-19 03:48:36 +08:00
yangdx	ef659a1e09	Preserve column alignment in XLSX extraction with two-pass processing • Two-pass approach for consistent width • Maintain tabular structure integrity • Determine max columns first pass • Extract with alignment second pass • Prevent column misalignment issues	2025-11-19 03:34:22 +08:00
yangdx	3efb1716b4	Enhance XLSX extraction with structured tab-delimited format and escaping - Add clear sheet separators - Escape special characters - Trim trailing empty columns - Preserve row structure - Single-pass optimization	2025-11-19 03:06:29 +08:00
yangdx	e7d2803a65	Remove text stripping in DOCX extraction to preserve whitespace • Keep original paragraph spacing • Preserve cell whitespace in tables • Maintain document formatting • Don't strip leading/trailing spaces	2025-11-19 02:12:27 +08:00
yangdx	186c8f0e16	Preserve blank paragraphs in DOCX extraction to maintain spacing • Remove text emptiness check • Always append paragraph text • Maintain document formatting • Preserve original spacing	2025-11-19 02:03:10 +08:00
yangdx	fa887d811b	Fix table column structure preservation in DOCX extraction • Always append cell text to maintain columns • Preserve empty cells in table structure • Check for any content before adding rows • Use tab separation for proper alignment • Improve table formatting consistency	2025-11-19 01:52:02 +08:00
yangdx	4438ba41a3	Enhance DOCX extraction to preserve document order with tables • Include tables in extracted content • Maintain original document order • Add spacing around tables • Use tabs to separate table cells • Process all body elements sequentially	2025-11-19 01:31:33 +08:00
yangdx	d16c7840ab	Bump API version to 0256	2025-11-18 23:15:31 +08:00
yangdx	e77340d4a1	Adjust chunking parameters to match the default environment variable settings	2025-11-18 23:14:50 +08:00
yangdx	1bfa1f81cb	Merge branch 'main' into fix_chunk_comment	2025-11-18 22:38:50 +08:00
yangdx	9c10c87554	Fix linting	2025-11-18 22:38:43 +08:00
yangdx	dbae327a17	Merge branch 'main' into dev-postgres-vchordrq	2025-11-18 22:13:27 +08:00
yangdx	3096f844fb	fix(postgres): allow vchordrq.epsilon config when probes is empty Previously, configure_vchordrq would fail silently when probes was empty (the default), preventing epsilon from being configured. Now each parameter is handled independently with conditional execution, and configuration errors fail-fast instead of being swallowed. This fixes the documented epsilon setting being impossible to use in the default configuration.	2025-11-18 21:58:36 +08:00
EightyOliveira	dacca334e0	refactor(chunking): rename params and improve docstring for chunking_by_token_size	2025-11-18 15:46:28 +08:00
yangdx	702cfd2981	Fix document deletion concurrency control and validation logic • Clarify job naming for single vs batch deletion • Update job name validation in busy pipeline check	2025-11-18 13:59:24 +08:00
yangdx	4048fc4b89	Fix: auto-acquire pipeline when idle in document deletion • Track if we acquired the pipeline lock • Auto-acquire pipeline when idle • Only release if we acquired it • Prevent concurrent deletion conflicts • Improve deletion job validation	2025-11-18 13:25:13 +08:00
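
Note on commit 95cd0ece74 (DOCX table cell escaping): the message lists "Escape backslashes first", "Handle tabs and newlines", and "Prevent double-escaping issues". The sketch below illustrates why that ordering matters. It is not the code from the commit: only the name escape_cell comes from the message, while the function body, the exact escape sequences, and the format_row helper are assumptions made for illustration.

```python
def escape_cell(text: str) -> str:
    """Escape a cell value so it can sit safely in a tab-delimited row.

    Illustrative only; the real escape_cell in LightRAG may differ.
    """
    # Backslashes go first. If tabs were replaced first, the backslash
    # introduced by "\\t" would be doubled by the backslash step and the
    # cell would end up double-escaped.
    text = text.replace("\\", "\\\\")
    text = text.replace("\t", "\\t")
    text = text.replace("\n", "\\n")
    return text


def format_row(cells: list[str]) -> str:
    """Join escaped cells with tabs (hypothetical helper, not from the commit)."""
    return "\t".join(escape_cell(cell) for cell in cells)


if __name__ == "__main__":
    # A cell containing a real tab and a cell containing a literal backslash.
    print(format_row(["a\tb", "C:\\temp", ""]))
```

Because every literal tab and newline inside a cell is replaced before joining, a reader of the output can split each row on tabs and recover the original column count, including empty cells.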

3724 commits