LightRAG

Author	SHA1	Message	Date
yangdx	e22ac52ebc	Auto-initialize pipeline status in LightRAG.initialize_storages() • Remove manual initialize_pipeline_status calls • Auto-init in initialize_storages method • Update error messages for clarity • Warn on workspace conflicts	2025-11-17 12:54:33 +08:00
yangdx	52c812b9a0	Fix workspace isolation for pipeline status across all operations - Fix final_namespace error in get_namespace_data() - Fix get_workspace_from_request return type - Add workspace param to pipeline status calls	2025-11-17 12:54:33 +08:00
yangdx	926960e957	Refactor workspace handling to use default workspace and namespace locks - Remove DB-specific workspace configs - Add default workspace auto-setting - Replace global locks with namespace locks - Simplify pipeline status management - Remove redundant graph DB locking	2025-11-17 12:54:33 +08:00
yangdx	2fb57e767d	Fix embedding token limit initialization order * Capture max_token_size before decorator * Apply wrapper after capturing attribute * Prevent decorator from stripping dataclass * Ensure token limit is properly set	2025-11-17 12:54:32 +08:00
yangdx	f0254773c6	Convert embedding_token_limit from property to field with __post_init__ • Remove property decorator • Add field with init=False • Set value in __post_init__ method • embedding_token_limit is now in config dictionary	2025-11-17 12:54:32 +08:00
yangdx	14a6c24ed7	Add configurable embedding token limit with validation - Add EMBEDDING_TOKEN_LIMIT env var - Set max_token_size on embedding func - Add token limit property to LightRAG - Validate summary length vs limit - Log warning when limit exceeded	2025-11-17 12:54:32 +08:00
yangdx	7d394fb0a4	Replace asyncio.iscoroutine with inspect.isawaitable for better detection	2025-11-17 12:54:32 +08:00
yangdx	af5423919b	Support async chunking functions in LightRAG processing pipeline - Add Awaitable and Union type imports - Update chunking_func type annotation - Handle coroutine results with await - Add return type validation - Update docstring for async support	2025-11-17 12:54:32 +08:00
Tong Da	5016025453	Simpler version: detect whether the chunking_func result is a coroutine	2025-11-17 12:54:32 +08:00
Tong Da	7740500693	support async chunking func to improve processing performance when a heavy `chunking_func` is passed in by user	2025-11-17 12:54:32 +08:00
BukeLy	18a4870229	fix: Add default workspace support for backward compatibility Fixes two compatibility issues in workspace isolation: 1. Problem: lightrag_server.py calls initialize_pipeline_status() without workspace parameter, causing pipeline to initialize in global namespace instead of rag's workspace. Solution: Add set_default_workspace() mechanism in shared_storage. LightRAG.initialize_storages() now sets default workspace, which initialize_pipeline_status() uses when called without parameters. 2. Problem: /health endpoint hardcoded to use "pipeline_status", cannot return workspace-specific status or support frontend workspace selection. Solution: Add LIGHTRAG-WORKSPACE header support. Endpoint now extracts workspace from header or falls back to server default, returning correct workspace-specific pipeline status. Changes: - lightrag/kg/shared_storage.py: Add set/get_default_workspace() - lightrag/lightrag.py: Call set_default_workspace() in initialize_storages() - lightrag/api/lightrag_server.py: Add get_workspace_from_request() helper, update /health endpoint to support LIGHTRAG-WORKSPACE header Testing: - Backward compatibility: Old code works without modification - Multi-instance safety: Explicit workspace passing preserved - /health endpoint: Supports both default and header-specified workspaces Related: #2353	2025-11-17 12:54:20 +08:00
BukeLy	eb52ec94d7	feat: Add workspace isolation support for pipeline status Problem: In multi-tenant scenarios, different workspaces share a single global pipeline_status namespace, causing pipelines from different tenants to block each other, severely impacting concurrent processing performance. Solution: - Extended get_namespace_data() to recognize workspace-specific pipeline namespaces with pattern "{workspace}:pipeline" (following GraphDB pattern) - Added workspace parameter to initialize_pipeline_status() for per-tenant isolated pipeline namespaces - Updated all 7 call sites to use workspace-aware locks: * lightrag.py: process_document_queue(), aremove_document() * document_routes.py: background_delete_documents(), clear_documents(), cancel_pipeline(), get_pipeline_status(), delete_documents() Impact: - Different workspaces can process documents concurrently without blocking - Backward compatible: empty workspace defaults to "pipeline_status" - Maintains fail-fast: uninitialized pipeline raises clear error - Expected N× performance improvement for N concurrent tenants Bug fixes: - Fixed AttributeError by using self.workspace instead of self.global_config - Fixed pipeline status endpoint to show workspace-specific status - Fixed delete endpoint to check workspace-specific busy flag Code changes: 4 files, 141 insertions(+), 28 deletions(-) Testing: All syntax checks passed, comprehensive workspace isolation tests completed	2025-11-17 12:53:44 +08:00
yangdx	ea141e2779	Fix: Remove redundant entity/relation chunk deletions	2025-11-07 02:56:16 +08:00
yangdx	04ed709b34	Optimize entity deletion by batching edge queries to avoid N+1 problem • Add batch get_nodes_edges_batch call • Remove individual get_node_edges calls • Improve query performance	2025-11-06 21:34:47 +08:00
yangdx	afb5e5c1cb	Fix edge cleanup when deleting entities to prevent orphaned relationships - Track edges to delete in set - Clean VDB before node deletion - Remove from relation chunks storage - Prevent orphaned relationship data	2025-10-31 02:36:15 +08:00
yangdx	c36afecba4	Remove redundant await call in file extraction pipeline	2025-10-30 20:35:41 +08:00
yangdx	3fa79026e0	Fix Entity Source IDs Tracking Problem - Handle existing node updates properly in edge merging stage - Fix source_ids merging logic - Reorder entity deletion and optimize node operations - Delete relationships before entities - Add edge existence debugging logs	2025-10-29 01:19:55 +08:00
yangdx	c81a56a113	Fix entity and relationship deletion when no chunk references remain	2025-10-28 16:02:35 +08:00
yangdx	5155edd8d2	feat: Improve entity merge and edit UX - API: The `graph/entity/edit` endpoint now returns a detailed `operation_summary` for better client-side handling of update, rename, and merge outcomes. - Web UI: Added an "auto-merge on rename" option. The UI now gracefully handles merge success, partial failures (update OK, merge fail), and other errors with specific user feedback.	2025-10-27 23:42:08 +08:00
yangdx	2c09adb8d3	Add chunk tracking support to entity merge functionality - Pass chunk storages to merge function - Merge relation chunk tracking data - Merge entity chunk tracking data - Delete old chunk tracking records - Persist chunk storage updates	2025-10-27 02:06:21 +08:00
yangdx	3fbd704bf9	Enhance entity/relation editing with chunk tracking synchronization • Add chunk storage sync to edit ops • Implement incremental chunk ID updates • Support entity renaming migrations • Normalize relation keys consistently • Preserve chunk references on edits	2025-10-26 14:34:56 +08:00
yangdx	29bf593663	Fix entity and relation chunk cleanup in deletion pipeline • Delete from entity_chunks storage • Delete from relation_chunks storage	2025-10-25 22:32:27 +08:00
yangdx	a9bc348446	Remove enable_logging parameter from data init lock call	2025-10-25 11:48:14 +08:00
yangdx	97a2ee4ef1	Rename rebuild function name and improve relationship logging format	2025-10-25 11:17:43 +08:00
yangdx	a9ec15e669	Resolve lock leakage issue during user cancellation handling • Change default log level to INFO • Force enable error logging output • Add lock cleanup rollback protection • Handle LLM cache persistence errors • Fix async task exception handling	2025-10-25 03:06:45 +08:00
yangdx	77336e50b6	Improve error handling and add cancellation checks in pipeline	2025-10-24 17:54:17 +08:00
yangdx	743aefc655	Add pipeline cancellation feature for graceful processing termination • Add cancel_pipeline API endpoint • Implement PipelineCancelledException • Add cancellation checks in main loop • Handle task cancellation gracefully • Mark cancelled docs as FAILED	2025-10-24 14:08:12 +08:00
yangdx	b76350a3bc	Fix linting	2025-10-22 12:53:42 +08:00
yangdx	d7e2527e1a	Handle cache deletion errors gracefully instead of raising exceptions	2025-10-22 12:53:19 +08:00
yangdx	162370b6e6	Add optional LLM cache deletion when deleting documents • Add delete_llm_cache parameter to API • Collect cache IDs from text chunks • Delete cache after graph operations • Update UI with new checkbox option • Add i18n translations for cache option	2025-10-22 12:19:23 +08:00
yangdx	e5e16b7bd1	Fix Redis data migration error • Use proper Redis connection context • Fix namespace pattern for key scanning • Propagate storage check exceptions • Remove defensive error swallowing	2025-10-21 16:27:04 +08:00
yangdx	a9fec26798	Add file path limit configuration for entities and relations • Add MAX_FILE_PATHS env variable • Implement file path count limiting • Support KEEP/FIFO strategies • Add truncation placeholder • Remove old build_file_path function	2025-10-20 20:12:53 +08:00
yangdx	dc62c78f98	Add entity/relation chunk tracking with configurable source ID limits - Add entity_chunks & relation_chunks storage - Implement KEEP/FIFO limit strategies - Update env.example with new settings - Add migration for chunk tracking data - Support all KV storage	2025-10-20 15:24:15 +08:00
yangdx	9f49e56a44	Merge branch 'main' into feat-entity-size-caps	2025-10-17 15:59:44 +08:00
DivinesLight	c06522b927	Get max source ID config from .env and LightRAG init	2025-10-15 18:24:38 +05:00
yangdx	29bac49fb9	Handle empty query results by returning None instead of fail responses • Return None when no context found • Add structured failure metadata • Use PROMPTS["fail_response"] for content • Keep API compatible	2025-10-15 12:04:49 +08:00
yangdx	130b4959dc	Add PREPROCESSED (multimodal_processed) status for multimodal document processing • Add DocStatus.PREPROCESSED enum value • Update API routes and response models • Add preprocessed filter in web UI • Update localization files • Handle preprocessed status in deletion	2025-10-14 14:02:05 +08:00
yangdx	074f0c8b23	Update docstring for adelete_by_doc_id method clarity	2025-10-12 10:12:45 +08:00
yangdx	457d51952e	Add doc_name field to full docs storage - Store file_path in full_docs storage - Update PostgreSQL implementation by mapping file_path to doc_name - Other storage implementations automatically handle the new field	2025-10-05 11:44:27 +08:00
yangdx	1766cddd6c	Fix mode parameter serialization error in Ollama chat API • Use mode.value for API requests • Add debug logging in aquery_llm	2025-09-27 15:11:51 +08:00
yangdx	8cd4139cbf	refactor: fix double query problem by adding aquery_llm function for consistent response handling - Add new aquery_llm/query_llm methods providing structured responses - Consolidate /query and /query/stream endpoints to use unified aquery_llm - Optimize cache handling by moving cache checks before LLM calls	2025-09-26 19:05:03 +08:00
yangdx	b848ca49e6	Fix linting	2025-09-25 16:22:00 +08:00
yangdx	b08b8a6a6a	Add reference list support to query API endpoints with unified result handling • Add include_references param to QueryRequest • Extend QueryResponse with references field • Create unified QueryResult data structures • Refactor kg_query and naive_query functions • Update streaming to send references first	2025-09-25 16:21:42 +08:00
yangdx	5eb4a4b799	feat: simplify citations, add reference merging, and restructure API response format	2025-09-24 14:30:10 +08:00
yangdx	c0d5abba6b	Fix linting	2025-09-15 02:59:21 +08:00
yangdx	b1c8206346	Add aquery_data endpoint for structured retrieval without LLM generation - Add QueryDataResponse model - Implement /query/data endpoint - Add aquery_data method to LightRAG - Return entities, relationships, chunks	2025-09-15 02:15:14 +08:00
yangdx	82a67354d0	Code formatting improvements and style consistency fixes * Remove trailing whitespace * Fix function signature ellipsis style	2025-09-14 17:49:02 +08:00
yangdx	0ffb5d5f2d	Replace search API with aquery_data for consistent raw data retrieval, mirroring aquery results • Reuse existing query logic paths and remove kg_search function entirely • Update kg_query/naive_query to return raw data as needed	2025-09-13 15:30:29 +08:00
yangdx	6774058670	Merge branch 'main' into tongda/main	2025-09-09 22:43:17 +08:00
yangdx	077d9be5d7	Add Deepseek Style Chain of Thought (CoT) Support for OpenAI Compatible LLM providers - Add enable_cot parameter to all LLM APIs - Implement CoT for OpenAI with <think> tags - Log warnings for unsupported providers - Enable CoT in query operations - Handle streaming and non-streaming CoT	2025-09-09 22:34:36 +08:00


619 commits
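
The async chunking commits listed above (7740500693, 5016025453, af5423919b, 7d394fb0a4) describe calling a user-supplied chunking_func that may be synchronous or asynchronous, awaiting the result only when inspect.isawaitable reports it as awaitable, and then validating the return type. A minimal sketch of that pattern follows. The names run_chunking, sync_chunker, and async_chunker, the single-argument signature, and the list-of-dicts chunk shape are assumptions made for illustration; this is not LightRAG's actual code.

```python
import asyncio
import inspect
from typing import Awaitable, Callable, List, Union

# Hypothetical chunk shape: a list of dicts. The real pipeline may differ.
ChunkResult = List[dict]
ChunkingFunc = Callable[[str], Union[ChunkResult, Awaitable[ChunkResult]]]


async def run_chunking(chunking_func: ChunkingFunc, text: str) -> ChunkResult:
    """Call a sync or async chunking function and return its chunks."""
    result = chunking_func(text)
    # inspect.isawaitable is true for coroutine objects, Tasks, Futures,
    # and any object that defines __await__, so it covers more cases
    # than asyncio.iscoroutine.
    if inspect.isawaitable(result):
        result = await result
    # Validate the return type so a misbehaving function fails early.
    if not isinstance(result, list):
        raise TypeError(
            f"chunking_func must return a list, got {type(result).__name__}"
        )
    return result


def sync_chunker(text: str) -> ChunkResult:
    """Hypothetical synchronous chunker: split on sentence boundaries."""
    return [{"content": part} for part in text.split(". ")]


async def async_chunker(text: str) -> ChunkResult:
    """Hypothetical asynchronous chunker that yields to the event loop first."""
    await asyncio.sleep(0)
    return sync_chunker(text)
```

Both kinds of function go through the same call site, for example asyncio.run(run_chunking(sync_chunker, text)) or asyncio.run(run_chunking(async_chunker, text)), so existing synchronous chunkers keep working unchanged.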