LightRAG

Author	SHA1	Message	Date
yangdx	ce702ccb2f	Add workspace parameter and remove chunk-based query unit tests - Add workspace param to test storage init - Remove get_nodes_by_chunk_ids tests - Remove get_edges_by_chunk_ids tests - Clean up batch operations test function (cherry picked from commit `6b0f9795be`)	2025-12-04 19:11:20 +08:00
anouarbm	7ce251c319	docs: Add documentation and examples for include_chunk_content parameter Added comprehensive documentation for the new include_chunk_content parameter that enables retrieval of actual chunk text content in API responses. Documentation Updates: - Added "Include Chunk Content in References" section to API README - Explained use cases: RAG evaluation, debugging, citations, transparency - Provided JSON request/response examples - Clarified parameter interaction with include_references OpenAPI/Swagger Examples: - Added "Response with chunk content" example to /query endpoint - Shows complete reference structure with content field - Demonstrates realistic chunk text content This makes the feature discoverable through: 1. API documentation (README.md) 2. Interactive Swagger UI (http://localhost:9621/docs) 3. Code examples for developers (cherry picked from commit `963ad4c637`)	2025-12-04 19:11:20 +08:00
anouarbm	349c1945db	Optimize RAGAS evaluation with parallel execution and chunk content enrichment Added efficient RAG evaluation system with optimized API calls and comprehensive benchmarking. Key Features: - Single API call per evaluation (2x faster than before) - Parallel evaluation based on MAX_ASYNC environment variable - Chunk content enrichment in /query endpoint responses - Comprehensive benchmark statistics (averages) - NaN-safe metric calculations API Changes: - Added include_chunk_content parameter to QueryRequest (backward compatible) - /query endpoint enriches references with actual chunk content when requested - No breaking changes - default behavior unchanged Evaluation Improvements: - Parallel execution using asyncio.Semaphore (respects MAX_ASYNC) - Shared HTTP client with connection pooling - Proper timeout handling (3min connect, 5min read) - Debug output for context retrieval verification - Benchmark statistics with averages, min/max scores Results: - Average RAGAS Score: 0.9772 - Perfect Faithfulness: 1.0000 - Perfect Context Recall: 1.0000 - Perfect Context Precision: 1.0000 - Excellent Answer Relevance: 0.9087 (cherry picked from commit `0bbef9814e`)	2025-12-04 19:11:20 +08:00
yangdx	8f16f6fe31	Fix entity and relationship deletion when no chunk references remain (cherry picked from commit `c81a56a113`)	2025-12-04 19:11:19 +08:00
yangdx	17a9771cfb	Add chunk tracking support to entity merge functionality - Pass chunk storages to merge function - Merge relation chunk tracking data - Merge entity chunk tracking data - Delete old chunk tracking records - Persist chunk storage updates (cherry picked from commit `2c09adb8d3`)	2025-12-04 19:11:19 +08:00
yangdx	450f969430	Add chunk tracking cleanup to entity/relation deletion and creation • Clean up chunk storage on delete • Track chunks in create operations • Normalize relation keys consistently (cherry picked from commit `a3370b024d`)	2025-12-04 19:11:19 +08:00
yangdx	7e0f12c28e	Enhance entity/relation editing with chunk tracking synchronization • Add chunk storage sync to edit ops • Implement incremental chunk ID updates • Support entity renaming migrations • Normalize relation keys consistently • Preserve chunk references on edits (cherry picked from commit `3fbd704bf9`)	2025-12-04 19:11:19 +08:00
yangdx	488f67e5b2	Fix entity and relation chunk cleanup in deletion pipeline • Delete from entity_chunks storage • Delete from relation_chunks storage (cherry picked from commit `29bf593663`)	2025-12-04 19:11:19 +08:00
yangdx	cb5451faf8	Add entity/relation chunk tracking with configurable source ID limits - Add entity_chunks & relation_chunks storage - Implement KEEP/FIFO limit strategies - Update env.example with new settings - Add migration for chunk tracking data - Support all KV storage (cherry picked from commit `dc62c78f98`)	2025-12-04 19:11:19 +08:00
yangdx	7248e09fc4	Allow related chunks missing in knowledge graph queries (cherry picked from commit `35cd567c9e`)	2025-12-04 19:11:18 +08:00
yangdx	851b45f726	Add pipeline status lock function for legacy compatibility - Add get_pipeline_status_lock function - Return NamespaceLock for consistency - Support workspace parameter - Enable logging option - Legacy code compatibility (cherry picked from commit `93d445dfdd`)	2025-12-04 19:11:18 +08:00
yangdx	402d2f9a98	Fix namespace parsing when workspace contains colons • Use rsplit instead of split • Handle colons in workspace names (cherry picked from commit `f8dd2e0724`)	2025-12-04 19:11:18 +08:00
yangdx	6ba35f81df	Fix: auto-acquire pipeline when idle in document deletion • Track if we acquired the pipeline lock • Auto-acquire pipeline when idle • Only release if we acquired it • Prevent concurrent deletion conflicts • Improve deletion job validation (cherry picked from commit `4048fc4b89`)	2025-12-04 19:11:18 +08:00
yangdx	7e7c86601e	Improve workspace isolation tests with better parallelism checks and cleanup • Add finalize_share_data cleanup • Refactor lock timing measurement • Add timeline overlap validation • Include purpose/scope documentation • Fix tokenizer integration (cherry picked from commit `21ad990e36`)	2025-12-04 19:11:18 +08:00
yangdx	5febb88824	Fix missing workspace parameter in update flags status call (cherry picked from commit `1745b30a5f`)	2025-12-04 19:11:18 +08:00
yangdx	dc4c10c346	Fix NamespaceLock context variable timing to prevent lock bricking * Acquire lock before setting ContextVar * Prevent state corruption on cancellation * Fix permanent lock brick scenario * Store context only after success * Handle acquisition failure properly (cherry picked from commit `e8383df3b8`)	2025-12-04 19:11:17 +08:00
yangdx	87561f8b28	Remove manual initialize_pipeline_status() calls across codebase - Auto-init pipeline status in storages - Remove redundant import statements - Simplify initialization pattern - Update docs and examples (cherry picked from commit `cdd53ee875`)	2025-12-04 19:11:17 +08:00
yangdx	1e7bd654d8	Fix NamespaceLock concurrent coroutine safety with ContextVar - Use ContextVar for per-coroutine storage - Prevent state interference between coroutines - Add re-entrance protection check (cherry picked from commit `b6a5a90eaf`)	2025-12-04 19:11:17 +08:00
yangdx	f6a45245bd	Add pipeline status validation before document deletion (cherry picked from commit `9d7b7981ce`)	2025-12-04 19:11:17 +08:00
yangdx	94ae13a037	Refactor workspace handling to use default workspace and namespace locks - Remove DB-specific workspace configs - Add default workspace auto-setting - Replace global locks with namespace locks - Simplify pipeline status management - Remove redundant graph DB locking (cherry picked from commit `926960e957`)	2025-12-04 19:11:17 +08:00
yangdx	c01cfc3649	Fix workspace filtering logic in get_all_update_flags_status • Handle namespaces with/without prefixes • Fix workspace matching logic (cherry picked from commit `7ed0eac4c9`)	2025-12-04 19:11:16 +08:00
yangdx	50f8ddd933	Fix pipeline status namespace check to handle root case - Add check for bare "pipeline_status" - Handle namespace without prefix (cherry picked from commit `78689e8837`)	2025-12-04 19:11:16 +08:00
yangdx	dfab175c16	Fix workspace isolation for pipeline status across all operations - Fix final_namespace error in get_namespace_data() - Fix get_workspace_from_request return type - Add workspace param to pipeline status calls (cherry picked from commit `52c812b9a0`)	2025-12-04 19:11:16 +08:00
BukeLy	fe1576943f	fix: Add default workspace support for backward compatibility Fixes two compatibility issues in workspace isolation: 1. Problem: lightrag_server.py calls initialize_pipeline_status() without workspace parameter, causing pipeline to initialize in global namespace instead of rag's workspace. Solution: Add set_default_workspace() mechanism in shared_storage. LightRAG.initialize_storages() now sets default workspace, which initialize_pipeline_status() uses when called without parameters. 2. Problem: /health endpoint hardcoded to use "pipeline_status", cannot return workspace-specific status or support frontend workspace selection. Solution: Add LIGHTRAG-WORKSPACE header support. Endpoint now extracts workspace from header or falls back to server default, returning correct workspace-specific pipeline status. Changes: - lightrag/kg/shared_storage.py: Add set/get_default_workspace() - lightrag/lightrag.py: Call set_default_workspace() in initialize_storages() - lightrag/api/lightrag_server.py: Add get_workspace_from_request() helper, update /health endpoint to support LIGHTRAG-WORKSPACE header Testing: - Backward compatibility: Old code works without modification - Multi-instance safety: Explicit workspace passing preserved - /health endpoint: Supports both default and header-specified workspaces Related: #2353 (cherry picked from commit `18a4870229`)	2025-12-04 19:11:16 +08:00
BukeLy	f7b500bca2	feat: Add workspace isolation support for pipeline status Problem: In multi-tenant scenarios, different workspaces share a single global pipeline_status namespace, causing pipelines from different tenants to block each other, severely impacting concurrent processing performance. Solution: - Extended get_namespace_data() to recognize workspace-specific pipeline namespaces with pattern "{workspace}:pipeline" (following GraphDB pattern) - Added workspace parameter to initialize_pipeline_status() for per-tenant isolated pipeline namespaces - Updated all 7 call sites to use workspace-aware locks: * lightrag.py: process_document_queue(), aremove_document() * document_routes.py: background_delete_documents(), clear_documents(), cancel_pipeline(), get_pipeline_status(), delete_documents() Impact: - Different workspaces can process documents concurrently without blocking - Backward compatible: empty workspace defaults to "pipeline_status" - Maintains fail-fast: uninitialized pipeline raises clear error - Expected N× performance improvement for N concurrent tenants Bug fixes: - Fixed AttributeError by using self.workspace instead of self.global_config - Fixed pipeline status endpoint to show workspace-specific status - Fixed delete endpoint to check workspace-specific busy flag Code changes: 4 files, 141 insertions(+), 28 deletions(-) Testing: All syntax checks passed, comprehensive workspace isolation tests completed (cherry picked from commit `eb52ec94d7`)	2025-12-04 19:11:16 +08:00
yangdx	4cc6388742	Add auto-refresh of popular labels when pipeline completes • Monitor pipeline busy->idle transitions • Reload labels on dropdown open if needed • Add onBeforeOpen callback to AsyncSelect • Clear refresh flags after processing • Improve label sync with backend state (cherry picked from commit `58c83f9da5`)	2025-12-04 19:11:15 +08:00
yangdx	a7330f0b95	Remove redundant await call in file extraction pipeline (cherry picked from commit `c36afecba4`)	2025-12-04 19:11:15 +08:00
yangdx	537db072e0	Add Qdrant legacy collection migration with workspace support - Add QdrantMigrationError exception - Implement automatic data migration - Support workspace-based partitioning - Add migration verification logic - Update collection naming scheme (cherry picked from commit `5f4a280458`)	2025-12-04 19:11:15 +08:00
yangdx	46c13e23f0	Add confirmation dialog for pipeline cancellation (cherry picked from commit `81e3496aa4`)	2025-12-04 19:11:15 +08:00
yangdx	74d0a22020	Add pipeline cancellation feature with UI and i18n support - Add cancelPipeline API endpoint - Add cancel button to status dialog - Update status response type - Add cancellation UI translations - Handle cancellation request states (cherry picked from commit `f89b5ab101`)	2025-12-04 19:11:15 +08:00
yangdx	687d2b6b13	Improve error handling and add cancellation checks in pipeline (cherry picked from commit `77336e50b6`)	2025-12-04 19:11:15 +08:00
yangdx	a471f1ca0e	Add pipeline cancellation feature for graceful processing termination • Add cancel_pipeline API endpoint • Implement PipelineCancelledException • Add cancellation checks in main loop • Handle task cancellation gracefully • Mark cancelled docs as FAILED (cherry picked from commit `743aefc655`)	2025-12-04 19:11:15 +08:00
yangdx	37d48bafb6	Simplify skip logging and reduce pipeline status updates (cherry picked from commit `a5253244f9`)	2025-12-04 19:11:14 +08:00
yangdx	d56b4c856e	Fix trailing whitespace and update test mocking for rerank module • Remove trailing whitespace • Fix TiktokenTokenizer import patch • Add async context manager mocks • Update aiohttp.ClientSession patch • Improve test reliability (cherry picked from commit `561ba4e4b5`)	2025-12-04 19:11:14 +08:00
yangdx	f6c20faa16	Configure Dependabot schedule with specific times and timezone - Set Monday 2AM for GitHub Actions - Set Wednesday 2AM for Python deps - Set Friday 2AM for web UI deps - Use Asia/Shanghai timezone - Spread updates across weekdays (cherry picked from commit `6476021619`)	2025-12-04 19:11:14 +08:00
yangdx	a32d201f17	Refactor dependencies and add test extra in pyproject.toml • Pin httpx version in api extra • Extract test dependencies to new extra • Move httpx pin from evaluation to api • Add api dependency to evaluation extra • Separate test from evaluation concerns (cherry picked from commit `268e4ff6f1`)	2025-12-04 19:11:14 +08:00
yangdx	ea421295a6	Drop Python 3.10 and 3.11 from CI test matrix (cherry picked from commit `1f8751225d`)	2025-12-04 19:11:14 +08:00
yangdx	9068c629c6	Configure comprehensive Dependabot for Python and frontend dependencies - Add pip ecosystem with grouping - Add bun ecosystem for webui - Set weekly update schedule - Configure cooldown periods - Ignore numpy breaking changes (cherry picked from commit `0f19f80fdb`)	2025-12-04 19:11:13 +08:00
dependabot[bot]	a5ca3b13cc	Bump the github-actions group with 7 updates Bumps the github-actions group with 7 updates: \| Package \| From \| To \| \| --- \| --- \| --- \| \| [actions/checkout](https://github.com/actions/checkout) \| `2` \| `6` \| \| [actions/setup-python](https://github.com/actions/setup-python) \| `2` \| `6` \| \| [docker/build-push-action](https://github.com/docker/build-push-action) \| `5` \| `6` \| \| [oven-sh/setup-bun](https://github.com/oven-sh/setup-bun) \| `1` \| `2` \| \| [actions/upload-artifact](https://github.com/actions/upload-artifact) \| `4` \| `5` \| \| [actions/download-artifact](https://github.com/actions/download-artifact) \| `4` \| `6` \| \| [actions/stale](https://github.com/actions/stale) \| `9` \| `10` \| Updates `actions/checkout` from 2 to 6 - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](https://github.com/actions/checkout/compare/v2...v6) Updates `actions/setup-python` from 2 to 6 - [Release notes](https://github.com/actions/setup-python/releases) - [Commits](https://github.com/actions/setup-python/compare/v2...v6) Updates `docker/build-push-action` from 5 to 6 - [Release notes](https://github.com/docker/build-push-action/releases) - [Commits](https://github.com/docker/build-push-action/compare/v5...v6) Updates `oven-sh/setup-bun` from 1 to 2 - [Release notes](https://github.com/oven-sh/setup-bun/releases) - [Commits](https://github.com/oven-sh/setup-bun/compare/v1...v2) Updates `actions/upload-artifact` from 4 to 5 - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/v4...v5) Updates `actions/download-artifact` from 4 to 6 - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](https://github.com/actions/download-artifact/compare/v4...v6) Updates `actions/stale` from 9 to 10 - [Release notes](https://github.com/actions/stale/releases) - [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md) - [Commits](https://github.com/actions/stale/compare/v9...v10) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions - dependency-name: actions/setup-python dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions - dependency-name: docker/build-push-action dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions - dependency-name: oven-sh/setup-bun dependency-version: '2' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions - dependency-name: actions/upload-artifact dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions - dependency-name: actions/download-artifact dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions - dependency-name: actions/stale dependency-version: '10' dependency-type: direct:production update-type: version-update:semver-major dependency-group: github-actions ... Signed-off-by: dependabot[bot] <support@github.com> (cherry picked from commit `88357675ea`)	2025-12-04 19:11:13 +08:00
Christian Clauss	ae84bd8ff5	Keep GitHub Actions up to date with GitHub's Dependabot * [Keeping your software supply chain secure with Dependabot](https://docs.github.com/en/code-security/dependabot) * [Keeping your actions up to date with Dependabot](https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot) * [Configuration options for the `dependabot.yml` file - package-ecosystem](https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#package-ecosystem) To see all GitHub Actions dependencies, type: % `git grep 'uses: ' .github/workflows/` (cherry picked from commit `90e38c20ca`)	2025-12-04 19:11:13 +08:00
yangdx	97a2c8e8b9	Add ruff as dependency to pytest and evaluation extras (cherry picked from commit `5f91063c7a`)	2025-12-04 19:11:13 +08:00
yangdx	322ff19f72	Remove ascii_colors dependency and fix stream handling errors • Remove ascii_colors.trace_exception calls • Add SafeStreamHandler for closed streams • Patch ascii_colors console handler • Prevent ValueError on stream close • Improve logging error handling (cherry picked from commit `0fb2925c6a`)	2025-12-04 19:11:13 +08:00
yangdx	fd76e0f7ce	Enhance workspace isolation test with distinct mock data and persistence • Use different mock LLM per workspace • Add persistent test directory • Create workspace-specific responses • Skip cleanup for inspection (cherry picked from commit `99262adaaa`)	2025-12-04 19:11:13 +08:00
yangdx	4da291468d	Rename test classes to prevent warning from pytest • TestResult → ExecutionResult • TestStats → ExecutionStats • Update class docstrings • Update type hints • Update variable references (cherry picked from commit `7e9c8ed1e8`)	2025-12-04 19:11:12 +08:00
yangdx	60520e0188	test: add concurrent execution to workspace isolation test • Add async sleep to mock functions • Test concurrent ainsert operations • Use asyncio.gather for parallel exec • Measure concurrent execution time (cherry picked from commit `6ae0c14438`)	2025-12-04 19:11:12 +08:00
yangdx	9cf1629117	Add pre-commit to pytest dependencies and format test code • Add pre-commit to pytest extra deps • Update lock file dependencies (cherry picked from commit `5da82bb096`)	2025-12-04 19:11:12 +08:00
yangdx	668b842862	Standardize test directory creation and remove tempfile dependency • Remove unused tempfile import • Use consistent project temp/ structure • Clean up existing directories first • Create directories with os.makedirs • Use descriptive test directory names (cherry picked from commit `4fef731f37`)	2025-12-04 19:11:12 +08:00
yangdx	660ccc7ada	Add GitHub CI workflow and test markers for offline/integration tests - Add GitHub Actions workflow for CI - Mark integration tests requiring services - Add offline test markers for isolated tests - Skip integration tests by default - Configure pytest markers and collection (cherry picked from commit `4ea2124001`)	2025-12-04 19:11:12 +08:00
yangdx	a6fc87d50e	Replace pytest group reference with explicit dependencies in evaluation • Remove pytest group dependency • Add explicit pytest>=8.4.2 • Add pytest-asyncio>=1.2.0 • Add pre-commit directly • Fix potential circular dependency (cherry picked from commit `472b498ade`)	2025-12-04 19:11:12 +08:00
yangdx	d790a660cd	Fix test to use default workspace parameter behavior (cherry picked from commit `41bf6d0283`)	2025-12-04 19:11:12 +08:00
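
Commit 402d2f9a98 in the log above fixes namespace parsing for workspace names that contain colons by using `rsplit` instead of `split`. The following is a minimal sketch of that idea, assuming a `"{workspace}:{namespace}"` layout like the one described in f7b500bca2. The helper name `parse_final_namespace` is hypothetical and is not taken from the LightRAG codebase.

```python
def parse_final_namespace(final_namespace: str) -> tuple[str, str]:
    """Split "workspace:namespace" into (workspace, namespace).

    Hypothetical helper for illustration only; the real LightRAG code
    may be organized differently.
    """
    if ":" not in final_namespace:
        # Bare namespace such as "pipeline_status": no workspace prefix.
        return "", final_namespace
    # rsplit with maxsplit=1 cuts at the LAST colon, so any colons inside
    # the workspace name remain part of the workspace.
    workspace, namespace = final_namespace.rsplit(":", 1)
    return workspace, namespace
```

With `split(":", 1)` the cut would happen at the first colon instead, so a workspace such as `tenant:a` would be truncated to `tenant` and the remainder would be misread as the namespace.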

5362 commits
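
Commit 349c1945db mentions parallel evaluation using `asyncio.Semaphore` with a limit taken from the `MAX_ASYNC` environment variable. The following is a minimal sketch of that general pattern, not the project's evaluation code. The names `run_bounded` and `fake_evaluate` are hypothetical, and the fallback limit of 4 is an assumption for illustration.

```python
import asyncio
import os


async def run_bounded(coros, limit: int):
    """Run coroutines concurrently, with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(coro):
        # Each coroutine waits for a free slot before it starts running.
        async with semaphore:
            return await coro

    # gather returns results in the same order as the input coroutines.
    return await asyncio.gather(*(guarded(c) for c in coros))


async def fake_evaluate(case_id: int) -> int:
    """Hypothetical stand-in for one evaluation call."""
    await asyncio.sleep(0.01)
    return case_id * 2


if __name__ == "__main__":
    # Assumed default of 4; clamp to 1 so a zero value cannot block forever.
    limit = max(1, int(os.getenv("MAX_ASYNC", "4")))
    results = asyncio.run(run_bounded([fake_evaluate(i) for i in range(8)], limit))
    print(results)
```

Bounding concurrency with a semaphore keeps all tasks scheduled through a single `gather` call while limiting how many requests are in flight at any moment.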