- Add KaTeX extensions (mhchem for chemistry, copy-tex for copying)
- Add CASCADE to AGE extension for PostgreSQL
- Remove `future` dependency, replace passlib with bcrypt
- Fix Jina embedding configuration and provider defaults
- Update gunicorn help text and bump API version to 0258
- Documentation and README updates
Add citation tracking and display system across backend and frontend components.
Backend changes include citation.py for document attribution, enhanced query routes
with citation metadata, improved prompt templates, and PostgreSQL schema updates.
Frontend includes CitationMarker component, HoverCard UI, QuerySettings refinements,
and ChatMessage enhancements for displaying document sources. Update dependencies
and docker-compose test configuration for improved development workflow.
Graph Connectivity Awareness:
- Add db_degree property to all KG implementations (NetworkX, Postgres, Neo4j, Mongo, Memgraph); see the sketch after this list
- Show database degree vs visual degree in node panel with amber badge
- Add visual indicator (amber border) for nodes with hidden connections
- Add "Load X hidden connection(s)" button to expand hidden neighbors
- Add configurable "Expand Depth" setting (1-5) in graph settings
- Use global maxNodes setting for node expansion consistency
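A minimal sketch of the idea, assuming a NetworkX-backed store; apart from the db_degree field name, the function and payload shape here are illustrative, not LightRAG's actual API:

```python
# Sketch only: surface the database-side degree next to the degree
# visible in the rendered subgraph, so the UI can flag hidden neighbors.
import networkx as nx

def build_node_payload(full_graph: nx.Graph, visible_graph: nx.Graph, node_id: str) -> dict:
    db_degree = full_graph.degree(node_id)         # true degree in the database
    visual_degree = visible_graph.degree(node_id)  # degree among loaded nodes only
    return {
        "id": node_id,
        "db_degree": db_degree,
        # the frontend renders the amber badge/border and the
        # "Load X hidden connection(s)" button from this comparison
        "hidden_connections": max(db_degree - visual_degree, 0),
    }
```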
Orphan Connection UI:
- Add OrphanConnectionDialog component for manual orphan entity connection
- Add OrphanConnectionControl button in graph sidebar
- Expose /graph/orphans/connect API endpoint for frontend use
Backend Improvements:
- Add get_orphan_entities() and connect_orphan_entities() to base storage (sketched below)
- Add orphan connection configuration parameters
- Improve entity extraction with relationship density requirements
Frontend:
- Add graphExpandDepth and graphIncludeOrphans to settings store
- Add min_degree and include_orphans graph filtering parameters
- Update translations (en.json, zh.json)
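A sketch of what the two base-storage hooks might look like; the base class name and signatures are assumptions, not the actual LightRAG interface:

```python
from abc import ABC, abstractmethod


class BaseGraphStorage(ABC):
    @abstractmethod
    async def get_orphan_entities(self) -> list[str]:
        """Return IDs of entities that currently have no relationships."""

    @abstractmethod
    async def connect_orphan_entities(
        self, similarity_threshold: float = 0.8, cross_connect: bool = False
    ) -> int:
        """Connect orphans to similar entities; return the number of edges added."""
```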
Implement automatic orphan entity connection system that identifies entities with
no relationships and creates meaningful connections via vector similarity + LLM
validation. This improves knowledge graph connectivity and retrieval quality.
Changes:
- Add orphan connection configuration parameters (thresholds, cross-connect settings)
- Implement aconnect_orphan_entities() method with 4-step validation pipeline (sketched below)
- Add SQL templates for efficient orphan and candidate entity queries
- Create POST /graph/orphans/connect API endpoint with configurable parameters
- Add orphan connection validation prompt for LLM-based relationship verification
- Include relationship density requirement in extraction prompts to prevent orphans
- Update docker-compose.test.yml with optimized extraction parameters
- Add quality validation test suite (run_quality_tests.py) for retrieval evaluation
- Add unit test framework (test_orphan_connection_quality.py) with test cases
- Enable auto-run of orphan connection after document processing
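One plausible shape of the 4-step pipeline, sketched with hypothetical helpers passed in as parameters; find_orphans, top_k_similar, validate_relation, and add_edge are stand-ins, not the real storage or LLM calls:

```python
# Hedged sketch of the 4-step flow: find orphans, fetch vector-similar
# candidates, validate each candidate with the LLM, write the edge.
async def aconnect_orphan_entities(storage, embed, validate_relation,
                                   threshold: float = 0.8) -> int:
    connected = 0
    orphans = await storage.find_orphans()              # 1. entities with no edges
    for orphan in orphans:
        vector = await embed(orphan["description"])
        candidates = await storage.top_k_similar(       # 2. vector-similar candidates
            vector, k=5, min_score=threshold
        )
        for candidate in candidates:
            relation = await validate_relation(orphan, candidate)  # 3. LLM validation
            if relation is not None:
                await storage.add_edge(orphan["id"], candidate["id"], relation)  # 4. write
                connected += 1
                break  # connect each orphan once, then move on
    return connected
```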
Previously, configure_vchordrq would fail silently when probes was empty
(the default), preventing epsilon from being configured. Now each parameter
is handled independently with conditional execution, and configuration
errors fail-fast instead of being swallowed.
This fixes the documented epsilon setting, which was previously
impossible to use in the default configuration.
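A minimal sketch of the fixed control flow, assuming an asyncpg-style connection; the GUC names follow the VectorChord docs, everything else is illustrative:

```python
# Sketch only: each parameter is applied independently, and errors
# propagate (fail-fast) instead of being swallowed by a broad except.
async def configure_vchordrq(conn, probes: str = "", epsilon: float | None = None) -> None:
    if probes:  # empty string (the default) means "leave unset"
        await conn.execute(f"SET vchordrq.probes = '{probes}'")
    if epsilon is not None:  # no longer gated behind probes being set
        await conn.execute(f"SET vchordrq.epsilon = {epsilon}")
```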
- Add _default_workspace to global vars
- Set _default_workspace to None on cleanup
- Ensure complete resource cleanup
- Fix missing workspace finalization
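A tiny sketch of the cleanup; finalize_share_data() is an assumed teardown hook, while _default_workspace is the variable named above:

```python
_default_workspace: str | None = None

def finalize_share_data() -> None:
    # Reset module-level state so a re-initialized instance cannot
    # observe a stale default workspace from a previous run.
    global _default_workspace
    _default_workspace = None
```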
* Acquire lock before setting ContextVar
* Prevent state corruption on cancellation
* Fix scenario where the lock could be left permanently stuck
* Store context only after success
* Handle acquisition failure properly
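A sketch of the corrected ordering with illustrative names (not the actual shared_storage internals): acquire first, publish to the ContextVar only after success, restore state on release:

```python
import asyncio
from contextvars import ContextVar
from typing import Optional

_lock = asyncio.Lock()
_lock_ctx: ContextVar[Optional[object]] = ContextVar("_lock_ctx", default=None)

async def acquire_workspace_lock() -> None:
    # If this await is cancelled, nothing has been published yet, so the
    # lock cannot be left stuck with a stale holder recorded.
    await _lock.acquire()
    # Store context only after acquisition succeeded.
    _lock_ctx.set(asyncio.current_task())

def release_workspace_lock() -> None:
    _lock_ctx.set(None)
    _lock.release()
```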
• Replace truthy checks with `is not None`
• Handle empty dict edge case properly
• Prevent data reload failures
• Add comprehensive test coverage
• Fix JsonKVStorage and DocStatusStorage
• Reload cleaned data after sanitization
• Update shared memory with clean data
• Add specific surrogate char tests
• Test migration sanitization flow
• Prevent dirty data in memory
- Fast path for clean data (no sanitization)
- Slow path sanitizes during encoding
- Reload shared memory after sanitization
- Custom encoder avoids deep copies
- Comprehensive test coverage
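A sketch of the two-path idea, assuming hypothetical function names; the key point is that sanitization happens on the serialized text, so clean data pays no extra cost and dirty data is never deep-copied:

```python
import json

def dumps_sanitized(data) -> str:
    text = json.dumps(data, ensure_ascii=False)
    try:
        text.encode("utf-8")  # fast path: already clean, no extra work
        return text
    except UnicodeEncodeError:
        # slow path: replace lone surrogates during encoding; this operates
        # on the serialized string, so the original object is never copied
        return text.encode("utf-8", errors="replace").decode("utf-8")

def sanitize_and_reload(data):
    # parse the cleaned text back so shared memory holds clean data
    return json.loads(dumps_sanitized(data))
```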
Fixes two compatibility issues in workspace isolation:
1. Problem: lightrag_server.py calls initialize_pipeline_status()
without a workspace parameter, causing the pipeline to initialize in
the global namespace instead of the rag instance's workspace.
Solution: Add set_default_workspace() mechanism in shared_storage.
LightRAG.initialize_storages() now sets default workspace, which
initialize_pipeline_status() uses when called without parameters.
2. Problem: The /health endpoint is hardcoded to use "pipeline_status",
so it cannot return workspace-specific status or support frontend
workspace selection.
Solution: Add LIGHTRAG-WORKSPACE header support. Endpoint now
extracts workspace from header or falls back to server default,
returning correct workspace-specific pipeline status.
Changes:
- lightrag/kg/shared_storage.py: Add set/get_default_workspace() (sketched at the end of this entry)
- lightrag/lightrag.py: Call set_default_workspace() in initialize_storages()
- lightrag/api/lightrag_server.py: Add get_workspace_from_request() helper,
update /health endpoint to support LIGHTRAG-WORKSPACE header
Testing:
- Backward compatibility: Old code works without modification
- Multi-instance safety: Explicit workspace passing preserved
- /health endpoint: Supports both default and header-specified workspaces
Related: #2353
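A minimal sketch of the mechanism; set_default_workspace(), get_workspace_from_request(), and the LIGHTRAG-WORKSPACE header come from this entry, while the function bodies are assumptions:

```python
from fastapi import Request

_default_workspace: str = ""

def set_default_workspace(workspace: str) -> None:
    global _default_workspace
    _default_workspace = workspace

def get_default_workspace() -> str:
    return _default_workspace

async def initialize_pipeline_status(workspace: str | None = None) -> None:
    # Callers that pass nothing (e.g. lightrag_server.py) now pick up the
    # default set by LightRAG.initialize_storages().
    ws = workspace if workspace is not None else get_default_workspace()
    ...  # initialize the pipeline namespace for ws

def get_workspace_from_request(request: Request) -> str:
    # Header takes priority; fall back to the server default.
    return request.headers.get("LIGHTRAG-WORKSPACE") or get_default_workspace()
```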
Problem:
In multi-tenant scenarios, different workspaces share a single global
pipeline_status namespace, causing pipelines from different tenants to
block each other, severely impacting concurrent processing performance.
Solution:
- Extended get_namespace_data() to recognize workspace-specific pipeline
namespaces with pattern "{workspace}:pipeline" (following the GraphDB
pattern; see the sketch at the end of this entry)
- Added workspace parameter to initialize_pipeline_status() for per-tenant
isolated pipeline namespaces
- Updated all 7 call sites to use workspace-aware locks:
* lightrag.py: process_document_queue(), aremove_document()
* document_routes.py: background_delete_documents(), clear_documents(),
cancel_pipeline(), get_pipeline_status(), delete_documents()
Impact:
- Different workspaces can process documents concurrently without blocking
- Backward compatible: empty workspace defaults to "pipeline_status"
- Maintains fail-fast: uninitialized pipeline raises clear error
- Expected N× performance improvement for N concurrent tenants
Bug fixes:
- Fixed AttributeError by using self.workspace instead of self.global_config
- Fixed pipeline status endpoint to show workspace-specific status
- Fixed delete endpoint to check workspace-specific busy flag
Code changes: 4 files, 141 insertions(+), 28 deletions(-)
Testing: All syntax checks passed, comprehensive workspace isolation tests completed
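A sketch of the naming scheme described above; the wrapper function is illustrative, not the real get_namespace_data() internals:

```python
def pipeline_namespace(workspace: str = "") -> str:
    # Empty workspace keeps the legacy global name, so single-tenant
    # deployments remain backward compatible.
    return f"{workspace}:pipeline" if workspace else "pipeline_status"
```

With this pattern, locks and status dictionaries are keyed per tenant, so two workspaces never contend for the same pipeline entry.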
• Remove premature ID normalization
• Add lookup mapping for node resolution
• Filter results by requested nodes only
• Improve error logging with workspace
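A sketch of the lookup-mapping approach named in this list, with illustrative names:

```python
def resolve_nodes(requested_ids: list[str], rows: list[dict]) -> list[dict]:
    # Key the map by the ID exactly as stored -- normalizing too early
    # previously caused lookups to miss.
    by_id = {row["id"]: row for row in rows}
    # Return only the nodes the caller actually asked for, in order.
    return [by_id[node_id] for node_id in requested_ids if node_id in by_id]
```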