Commit graph

5275 commits

Author SHA1 Message Date
yangdx
155cb2a1d2 Expand AGENTS.md with testing controls and automation guidelines
- Add pytest marker and CLI toggle docs
- Document automation workflow rules
- Clarify integration test setup
- Add agent-specific best practices
- Update testing command examples

(cherry picked from commit 5cc916861f)
2025-12-04 19:09:06 +08:00
wmsnp
ae5cd9262b fix: add logger to configure_vchordrq() and format code
(cherry picked from commit f4bf5d279c)
2025-12-04 19:09:06 +08:00
wmsnp
3954bb6579 feat(postgres_impl): add vchordrq vector index support and unify vector index creation logic
(cherry picked from commit d07023c962)
2025-12-04 19:09:06 +08:00
yangdx
1cbe0ba885 Reduce log level and improve workspace mismatch message clarity
• Change warning to info level
• Simplify workspace mismatch wording

(cherry picked from commit 6cef8df159)
2025-12-04 19:09:06 +08:00
yangdx
0ac858d3e2 fix(postgres): allow vchordrq.epsilon config when probes is empty
Previously, configure_vchordrq would fail silently when probes was empty
(the default), preventing epsilon from being configured. Now each parameter
is handled independently with conditional execution, and configuration
errors fail-fast instead of being swallowed.

This fixes the documented epsilon setting being impossible to use in the
default configuration.

(cherry picked from commit 3096f844fb)
2025-12-04 19:09:06 +08:00
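Note on 0ac858d3e2: a minimal sketch of the "each parameter is handled independently" idea, not the actual `configure_vchordrq` code. The helper name `build_vchordrq_settings` and the exact `SET` syntax are assumptions made for illustration; only the setting names `probes` and `epsilon` come from the commit message.

```python
from typing import List, Optional


def build_vchordrq_settings(probes: Optional[str], epsilon: Optional[float]) -> List[str]:
    """Hypothetical helper: build one SET statement per configured parameter."""
    statements: List[str] = []
    if probes:
        # An empty probes value (the default) is skipped, but it no longer
        # prevents epsilon from being applied below.
        parts = [int(p) for p in str(probes).split(",")]  # ValueError on bad input
        statements.append("SET vchordrq.probes = '" + ",".join(map(str, parts)) + "'")
    if epsilon is not None:
        # float() raises on bad input, so errors surface instead of being swallowed.
        statements.append(f"SET vchordrq.epsilon = {float(epsilon)}")
    return statements
```

With an empty `probes`, the helper still emits the epsilon statement, which is the case the commit describes as previously impossible.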
yangdx
5bd1320a1d Refactor storage classes to use namespace instead of final_namespace
(cherry picked from commit fd486bc922)
2025-12-04 19:09:06 +08:00
yangdx
ed46d375fb Auto-initialize pipeline status in LightRAG.initialize_storages()
• Remove manual initialize_pipeline_status calls
• Auto-init in initialize_storages method
• Update error messages for clarity
• Warn on workspace conflicts

(cherry picked from commit e22ac52ebc)
2025-12-04 19:09:05 +08:00
yangdx
961c87a6e5 Standardize empty workspace handling from "_" to "" across storage
* Unify empty workspace behavior by changing workspace from "_" to ""
* Fixed incorrect empty workspace detection in get_all_update_flags_status()

(cherry picked from commit d54d0d55d9)
2025-12-04 19:09:05 +08:00
yangdx
6b0c0ef815 Refactor namespace lock to support reusable async context manager
• Add NamespaceLock class wrapper
• Fix lock re-entrance issues
• Enable concurrent lock usage
• Fresh context per async with block
• Update get_namespace_lock API

(cherry picked from commit 7deb9a64b9)
2025-12-04 19:09:05 +08:00
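Note on 6b0c0ef815: a rough sketch of a reusable lock wrapper with the behavior the log describes (the same instance is usable from concurrent coroutines, and re-entrance in one coroutine raises `RuntimeError`). The class name `NamespaceLockSketch` and its internals are assumptions; the real `NamespaceLock` may be built quite differently.

```python
import asyncio


class NamespaceLockSketch:
    """Hypothetical wrapper around asyncio.Lock, reusable across `async with` blocks."""

    def __init__(self, namespace: str) -> None:
        self._namespace = namespace
        self._lock = asyncio.Lock()
        self._owner = None  # task currently holding the lock, if any

    async def __aenter__(self) -> "NamespaceLockSketch":
        task = asyncio.current_task()
        if self._owner is task:
            # Waiting here would deadlock the task on itself, so fail loudly.
            raise RuntimeError(f"lock for {self._namespace!r} is already held by this task")
        await self._lock.acquire()
        self._owner = task
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        self._owner = None
        self._lock.release()
```

Tracking the owning task is one simple way to detect re-entrance; other coroutines that share the instance just wait for their turn.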
yangdx
708f80f43d Add _default_workspace to shared storage finalization
- Add _default_workspace to global vars
- Set _default_workspace to None on cleanup
- Ensure complete resource cleanup
- Fix missing workspace finalization

(cherry picked from commit 6d6716e9f8)
2025-12-04 19:09:05 +08:00
BukeLy
c52c1aea69 test: Enhance workspace isolation test suite to 100% coverage
Why this enhancement is needed:
The initial test suite covered the 4 core scenarios from PR #2366, but
lacked comprehensive coverage of edge cases and implementation details.
This update adds 5 additional test scenarios to achieve complete validation
of the workspace isolation feature.

What was added:
Test 5 - NamespaceLock Re-entrance Protection (2 sub-tests):
  - Verifies re-entrance in same coroutine raises RuntimeError
  - Confirms same NamespaceLock instance works in concurrent coroutines

Test 6 - Different Namespace Lock Isolation:
  - Validates locks with same workspace but different namespaces are independent

Test 7 - Error Handling (2 sub-tests):
  - Tests None workspace conversion to empty string
  - Validates empty workspace creates correct namespace format

Test 8 - Update Flags Workspace Isolation (3 sub-tests):
  - set_all_update_flags isolation between workspaces
  - clear_all_update_flags isolation between workspaces
  - get_all_update_flags_status workspace filtering

Test 9 - Empty Workspace Standardization (2 sub-tests):
  - Empty workspace namespace format verification
  - Empty vs non-empty workspace independence

Test Results:
All 19 test cases passed (previously 9/9, now 19/19)
- 4 core PR requirements: 100% coverage
- 5 additional scenarios: 100% coverage
- Total coverage: 100% of workspace isolation implementation

Testing approach improvements:
- Proper initialization of update flags using get_update_flag()
- Correct handling of flag objects (.value property)
- Updated error handling tests to match actual implementation behavior
- All edge cases and boundary conditions validated

Impact:
Provides complete confidence in the workspace isolation feature with
comprehensive test coverage of all implementation details, edge cases,
and error handling paths.

(cherry picked from commit 436e41439e)
2025-12-04 19:09:05 +08:00
yangdx
67007ed9a6 Improve LightRAG initialization checker tool with better usage docs
• Add workspace param to get_namespace_data
• Update docstring with proper usage example
• Simplify demo to show correct workflow
• Remove confusing before/after comparison
• Clarify tool should run after init

(cherry picked from commit 393f880311)
2025-12-04 19:09:05 +08:00
yangdx
dcf88a8273 Refactor exception handling in MemgraphStorage label methods
(cherry picked from commit 4401f86f07)
2025-12-04 19:09:04 +08:00
yangdx
ed79218550 Optimize JSON write with fast/slow path to reduce memory usage
- Fast path for clean data (no sanitization)
- Slow path sanitizes during encoding
- Reload shared memory after sanitization
- Custom encoder avoids deep copies
- Comprehensive test coverage

(cherry picked from commit 777c987371)
2025-12-04 19:09:04 +08:00
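Note on ed79218550: a small sketch of a fast/slow path for JSON encoding. It assumes that "sanitization" means replacing characters that cannot be encoded as UTF-8 (for example lone surrogates); the commit message does not say this, and the function name `dumps_utf8` is hypothetical. The real change uses a custom encoder, which this sketch does not reproduce.

```python
import json
from typing import Any, Tuple


def dumps_utf8(data: Any) -> Tuple[bytes, bool]:
    """Return (payload, sanitized). Hypothetical helper, not the project's API."""
    text = json.dumps(data, ensure_ascii=False)
    try:
        # Fast path: clean data encodes directly, no extra pass over the input.
        return text.encode("utf-8"), False
    except UnicodeEncodeError:
        # Slow path: replace unencodable characters while encoding, so the
        # original data structure is never deep-copied.
        return text.encode("utf-8", errors="replace"), True
```

The boolean lets a caller know the stored bytes differ from the in-memory data, which is when a reload of the in-memory copy would be needed.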
yangdx
7632805cd0 Add concurrency warning for JsonKVStorage in cleanup tool
(cherry picked from commit 913fa1e415)
2025-12-04 19:09:04 +08:00
yangdx
db508954d1 Add uv package manager support to installation docs
(cherry picked from commit 7bc6ccea19)
2025-12-04 19:09:04 +08:00
yangdx
1daf35a77d Refactor storage selection UI with dynamic numbering and inline prompts
• Remove standalone get_user_choice method
• Add dynamic sequential numbering
• Inline choice validation logic
• Remove redundant storage type prints
• Improve excluded storage handling

(cherry picked from commit e95b02fb55)
2025-12-04 19:09:03 +08:00
yangdx
8f5ec484e3 Add async generator lock management rule to cline extension
(cherry picked from commit b72632e4d4)
2025-12-04 19:09:03 +08:00
yangdx
fa5510e6f6 Fix deadlock in JSON cache migration and prevent same storage selection
- Snapshot JSON data before yielding batches
- Release lock during batch processing
- Exclude source type from target selection
- Add detailed docstring for lock behavior
- Filter available storage types properly

(cherry picked from commit 5be04263b2)
2025-12-04 19:09:03 +08:00
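Note on fa5510e6f6: a minimal sketch of the "snapshot before yielding, release lock during batch processing" pattern. The function name `iter_batches` and the dict-based store are assumptions for illustration only.

```python
import asyncio
from typing import AsyncIterator, Dict, List, Tuple


async def iter_batches(
    store: Dict[str, object], lock: asyncio.Lock, batch_size: int
) -> AsyncIterator[List[Tuple[str, object]]]:
    """Hypothetical async generator that never holds the lock across a yield."""
    async with lock:
        # Copy the items while holding the lock, then release it.
        snapshot = list(store.items())
    for start in range(0, len(snapshot), batch_size):
        # The consumer may take the same lock while handling this batch.
        yield snapshot[start:start + batch_size]


async def demo() -> list:
    lock = asyncio.Lock()
    seen = []
    async for batch in iter_batches({"a": 1, "b": 2, "c": 3}, lock, 2):
        async with lock:  # would deadlock if the generator still held the lock
            seen.append(batch)
    return seen
```

Holding a non-reentrant lock across `yield` is what makes this kind of generator deadlock-prone: the consumer runs while the generator is suspended.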
yangdx
5a5e583b9c Improve storage config validation and add config.ini fallback support
• Add MongoDB env requirements
• Support config.ini fallback
• Warn on missing env vars
• Check available storage count
• Show config source info

(cherry picked from commit 1a91bcdb5f)
2025-12-04 19:09:03 +08:00
domices
45c10d7f22 Fix spelling errors in the "使用PostgreSQL存储" section of README-zh.md
(cherry picked from commit 5c0ced6e4a)
2025-12-04 19:09:03 +08:00
yangdx
d1ab42bb36 Translate graph storage test from Chinese to English
(cherry picked from commit f3b2ba8152)
2025-12-04 19:09:03 +08:00
yangdx
cea34d6691 Initialize shared storage for all graph storage types in graph unit test
(cherry picked from commit 36501b82f5)
2025-12-04 19:09:03 +08:00
yangdx
7896c42fba Restructure semaphore control to manage entire evaluation pipeline
• Move rag_semaphore to wrap full function
• Increase RAG concurrency to 2x eval limit
• Prevent memory buildup from slow evals
• Keep eval_semaphore for RAGAS control

(cherry picked from commit e5abe9dd3d)
2025-12-04 19:09:02 +08:00
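Note on 7896c42fba: a sketch of the semaphore layout the message describes, with an outer semaphore around the whole per-case function and an inner one around the scoring step. The names `evaluate_case`, `run_all`, `generate`, and `score` are hypothetical stand-ins for the real pipeline functions.

```python
import asyncio
from typing import Awaitable, Callable, List


async def evaluate_case(case, rag_sem, eval_sem, generate, score):
    # The outer semaphore covers generation and evaluation together, so
    # finished answers cannot pile up while waiting for slow evaluations.
    async with rag_sem:
        answer = await generate(case)
        async with eval_sem:
            return await score(case, answer)


async def run_all(
    cases: List,
    eval_limit: int,
    generate: Callable[..., Awaitable],
    score: Callable[..., Awaitable],
) -> List:
    rag_sem = asyncio.Semaphore(eval_limit * 2)  # RAG concurrency at 2x the eval limit
    eval_sem = asyncio.Semaphore(eval_limit)
    tasks = (evaluate_case(c, rag_sem, eval_sem, generate, score) for c in cases)
    return await asyncio.gather(*tasks)
```

At most `eval_limit * 2` cases are in flight at once, and at most `eval_limit` of them are in the scoring step.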
yangdx
c459caed26 Implement two-stage pipeline for RAG evaluation with separate semaphores
• Split RAG gen and eval stages
• Add rag_semaphore for stage 1
• Add eval_semaphore for stage 2
• Improve concurrency control
• Update connection pool limits

(cherry picked from commit 83715a3ac1)
2025-12-04 19:09:02 +08:00
ben moussa anouar
dd425e5513 Update lightrag/evaluation/eval_rag_quality.py for language
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit 98f0464a31)
2025-12-04 19:09:02 +08:00
yangdx
407a2c2ecd Remove redundant shutdown message from gunicorn
(cherry picked from commit 6d4a55100e)
2025-12-04 19:09:02 +08:00
yangdx
df2c24264f Improve entity merge logging by removing redundant message and fixing typo
(cherry picked from commit 9a8742da59)
2025-12-04 19:09:02 +08:00
yangdx
cd0cd99062 Include static files in package distribution
- Add static dir to MANIFEST.in
- Update package data config
- Ensure static assets are bundled
- Fix missing static file issue

(cherry picked from commit 16d3d82a0e)
2025-12-04 19:09:02 +08:00
yangdx
2d85e9f2f8 Fix swagger docs page problem in dev mode
- Add /static to VITE_API_ENDPOINTS
- Update proxy rewrite rules
- Include static file serving
- Sync sample env file

(cherry picked from commit ee7c683fa7)
2025-12-04 19:09:02 +08:00
yangdx
8c7b0017df Remove enable_logging parameter from get_data_init_lock call in MilvusVectorDBStorage
(cherry picked from commit 0692175c7b)
2025-12-04 19:09:01 +08:00
Anush008
e86aa091f4 refactor: Qdrant Multi-tenancy (Include staged)
Signed-off-by: Anush008 <anushshetty90@gmail.com>
(cherry picked from commit 8584980e3a)
2025-12-04 19:09:01 +08:00
yangdx
a42222d7f9 Resolve lock leakage issue during user cancellation handling
• Change default log level to INFO
• Force enable error logging output
• Add lock cleanup rollback protection
• Handle LLM cache persistence errors
• Fix async task exception handling

(cherry picked from commit a9ec15e669)
2025-12-04 19:09:01 +08:00
yangdx
8b6fdef965 Optimize PostgreSQL graph queries to avoid Cypher overhead and complexity
• Replace Cypher with native SQL queries
• Fix O(N²) to O(E) performance issue
• Add error handling for parse failures
• Use direct table access pattern
• Eliminate Cartesian product joins

(cherry picked from commit a97e5dad4c)
2025-12-04 19:09:01 +08:00
yangdx
ec9b4862d0 Simplify pipeline status dialog by consolidating message sections
• Remove separate latest message section
• Combine into single pipeline messages area
• Add overflow-x-hidden for better display
• Change break-words to break-all
• Update translations across all locales

(cherry picked from commit 2476d6b7f8)
2025-12-04 19:09:01 +08:00
yangdx
e4be3549c3 Improve entity identifier truncation warning message format
(cherry picked from commit 00aa5e53a7)
2025-12-04 19:09:00 +08:00
Yasiru Rangana
8a72135a32 Optimize PostgreSQL initialization performance
- Batch index existence checks into single query (16+ queries -> 1 query)
- Batch timestamp column checks into single query (8 queries -> 1 query)
- Batch field length checks into single query (5 queries -> 1 query)

Performance improvement: ~70-80% faster initialization (35s -> 5-10s)

Key optimizations:
1. check_tables(): Use ANY($1) to check all indexes at once
2. _migrate_timestamp_columns(): Batch all column type checks
3. _migrate_field_lengths(): Batch all field definition checks

All changes are backward compatible with no schema or API changes.
Reduces database round-trips by batching information_schema queries.

(cherry picked from commit 2f22336ace)
2025-12-04 19:09:00 +08:00
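Note on 8a72135a32: a sketch of batching existence checks into one round-trip with `ANY($1)`. The helper name `find_missing_indexes` is hypothetical, and `fetch` is assumed to behave like asyncpg's `Connection.fetch(query, *args)`; the actual query in `check_tables()` may differ.

```python
from typing import Awaitable, Callable, List, Sequence


async def find_missing_indexes(
    fetch: Callable[..., Awaitable[Sequence]], index_names: List[str]
) -> List[str]:
    """Hypothetical helper: one query for all names instead of one query per name."""
    rows = await fetch(
        "SELECT indexname FROM pg_indexes WHERE indexname = ANY($1)",
        list(index_names),
    )
    existing = {row["indexname"] for row in rows}
    # Preserve the caller's order when reporting what still has to be created.
    return [name for name in index_names if name not in existing]
```

Passing the names as a single array parameter keeps the statement fixed regardless of how many indexes are checked.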
yangdx
c2620efc5e Update truncation message format in properties tooltip
(cherry picked from commit 019dff5248)
2025-12-04 19:09:00 +08:00
yangdx
3780addc4c Fix logging message formatting
(cherry picked from commit e0fd31a60d)
2025-12-04 19:09:00 +08:00
Lucky Verma
12ebc9f2a9 Refactor SQL queries and improve input handling in PGKVStorage and PGDocStatusStorage
(cherry picked from commit 917e41aa78)
2025-12-04 19:09:00 +08:00
Won-Kyu Park
f4d6fcbe91 remove deprecated dotenv package.
(cherry picked from commit 532400412e)
2025-12-04 19:09:00 +08:00
yangdx
c80a9d6ef0 Remove docling dependency and related packages from project
* Remove docling from pyproject.toml
* Update requirements files
* Clean up uv.lock dependencies
* Reduce offline docker image size

(cherry picked from commit f2b6a068e3)
2025-12-04 19:09:00 +08:00
yangdx
a2d67a7c22 Add build script for multi-platform images
- Add build script for multi-platform images
- Update docker deployment document

(cherry picked from commit ef79821f29)
2025-12-04 19:08:59 +08:00
yangdx
58f818c449 Change default docker image to offline version
• Add lite version docker image with tiktoken cache
• Update docs and build scripts

(cherry picked from commit daeca17f38)
2025-12-04 19:08:59 +08:00
yangdx
f89c7315fd Migrate Dockerfile from pip to uv package manager for faster builds
• Replace pip with uv for dependencies
• Add offline extras to Dockerfile.offline
• Update UV_LOCK_GUIDE.md with new commands
• Improve build caching and performance

(cherry picked from commit 65c2eb9f99)
2025-12-04 19:08:59 +08:00
yangdx
9229c03d40 Migrate from pip to uv package manager for faster builds
• Replace pip with uv in Dockerfile
• Remove constraints-offline.txt
• Add uv.lock for dependency pinning
• Use uv sync --frozen for builds

(cherry picked from commit 466de2070d)
2025-12-04 19:08:59 +08:00
yangdx
7e5c23e15b docs: clarify docling exclusion in offline Docker image
(cherry picked from commit 388dce2e31)
2025-12-04 19:08:59 +08:00
yangdx
69f38041cc Remove explicit protobuf dependency from offline storage requirements
(cherry picked from commit bc1a70bad0)
2025-12-04 19:08:58 +08:00
yangdx
6bd5b2d95b Add offline deployment support with cache management and layered deps
• Add tiktoken cache downloader CLI
• Add layered offline dependencies
• Add offline requirements files
• Add offline deployment guide

(cherry picked from commit a5c05f1b92)
2025-12-04 19:08:58 +08:00
yangdx
e19a4be0af Preserve ordering in get_by_ids methods across all storage implementations
- Fix result ordering in vector stores
- Update KV storage get_by_ids methods
- Maintain order in doc status storage
- Return None for missing IDs

(cherry picked from commit 9be22dd666)
2025-12-04 19:08:58 +08:00
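Note on e19a4be0af: a minimal sketch of order-preserving lookup with `None` for missing IDs. The helper name `order_by_ids` and the `"id"` key are assumptions; each storage backend in the project has its own `get_by_ids` implementation.

```python
from typing import Dict, Hashable, Iterable, List, Optional


def order_by_ids(
    ids: List[Hashable], rows: Iterable[Dict], key: str = "id"
) -> List[Optional[Dict]]:
    """Hypothetical helper: return rows in the order of ids, None where an id is absent."""
    by_id = {row[key]: row for row in rows}
    return [by_id.get(i) for i in ids]
```

Backends often return rows in arbitrary order, so re-indexing by ID after the query keeps the result aligned with the request.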