Commit graph

5650 commits

Author SHA1 Message Date
Alexander Belikov
4945faf021 merging main 2025-11-13 19:37:44 +01:00
Alexander Belikov
cdbb0b0826 minor simplifications 2025-11-13 18:05:33 +01:00
Alexander Belikov
3f33d30c33 better looking example 2025-11-13 17:15:39 +01:00
Alexander Belikov
fc0a417775 tested on example; fixed schema definition 2025-11-13 16:57:41 +01:00
yangdx
c164c8f631 Merge branch 'main' of github.com:HKUDS/LightRAG 2025-11-13 20:42:47 +08:00
yangdx
1889301597 Merge branch 'feat/add_cloud_ollama_support' 2025-11-13 20:41:58 +08:00
yangdx
77ad906d3a Improve error handling and logging in cloud model detection 2025-11-13 20:41:44 +08:00
Daniel.y
28fba19b11
Merge pull request #2352 from danielaskdd/docling-gunicorn-multi-worker
Refact: Enhance DOCLING integration with lazy loading and macOS safeguards
2025-11-13 20:37:48 +08:00
yangdx
cc031a3db9 Add macOS compatibility check for DOCLING with multi-worker Gunicorn 2025-11-13 19:18:04 +08:00
LacombeLouis
844537e378 Add a better regex 2025-11-13 12:17:51 +01:00
yangdx
a24d8181c2 Improve docling integration with macOS compatibility and CLI flag
- Add --docling CLI flag for easier setup
- Add numpy version constraints
- Exclude docling on macOS (fork-safety)
2025-11-13 18:58:09 +08:00
Daniel.y
76adde3858
Merge pull request #2351 from danielaskdd/lazy-config-loading
Refact: Implement Lazy Configuration Initialization for API Server
2025-11-13 15:55:35 +08:00
yangdx
e6588f9119 Update uv.lock 2025-11-13 15:31:51 +08:00
yangdx
746c069ab0 Implement lazy configuration initialization for API server
• Add lazy config initialization
• Maintain backward compatibility
• Support programmatic usage
• Add gunicorn dependency
• Explicit config in entry points
2025-11-13 15:28:05 +08:00
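The lazy-initialization idea in the commit above can be sketched roughly as follows. This is only an illustration, not the project's code: `initialize_config`, `get_config`, and the `DEMO_HOST` / `DEMO_PORT` environment variables are hypothetical names chosen for the example.

```python
import os
from typing import Optional

# Nothing is read from the environment at import time.
_config: Optional[dict] = None


def initialize_config(overrides: Optional[dict] = None, force: bool = False) -> dict:
    """Build the config once; an entry point would call this explicitly."""
    global _config
    if _config is None or force:
        config = {
            # DEMO_HOST / DEMO_PORT are hypothetical variable names.
            "host": os.environ.get("DEMO_HOST", "127.0.0.1"),
            "port": int(os.environ.get("DEMO_PORT", "8000")),
        }
        config.update(overrides or {})
        _config = config
    return _config


def get_config() -> dict:
    """Return the config, falling back to defaults on first access."""
    if _config is None:
        return initialize_config()
    return _config
```

Code that only imports the module pays no parsing cost, while programmatic callers can pass overrides before the first `get_config()` call.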
Daniel.y
470e2fd1f9
Merge pull request #2350 from danielaskdd/reduce-dynamic-import
Refactor: Remove blocking dependency installation from document upload handlers
2025-11-13 15:06:05 +08:00
yangdx
4b31942e2a refactor: move document deps to api group, remove dynamic imports
- Merge offline-docs into api extras
- Remove pipmaster dynamic installs
- Add async document processing
- Pre-check docling availability
- Update offline deployment docs
2025-11-13 13:34:09 +08:00
yangdx
8765974467 Merge branch 'tongda/main' 2025-11-13 12:56:28 +08:00
yangdx
c230d1a28d Replace asyncio.iscoroutine with inspect.isawaitable for better detection 2025-11-13 12:56:01 +08:00
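A minimal sketch of why `inspect.isawaitable` detects more than `asyncio.iscoroutine`: it also accepts objects that only define `__await__`. The helper `run_chunker` and the chunker functions are hypothetical and stand in for whatever the pipeline actually calls.

```python
import asyncio
import inspect


def sync_chunker(text: str) -> list[str]:
    return text.split()


async def async_chunker(text: str) -> list[str]:
    await asyncio.sleep(0)  # stands in for real asynchronous work
    return text.split()


class AwaitableChunks:
    """A custom awaitable that is not a coroutine object."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __await__(self):
        return async_chunker(self.text).__await__()


async def run_chunker(chunking_func, text: str) -> list[str]:
    result = chunking_func(text)
    # Covers coroutines, Futures/Tasks, and objects defining __await__.
    if inspect.isawaitable(result):
        result = await result
    if not isinstance(result, list):
        raise TypeError(f"chunking function returned {type(result).__name__}")
    return result
```

With `asyncio.iscoroutine`, an `AwaitableChunks` instance would be passed through un-awaited and then rejected by the type check.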
yangdx
297e460740 Merge branch 'main' into tongda/main 2025-11-13 12:37:37 +08:00
yangdx
940bec0b31 Support async chunking functions in LightRAG processing pipeline
- Add Awaitable and Union type imports
- Update chunking_func type annotation
- Handle coroutine results with await
- Add return type validation
- Update docstring for async support
2025-11-13 12:37:15 +08:00
yangdx
343d30727a Update env.example 2025-11-13 11:40:56 +08:00
Louis Lacombe
f7432a260e Add support for environment variable fallback for API key and default host for cloud models 2025-11-12 16:11:05 +00:00
Daniel.y
075399ffc5
Merge pull request #2346 from danielaskdd/optimize-json-sanitization
Refactor: Optimize write_json for Memory Efficiency and Performance
2025-11-12 16:50:28 +08:00
yangdx
70cc2419f2 Fix empty dict handling after JSON sanitization
• Replace truthy checks with `is not None`
• Handle empty dict edge case properly
• Prevent data reload failures
• Add comprehensive test coverage
• Fix JsonKVStorage and DocStatusStorage
2025-11-12 16:40:57 +08:00
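The empty-dict edge case named in the commit above comes down to the difference between a truthiness test and an explicit `None` test. The two loader functions below are hypothetical and only illustrate that difference.

```python
def pick_truthy(loaded, fallback):
    # Bug: an empty dict is falsy, so valid (empty) data is discarded.
    return loaded if loaded else fallback


def pick_not_none(loaded, fallback):
    # Only a missing result (None) triggers the fallback.
    return loaded if loaded is not None else fallback
```

For `loaded = {}`, the first function returns the fallback and the second returns the empty dict.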
yangdx
dcf1d28681 Fix migration to reload sanitized data and prevent memory corruption
• Reload cleaned data after sanitization
• Update shared memory with clean data
• Add specific surrogate char tests
• Test migration sanitization flow
• Prevent dirty data in memory
2025-11-12 16:16:28 +08:00
yangdx
6de4123f74 Optimize JSON string sanitization with precompiled regex and zero-copy
- Precompile regex pattern at module level
- Zero-copy path for clean strings
- Use C-level regex for performance
- Remove deprecated _sanitize_json_data
- Fast detection for common case
2025-11-12 15:42:07 +08:00
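One way to picture the precompiled pattern and the zero-copy path is the sketch below. The exact character set filtered by the project is not shown in this log, so the pattern here (surrogates U+D800 to U+DFFF plus the non-characters U+FFFE and U+FFFF) is an assumption, as is the function name.

```python
import re

# Compiled once at module level. The character set is an assumption.
_UNSAFE = re.compile("[\ud800-\udfff\ufffe\uffff]")


def sanitize_json_string(text: str) -> str:
    # Common case: nothing to remove, so return the same object uncopied.
    if _UNSAFE.search(text) is None:
        return text
    # Rare case: strip the offending characters.
    return _UNSAFE.sub("", text)
```

The search runs inside the regex engine, so clean strings cost one scan and no allocation.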
yangdx
777c987371 Optimize JSON write with fast/slow path to reduce memory usage
- Fast path for clean data (no sanitization)
- Slow path sanitizes during encoding
- Reload shared memory after sanitization
- Custom encoder avoids deep copies
- Comprehensive test coverage
2025-11-12 13:48:56 +08:00
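A simplified sketch of the fast/slow split follows. It does not reproduce the custom encoder mentioned in the commit; it swaps in an encode-time error handler to get a similar effect without deep-copying the data. `dumps_utf8` is a hypothetical name.

```python
import json


def dumps_utf8(data) -> tuple[bytes, bool]:
    """Return (payload, was_sanitized)."""
    text = json.dumps(data, ensure_ascii=False)
    try:
        # Fast path: clean data encodes directly.
        return text.encode("utf-8"), False
    except UnicodeEncodeError:
        # Slow path: drop unencodable characters (for example lone
        # surrogates) while encoding, without copying the input structure.
        return text.encode("utf-8", errors="ignore"), True
```

The returned flag lets a caller decide whether the in-memory data should be reloaded from the sanitized output.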
Daniel.y
477c3f54fb
Merge pull request #2345 from danielaskdd/remove-response-type
Remove deprecated response_type parameter from query settings UI
2025-11-12 12:32:59 +08:00
yangdx
8c07c91833 Remove deprecated response_type parameter from query settings
- Bump API version to 0254
- Remove response format UI controls
- Hard-code response_type in query params
- Add migration for version 19
- Clean up settings store structure
2025-11-12 12:19:30 +08:00
Daniel.y
69ca366242
Merge pull request #2344 from danielaskdd/fix-josn-serialization-error
Fix: Prevent UnicodeEncodeError in JSON storage operations
2025-11-12 00:58:59 +08:00
yangdx
f28a0c25b1 Improve JSON data sanitization to handle tuples and dict keys
- Sanitize dictionary keys
- Preserve tuple types
- Handle nested structures better
2025-11-12 00:50:18 +08:00
yangdx
6918a88f92 Add specialized JSON string sanitizer to prevent UTF-8 encoding errors
• Remove surrogate characters (U+D800-DFFF)
• Filter Unicode non-characters
• Direct char-by-char filtering
2025-11-12 00:38:47 +08:00
yangdx
d1f4b6e515 Add data sanitization to JSON writing to prevent UTF-8 encoding errors
• Add _sanitize_json_data helper function
• Recursively clean strings in data
• Sanitize before JSON serialization
• Prevent encoding-related crashes
• Use existing sanitize_text_for_encoding
2025-11-12 00:11:13 +08:00
yangdx
1ffb533812 Update env.example 2025-11-11 12:02:37 +08:00
Daniel.y
5a6bb65867
Merge pull request #2338 from danielaskdd/migrate-to-pypdf
Refactor: Migrate PDF processing dependency from `PyPDF2` to actively maintained `pypdf`
2025-11-11 01:41:59 +08:00
yangdx
fdcb4d0b6d Replace PyPDF2 with pypdf for PDF processing
- Update import from PyPDF2 to pypdf
- Change dependency to pypdf>=6.1.0
- Update all requirements files
- Remove PyPDF2 from lock file
- Use modern pypdf library
2025-11-11 01:38:09 +08:00
Tong Da
245df75d9c easier version: detect whether chunking_func result is a coroutine 2025-11-10 20:49:50 +08:00
yangdx
e8f5f57ec7 Update qdrant-client minimum version from 1.7.0 to 1.11.0
• Bump qdrant-client to >=1.11.0
• Update pyproject.toml dependency
• Update requirements files
• Sync uv.lock with new version
• Maintain <2.0.0 upper bound
2025-11-10 11:54:48 +08:00
yangdx
913fa1e415 Add concurrency warning for JsonKVStorage in cleanup tool 2025-11-09 23:04:04 +08:00
Tong Da
d137ba5843 support async chunking func to improve processing performance when a heavy chunking_func is passed in by user 2025-11-09 14:52:42 +08:00
yangdx
1f9d0735c3 Bump API version to 0253 2025-11-09 14:42:22 +08:00
Daniel.y
3110ca518b
Merge pull request #2335 from danielaskdd/llm-cache-cleanup
Feat: Add LLM Query Cache Cleanup Tool
2025-11-09 14:27:58 +08:00
yangdx
37b7118901 Fix table alignment and add validation for empty cleanup selections 2025-11-09 14:17:56 +08:00
yangdx
1485cb82e9 Add LLM query cache cleanup tool for KV storage backends
- Interactive cleanup workflow
- Supports all KV storage types
- Batch deletion with progress
- Comprehensive error reporting
- Preserves workspace isolation
2025-11-09 13:37:33 +08:00
Daniel.y
8859eaade7
Merge pull request #2334 from danielaskdd/hotfix-opena-streaming
HotFix: Restore OpenAI Streaming Response & Refactor keyword_extraction Parameter
2025-11-09 12:25:20 +08:00
yangdx
2f16065256 Refactor keyword_extraction from kwargs to explicit parameter
• Add keyword_extraction param to functions
• Remove kwargs.pop() calls
• Update function signatures
• Improve parameter documentation
• Make parameter handling consistent
2025-11-09 12:02:17 +08:00
yangdx
88ab73f6ae HotFix: Restore streaming response in OpenAI LLM
The stream and timeout parameters were moved from **kwargs to explicit
parameters in a previous commit, but were not being passed to the OpenAI
API, causing streaming responses to fail and fall back to non-streaming
mode.
Fixes the issue where stream=True was being silently ignored, resulting
in unexpected non-streaming behavior.
2025-11-09 11:52:26 +08:00
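The failure mode described in this hotfix can be reproduced in miniature. `fake_api` is a hypothetical stand-in for the real client call, and both wrapper functions are illustrative only.

```python
def fake_api(**request):
    # Hypothetical stand-in for the real API client.
    return "streaming" if request.get("stream") else "non-streaming"


def complete_buggy(prompt, stream=False, timeout=None, **kwargs):
    # stream and timeout are captured by the signature, so they are no
    # longer inside kwargs and never reach the API.
    return fake_api(prompt=prompt, **kwargs)


def complete_fixed(prompt, stream=False, timeout=None, **kwargs):
    # Explicit parameters must be forwarded explicitly.
    return fake_api(prompt=prompt, stream=stream, timeout=timeout, **kwargs)
```

Calling `complete_buggy("hi", stream=True)` silently yields the non-streaming result, which matches the symptom in the message above.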
yangdx
c12bc372dc Update README 2025-11-09 04:35:41 +08:00
yangdx
7bc6ccea19 Add uv package manager support to installation docs 2025-11-09 04:31:07 +08:00
yangdx
80f2e691fc Remove redundant i18n import that triggered the Vite "dynamic + static import" warning 2025-11-09 02:48:11 +08:00