LightRAG

Author	SHA1	Message	Date
Claude	9030280a58	Fix critical issues and improve best practices in .env.unraid.example Critical fixes: - Fix SUMMARY_LENGTH_RECOMMENDED_ typo (trailing underscore) - Change LLM_MODEL from gpt-5-mini to gpt-4o-mini (GPT-5 doesn't exist) - Update all GPT-5 references to GPT-4o in comments Best practice improvements: - Reduce NEO4J_MAX_CONNECTION_POOL_SIZE from 75 to 50 (better for 6-core system) - Add logging rotation settings (LOG_MAX_BYTES, LOG_BACKUP_COUNT) - Add Advanced Entity/Relation Management settings documentation The Advanced Entity/Relation Management settings help users control metadata storage for entities/relations in the knowledge graph, which is especially useful when processing large books where entities appear in many chunks.	2025-11-15 10:25:46 +00:00
Lars Varming	8a65241d98	Merge pull request #1 from Varming73/claude/help-needed-018WdFNrzNu1oZfobErALfdP Add .env.unraid.example for personal use	2025-11-15 11:12:03 +01:00
Claude	9578d26b17	Add .env.unraid.example for Dell T140 server configuration This template is optimized for: - Dell T140 (6-core Xeon E-2226G, 32GB RAM) - Docker deployment on Unraid - GPT-5-mini LLM - Voyage-3-large embeddings (2048 dims) - Jina reranker - Neo4j graph storage + Postgres (pgvector) for vector/KV/doc storage - Books, articles, and podcast transcripts use case Performance settings tuned for ~40-50% CPU utilization during heavy processing.	2025-11-15 09:52:41 +00:00
Daniel.y	3b76eea20b	Merge pull request #2359 from danielaskdd/embedding-limit Refact: Add Embedding Token Limit Configuration and Improve Error Handling	2025-11-15 01:27:26 +08:00
yangdx	8722103550	Update env.example • Comment out Ollama config • Set OpenAI as active default • Add EMBEDDING_TOKEN_LIMIT option • Add Gemini embedding configuration	2025-11-15 01:25:56 +08:00
yangdx	b5589ce4d5	Merge branch 'main' into embedding-limit	2025-11-15 01:10:34 +08:00
Daniel.y	9a2ddcee31	Merge pull request #2360 from danielaskdd/macos-gunicorn-numpy Add macOS fork safety check for Gunicorn multi-worker mode	2025-11-15 01:02:41 +08:00
yangdx	4343db753a	Add macOS fork safety check for Gunicorn multi-worker mode • Check OBJC_DISABLE_INITIALIZE_FORK_SAFETY • Prevent NumPy/Accelerate crashes • Show detailed error message • Provide multiple fix options • Exit early if misconfigured	2025-11-15 00:58:23 +08:00
Daniel.y	c6850ac5ac	Merge pull request #2358 from sleeepyin/main Update the value corresponding to the extracted entity relationship keywords	2025-11-14 23:47:58 +08:00
yangdx	5dec4deac7	Improve embedding config priority and add debug logging • Fix embedding_dim priority logic • Add final config logging	2025-11-14 23:22:44 +08:00
yangdx	de4412dd40	Fix embedding token limit initialization order * Capture max_token_size before decorator * Apply wrapper after capturing attribute * Prevent decorator from stripping dataclass * Ensure token limit is properly set	2025-11-14 22:56:03 +08:00
yangdx	963a0a5db1	Refactor embedding function creation with proper attribute inheritance - Extract max_token_size from providers - Avoid double-wrapping EmbeddingFunc - Improve configuration priority logic - Add comprehensive debug logging - Return complete EmbeddingFunc instance	2025-11-14 22:29:08 +08:00
yangdx	39b49e92ff	Convert embedding_token_limit from property to field with __post_init__ • Remove property decorator • Add field with init=False • Set value in __post_init__ method • embedding_token_limit is now in config dictionary	2025-11-14 20:58:41 +08:00
yangdx	ab4d7ac2b0	Add configurable embedding token limit with validation - Add EMBEDDING_TOKEN_LIMIT env var - Set max_token_size on embedding func - Add token limit property to LightRAG - Validate summary length vs limit - Log warning when limit exceeded	2025-11-14 19:28:36 +08:00
yangdx	680e36c6eb	Improve Bedrock error handling with retry logic and custom exceptions • Add specific exception types • Implement proper retry mechanism • Better error classification • Enhanced logging and validation • Enable embedding retry decorator	2025-11-14 18:51:41 +08:00
yangdx	05852e1ab2	Add max_token_size parameter to embedding function decorators - Add max_token_size=8192 to all embed funcs - Move siliconcloud to deprecated folder - Import wrap_embedding_func_with_attrs - Update EmbeddingFunc docstring - Fix langfuse import type annotation	2025-11-14 18:41:43 +08:00
Sleeep	b88d785469	Merge branch 'HKUDS:main' into main	2025-11-14 16:49:30 +08:00
Daniel.y	399a23c3a6	Merge pull request #2356 from danielaskdd/improve-error-handling Fix: Robust error handling for async database operations in graph storage	2025-11-14 11:16:14 +08:00
yangdx	4401f86f07	Refactor exception handling in MemgraphStorage label methods	2025-11-14 11:01:26 +08:00
yangdx	1ccef2b932	Fix null reference errors in graph database error handling - Initialize result vars to None - Add null checks before consume calls - Prevent crashes in except blocks - Apply fix to both Neo4J and Memgraph	2025-11-14 10:39:04 +08:00
yangdx	c164c8f631	Merge branch 'main' of github.com:HKUDS/LightRAG	2025-11-13 20:42:47 +08:00
yangdx	1889301597	Merge branch 'feat/add_cloud_ollama_support'	2025-11-13 20:41:58 +08:00
yangdx	77ad906d3a	Improve error handling and logging in cloud model detection	2025-11-13 20:41:44 +08:00
Daniel.y	28fba19b11	Merge pull request #2352 from danielaskdd/docling-gunicorn-multi-worker Refact: Enhance DOCLING integration with lazy loading and macOS safeguards	2025-11-13 20:37:48 +08:00
yangdx	cc031a3db9	Add macOS compatibility check for DOCLING with multi-worker Gunicorn	2025-11-13 19:18:04 +08:00
LacombeLouis	844537e378	Add a better regex	2025-11-13 12:17:51 +01:00
yangdx	a24d8181c2	Improve docling integration with macOS compatibility and CLI flag - Add --docling CLI flag for easier setup - Add numpy version constraints - Exclude docling on macOS (fork-safety)	2025-11-13 18:58:09 +08:00
Daniel.y	76adde3858	Merge pull request #2351 from danielaskdd/lazy-config-loading Refact: Implement Lazy Configuration Initialization for API Server	2025-11-13 15:55:35 +08:00
Sleeep	89e63aa49b	Update edge keywords extraction in graph visualization When building the Neo4j graph, the keyword value defaulted to d7; it should be the modified d9	2025-11-13 15:52:14 +08:00
yangdx	e6588f9119	Update uv.lock	2025-11-13 15:31:51 +08:00
yangdx	746c069ab0	Implement lazy configuration initialization for API server • Add lazy config initialization • Maintain backward compatibility • Support programmatic usage • Add gunicorn dependency • Explicit config in entry points	2025-11-13 15:28:05 +08:00
Daniel.y	470e2fd1f9	Merge pull request #2350 from danielaskdd/reduce-dynamic-import Refactor: Remove blocking dependency installation from document upload handlers	2025-11-13 15:06:05 +08:00
yangdx	4b31942e2a	refactor: move document deps to api group, remove dynamic imports - Merge offline-docs into api extras - Remove pipmaster dynamic installs - Add async document processing - Pre-check docling availability - Update offline deployment docs	2025-11-13 13:34:09 +08:00
yangdx	8765974467	Merge branch 'tongda/main'	2025-11-13 12:56:28 +08:00
yangdx	c230d1a28d	Replace asyncio.iscoroutine with inspect.isawaitable for better detection	2025-11-13 12:56:01 +08:00
yangdx	297e460740	Merge branch 'main' into tongda/main	2025-11-13 12:37:37 +08:00
yangdx	940bec0b31	Support async chunking functions in LightRAG processing pipeline - Add Awaitable and Union type imports - Update chunking_func type annotation - Handle coroutine results with await - Add return type validation - Update docstring for async support	2025-11-13 12:37:15 +08:00
yangdx	343d30727a	Update env.example	2025-11-13 11:40:56 +08:00
Louis Lacombe	f7432a260e	Add support for environment variable fallback for API key and default host for cloud models	2025-11-12 16:11:05 +00:00
Daniel.y	075399ffc5	Merge pull request #2346 from danielaskdd/optimize-json-sanitization Refactor: Optimize write_json for Memory Efficiency and Performance	2025-11-12 16:50:28 +08:00
yangdx	70cc2419f2	Fix empty dict handling after JSON sanitization • Replace truthy checks with `is not None` • Handle empty dict edge case properly • Prevent data reload failures • Add comprehensive test coverage • Fix JsonKVStorage and DocStatusStorage	2025-11-12 16:40:57 +08:00
yangdx	dcf1d28681	Fix migration to reload sanitized data and prevent memory corruption • Reload cleaned data after sanitization • Update shared memory with clean data • Add specific surrogate char tests • Test migration sanitization flow • Prevent dirty data in memory	2025-11-12 16:16:28 +08:00
yangdx	6de4123f74	Optimize JSON string sanitization with precompiled regex and zero-copy - Precompile regex pattern at module level - Zero-copy path for clean strings - Use C-level regex for performance - Remove deprecated _sanitize_json_data - Fast detection for common case	2025-11-12 15:42:07 +08:00
yangdx	777c987371	Optimize JSON write with fast/slow path to reduce memory usage - Fast path for clean data (no sanitization) - Slow path sanitizes during encoding - Reload shared memory after sanitization - Custom encoder avoids deep copies - Comprehensive test coverage	2025-11-12 13:48:56 +08:00
Daniel.y	477c3f54fb	Merge pull request #2345 from danielaskdd/remove-response-type Remove deprecated response_type parameter from query settings UI	2025-11-12 12:32:59 +08:00
yangdx	8c07c91833	Remove deprecated response_type parameter from query settings - Bump API version to 0254 - Remove response format UI controls - Hard-code response_type in query params - Add migration for version 19 - Clean up settings store structure	2025-11-12 12:19:30 +08:00
Daniel.y	69ca366242	Merge pull request #2344 from danielaskdd/fix-josn-serialization-error Fix: Prevent UnicodeEncodeError in JSON storage operations	2025-11-12 00:58:59 +08:00
yangdx	f28a0c25b1	Improve JSON data sanitization to handle tuples and dict keys - Sanitize dictionary keys - Preserve tuple types - Handle nested structures better	2025-11-12 00:50:18 +08:00
yangdx	6918a88f92	Add specialized JSON string sanitizer to prevent UTF-8 encoding errors • Remove surrogate characters (U+D800-DFFF) • Filter Unicode non-characters • Direct char-by-char filtering	2025-11-12 00:38:47 +08:00
yangdx	d1f4b6e515	Add data sanitization to JSON writing to prevent UTF-8 encoding errors • Add _sanitize_json_data helper function • Recursively clean strings in data • Sanitize before JSON serialization • Prevent encoding-related crashes • Use existing sanitize_text_for_encoding	2025-11-12 00:11:13 +08:00
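
Commits 6918a88f92 and 6de4123f74 above describe removing surrogate characters and Unicode non-characters from strings before JSON writing, using a precompiled regex and a zero-copy path for clean strings. The following is a minimal Python sketch of that idea under stated assumptions, not LightRAG's actual code: the names `_UNSAFE_CHARS` and `sanitize_json_string` are hypothetical, and the exact character set the project filters may differ.

```python
import re

# Hypothetical pattern, compiled once at module level.
# U+D800-U+DFFF are lone surrogates, which strict UTF-8 encoding rejects.
# U+FFFE and U+FFFF are non-characters, filtered here by choice.
_UNSAFE_CHARS = re.compile(r"[\ud800-\udfff\ufffe\uffff]")


def sanitize_json_string(text: str) -> str:
    """Return text unchanged when it is clean, else a copy without unsafe chars."""
    if _UNSAFE_CHARS.search(text) is None:
        # Fast path: the common case returns the same object, no copy made.
        return text
    # Slow path: build a new string with the offending characters dropped.
    return _UNSAFE_CHARS.sub("", text)
```

Under these assumptions, a clean string comes back as the same object, while a string containing a lone surrogate comes back as a new string that can be encoded as UTF-8 without raising UnicodeEncodeError.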


5661 commits