LightRAG

Author	SHA1	Message	Date
yangdx	4e751e0653	refac: Enhance extraction with improved prompts and parser - Prompts: Restructured prompts with clearer steps and quality guidelines. Simplified the relationship tuple by removing `relationship_strength` - Model: Updated default entity types to be more comprehensive and consistently capitalized (e.g., `Location`, `Product`)	2025-08-31 22:24:11 +08:00
yangdx	75de40da41	Fix typo in relationship extraction log messages	2025-08-31 17:45:16 +08:00
yangdx	97c9600085	Improve extraction error handling and field validation • Add field count validation warnings • Fix relationship field count (5→6) • Change error logs to warnings	2025-08-31 17:33:42 +08:00
yangdx	b747417961	feat: enhance sanitization and normalization of extracted text - Improve handling of redundant quotes in entity and relation names, types and keywords - Add HTML tag cleaning and Chinese symbol conversion - Filter out short numeric content and malformed text - Enhance entity type validation with character filtering	2025-08-31 13:17:20 +08:00
yangdx	d4bbc5dea9	refactor: Merge multi-step text sanitization into single function	2025-08-31 10:36:56 +08:00
yangdx	69890ff2e1	Bump core version to 1.4.8 and api version to 0210	2025-08-31 03:01:33 +08:00
yangdx	8bab240dbc	Update webui assets	2025-08-31 03:00:16 +08:00
yangdx	25b5d176cd	Fix label selection with leading/trailing whitespace • Fix AsyncSelect value trimming issue • Preserve whitespace in label display • Use safe keys for command items • Add GraphControl dependency fix • Add debug logging for graph labels	2025-08-31 02:54:39 +08:00
yangdx	ae09b5c656	refactor: eliminate conditional imports and simplify LightRAG initialization - Remove conditional import block, replace with lazy loading factory functions - Add create_llm_model_func() and create_llm_model_kwargs() for clean configuration - Update wrapper functions with lazy imports for better performance - Unify LightRAG initialization, eliminating duplicate conditional branches - Reduce code complexity by 33% while maintaining full backward compatibility	2025-08-31 00:18:29 +08:00
yangdx	332202c111	Fix lambda closure bug in embedding function configuration • Replace lambda with proper async function • Capture config values at creation time • Avoid closure variable reference issues • Add factory function for embeddings • Remove test file for closure bug	2025-08-30 23:43:34 +08:00
avchauzov	414d47d12a	fix(server): Resolve lambda closure bug in embedding_func Fixes #2023. Resolves an issue where the embedding function would incorrectly fall back to the OpenAI provider if the server's configuration arguments were mutated after initialization. This was caused by a lambda function capturing a reference to the mutable 'args' object instead of capturing the configuration values at creation time.	2025-08-30 14:43:33 +02:00
yangdx	43f32e8d97	Bump api version to 0209	2025-08-29 19:42:06 +08:00
yangdx	f3989548b9	Fix MongoDB vector query embedding format compatibility * Convert numpy arrays to lists * Ensure MongoDB compatibility	2025-08-29 18:51:53 +08:00
yangdx	03d0fa3014	perf: add optional query_embedding parameter to avoid redundant embedding calls	2025-08-29 18:15:45 +08:00
yangdx	a923d378dd	Remove deprecated ID-based filtering from vector storage queries - Remove ids param from QueryParam - Simplify BaseVectorStorage.query signature - Update all vector storage implementations - Streamline PostgreSQL query templates - Remove ID filtering from operate.py calls	2025-08-29 17:06:48 +08:00
yangdx	d7e0701b63	Improve logging setup and add error prefixes for LLM functions - Move logger init to top of file - Add console handler by default - Prefix LLM errors with "[LLM func]" - Update timeout log messages - Comment out pypinyin success log	2025-08-29 14:19:13 +08:00
yangdx	925e631a9a	refac: Add robust timeout handling for LLM requests	2025-08-29 13:50:35 +08:00
yangdx	99e28e815b	fix: prevent document processing failures from UTF-8 surrogate characters - Change sanitize_text_for_encoding to fail-fast instead of returning error placeholders - Add strict UTF-8 cleaning pipeline to entity/relationship extraction - Skip problematic entities/relationships instead of corrupting data Fixes document processing crashes when encountering surrogate characters (U+D800-U+DFFF)	2025-08-27 23:52:39 +08:00
yangdx	6a2a592224	Fix linting	2025-08-27 12:51:50 +08:00
yangdx	8a0d06e557	Restore default entity types	2025-08-27 12:51:18 +08:00
yangdx	28e07c89f9	Fix linting	2025-08-27 12:35:51 +08:00
yangdx	2ccc39de9a	Fix language fallback in summarize error	2025-08-27 12:34:27 +08:00
yangdx	0be4f0144b	Merge branch 'entityTypesServerSupport'	2025-08-27 12:23:58 +08:00
yangdx	ff0a18e08c	Unify SUMMARY_LANGUAGE and ENTITY_TYPES implementation method	2025-08-27 12:23:22 +08:00
LinkinPony	45da0385eb	Merge branch 'HKUDS:main' into main	2025-08-27 09:22:39 +08:00
Thibo Rosemplatt	c3aabfc251	Merge branch 'main' into entityTypesServerSupport	2025-08-26 21:48:20 +02:00
yangdx	c259b8f22c	Update webui assets and bump api version to 0208	2025-08-26 23:05:00 +08:00
yangdx	d3623cc9ae	fix: resolve infinite loop risk in _handle_entity_relation_summary - Ensure oversized descriptions are force-merged with subsequent ones - Add len(current_list) <= 2 termination condition to guarantee convergence - Implement token-based truncation in _summarize_descriptions to prevent overflow	2025-08-26 21:58:31 +08:00
yangdx	e0a755e42c	Refactor prompt instructions to emphasize depth and completeness	2025-08-26 18:28:57 +08:00
yangdx	79e0226b2b	Refactor: move force_llm_summary_on_merge to global_config access - Remove parameter from function signature - Access from global_config instead - Improve code consistency	2025-08-26 18:02:39 +08:00
yangdx	01a2c79f29	Standardize prompt formatting and section headers across templates - Remove hash delimiters - Consistent section headers - Add "Output:" labels - Clean up example formatting	2025-08-26 14:42:52 +08:00
yangdx	6bcfe696ee	feat: add output length recommendation and description type to LLM summary - Add SUMMARY_LENGTH_RECOMMENDED parameter (600 tokens) - Optimize prompt template for LLM summary	2025-08-26 14:41:12 +08:00
LinkinPony	ff4c747a2a	fix mismatch of 'error' and 'error_msg' in MongoDB	2025-08-26 10:43:56 +08:00
yangdx	025f70089a	Simplify status messages in knowledge rebuild operations	2025-08-26 04:26:15 +08:00
yangdx	84416d104d	Increase default LLM summary merge threshold from 4 to 8 for reducing summary trigger frequency	2025-08-26 03:57:35 +08:00
yangdx	9eb2be79b8	feat: track actual LLM usage in entity/relation merging - Modified _handle_entity_relation_summary to return tuple[str, bool] - Updated merge functions to log "LLMmerg" vs "Merging" based on actual LLM usage - Replaced hardcoded fragment count prediction with real-time LLM usage tracking	2025-08-26 03:56:18 +08:00
yangdx	cb0fe38b9a	Fix linting	2025-08-26 02:22:34 +08:00
yangdx	de2daf6565	refac: Rename summary_max_tokens to summary_context_size, comprehensive parameter validation for summary configuration - Update algorithm logic in operate.py for better token management - Fix health endpoint to use correct parameter names	2025-08-26 01:35:50 +08:00
yangdx	91767ffcee	Improve warning message formatting in entity/relationship rebuild	2025-08-25 21:55:29 +08:00
yangdx	15cdd0dd8f	fix: Sort cached extraction results by the create_time within each chunk This ensures the KG rebuilds maintain the original creation order of the first extraction result for each chunk.	2025-08-25 21:41:33 +08:00
yangdx	882d6857d8	feat: Implement map-reduce summarization to handle merging a large number of descriptions	2025-08-25 21:03:16 +08:00
yangdx	0b1b264a5d	refactor: optimize graph lock scope in document deletion - Move dependency analysis outside graph database lock - Add persistence call before lock release to prevent dirty reads	2025-08-25 17:46:32 +08:00
yangdx	cac8e189e7	Remove redundant entity vector deletion before upsert	2025-08-25 17:18:51 +08:00
yangdx	9b6de7512d	Optimize the stability of description merging order	2025-08-25 17:10:51 +08:00
yangdx	31f4f96944	Exclude conversation history from context length calculation	2025-08-25 12:43:34 +08:00
yangdx	f688e95f56	Add warning for vector chunks missing chunk_id	2025-08-25 12:42:25 +08:00
yangdx	b6aedba7ae	Add logging for empty naive query results in vector context	2025-08-25 12:21:31 +08:00
yangdx	f1ff5cf93f	fix: initialize truncated_chunks variable in _build_query_context Prevents "local variable 'truncated_chunks' referenced before assignment"	2025-08-25 11:56:56 +08:00
Thibo Rosemplatt	f5938f76bc	Azure OpenAI requires import of OpenAILLMOptions (missing)	2025-08-24 00:28:49 +02:00
Thibo Rosemplatt	d054ec5d00	Added entity_types as a user defined variable (via .env)	2025-08-23 20:16:11 +02:00
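
Note on commits 332202c111 and 414d47d12a: the lambda closure bug they describe can be reproduced in a few lines. The sketch below is illustrative only and is not taken from the LightRAG source; `make_embedding_func_buggy`, `make_embedding_func_fixed`, and the `binding` attribute are hypothetical names chosen for the example. It assumes only what the commit messages state: a lambda captured a reference to a mutable `args` object, and the fix captures the configuration values at creation time in a proper async function.

```python
import asyncio
from types import SimpleNamespace


def make_embedding_func_buggy(args):
    # Bug pattern: the lambda holds a reference to the mutable `args` object,
    # so `args.binding` is looked up when the lambda is called, not when it is created.
    return lambda texts: f"{args.binding}:{len(texts)}"


def make_embedding_func_fixed(args):
    # Fix pattern: copy the needed value into a local variable at creation time.
    binding = args.binding

    async def embedding_func(texts):
        # Uses the captured copy, so later changes to `args` have no effect.
        return f"{binding}:{len(texts)}"

    return embedding_func


# Hypothetical configuration object standing in for the server's parsed arguments.
args = SimpleNamespace(binding="ollama")
buggy = make_embedding_func_buggy(args)
fixed = make_embedding_func_fixed(args)

# The configuration is mutated after initialization.
args.binding = "openai"

buggy_result = buggy(["a", "b"])               # follows the mutation
fixed_result = asyncio.run(fixed(["a", "b"]))  # keeps the creation-time value
print(buggy_result, fixed_result)
```

Here the buggy variant reports the mutated provider while the fixed variant keeps the provider that was configured when the function was created, which matches the fallback behaviour described in the fix for #2023.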

3087 commits