yangdx
dc62c78f98
Add entity/relation chunk tracking with configurable source ID limits
...
- Add entity_chunks & relation_chunks storage
- Implement KEEP/FIFO limit strategies
- Update env.example with new settings
- Add migration for chunk tracking data
- Support all KV storage
2025-10-20 15:24:15 +08:00
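The KEEP/FIFO limit strategies named in the commit above can be illustrated with a minimal sketch. The function name `apply_source_id_limit`, its parameters, and the exact semantics are assumptions made for illustration (KEEP retains the earliest IDs and ignores newer ones, FIFO evicts the oldest IDs first); the commit message does not spell these out.

```python
from typing import List


def apply_source_id_limit(
    source_ids: List[str], max_ids: int, strategy: str = "KEEP"
) -> List[str]:
    """Cap a list of chunk IDs (ordered oldest first) at max_ids entries.

    Hypothetical helper: the name and behaviour are assumptions,
    not the project's actual API.
    """
    # Nothing to do when the limit is disabled or not exceeded.
    if max_ids <= 0 or len(source_ids) <= max_ids:
        return list(source_ids)
    if strategy == "KEEP":
        # Assumed meaning: keep the earliest IDs, ignore newer ones.
        return source_ids[:max_ids]
    if strategy == "FIFO":
        # Assumed meaning: drop the oldest IDs, keep the most recent ones.
        return source_ids[-max_ids:]
    raise ValueError(f"Unknown limit strategy: {strategy!r}")
```

Under these assumptions, KEEP favours stable provenance for long-lived entities, while FIFO favours the most recently ingested chunks.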
yangdx
9f49e56a44
Merge branch 'main' into feat-entity-size-caps
2025-10-17 15:59:44 +08:00
DivinesLight
c06522b927
Get max source ID config from .env and LightRAG init
2025-10-15 18:24:38 +05:00
yangdx
29bac49fb9
Handle empty query results by returning None instead of fail responses
...
• Return None when no context found
• Add structured failure metadata
• Use PROMPTS["fail_response"] for content
• Keep API compatible
2025-10-15 12:04:49 +08:00
yangdx
130b4959dc
Add PREPROCESSED (multimodal_processed) status for multimodal document processing
...
• Add DocStatus.PREPROCESSED enum value
• Update API routes and response models
• Add preprocessed filter in web UI
• Update localization files
• Handle preprocessed status in deletion
2025-10-14 14:02:05 +08:00
yangdx
074f0c8b23
Update docstring for adelete_by_doc_id method clarity
2025-10-12 10:12:45 +08:00
yangdx
457d51952e
Add doc_name field to full docs storage
...
- Store file_path in full_docs storage
- Update PostgreSQL implementation by mapping file_path to doc_name
- Other storage implementations automatically handle the new field
2025-10-05 11:44:27 +08:00
yangdx
1766cddd6c
Fix mode parameter serialization error in Ollama chat API
...
• Use mode.value for API requests
• Add debug logging in aquery_llm
2025-09-27 15:11:51 +08:00
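The serialization error fixed in the commit above is typical of passing an `Enum` member straight into a JSON payload. The sketch below shows the general pattern only; `SearchMode` and `build_payload` are hypothetical names, not the project's actual classes or functions.

```python
import json
from enum import Enum


class SearchMode(Enum):
    """Hypothetical stand-in for the query mode enum."""

    LOCAL = "local"
    HYBRID = "hybrid"


def build_payload(mode: SearchMode, query: str) -> str:
    # json.dumps cannot encode a plain Enum member, so pass the
    # underlying string via mode.value instead of the member itself.
    return json.dumps({"mode": mode.value, "query": query})
```

Passing `mode` instead of `mode.value` here would raise `TypeError`, because the standard JSON encoder has no rule for plain `Enum` members.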
yangdx
8cd4139cbf
refactor: fix double query problem by adding aquery_llm function for consistent response handling
...
- Add new aquery_llm/query_llm methods providing structured responses
- Consolidate /query and /query/stream endpoints to use unified aquery_llm
- Optimize cache handling by moving cache checks before LLM calls
2025-09-26 19:05:03 +08:00
yangdx
b848ca49e6
Fix linting
2025-09-25 16:22:00 +08:00
yangdx
b08b8a6a6a
Add reference list support to query API endpoints with unified result handling
...
• Add include_references param to QueryRequest
• Extend QueryResponse with references field
• Create unified QueryResult data structures
• Refactor kg_query and naive_query functions
• Update streaming to send references first
2025-09-25 16:21:42 +08:00
yangdx
5eb4a4b799
feat: simplify citations, add reference merging, and restructure API response format
2025-09-24 14:30:10 +08:00
yangdx
c0d5abba6b
Fix linting
2025-09-15 02:59:21 +08:00
yangdx
b1c8206346
Add aquery_data endpoint for structured retrieval without LLM generation
...
- Add QueryDataResponse model
- Implement /query/data endpoint
- Add aquery_data method to LightRAG
- Return entities, relationships, chunks
2025-09-15 02:15:14 +08:00
yangdx
82a67354d0
Code formatting improvements and style consistency fixes
...
* Remove trailing whitespace
* Fix function signature ellipsis style
2025-09-14 17:49:02 +08:00
yangdx
0ffb5d5f2d
Replace search API with aquery_data for consistent raw data retrieval, mirroring aquery results
...
• Reuse existing query logic paths and remove kg_search function entirely
• Update kg_query/naive_query to return raw data as needed
2025-09-13 15:30:29 +08:00
yangdx
6774058670
Merge branch 'main' into tongda/main
2025-09-09 22:43:17 +08:00
yangdx
077d9be5d7
Add Deepseek Style Chain of Thought (CoT) Support for OpenAI Compatible LLM providers
...
- Add enable_cot parameter to all LLM APIs
- Implement CoT for OpenAI with <think> tags
- Log warnings for unsupported providers
- Enable CoT in query operations
- Handle streaming and non-streaming CoT
2025-09-09 22:34:36 +08:00
yangdx
3477e9f919
Merge branch 'main' into tongda/main
2025-09-09 18:27:56 +08:00
yangdx
3059089e7d
Fix logging order in pipeline history trimming
2025-09-08 23:00:44 +08:00
yangdx
9437df83cc
Add memory management for pipeline history messages
...
- Trim history at 10k messages
- Keep latest 5k messages
- Prevent memory growth
- Add logging for trim events
2025-09-08 15:56:35 +08:00
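The trimming rule described in the commit above (trim at 10k messages, keep the latest 5k) can be sketched as follows. The constant and function names are hypothetical, and whether trimming triggers at exactly 10,000 messages or only above it is an assumption; this sketch trims once the count exceeds the threshold.

```python
from typing import List

TRIM_THRESHOLD = 10_000  # assumed trigger point, from "Trim history at 10k messages"
KEEP_LATEST = 5_000      # from "Keep latest 5k messages"


def trim_history(
    messages: List[str],
    threshold: int = TRIM_THRESHOLD,
    keep: int = KEEP_LATEST,
) -> int:
    """Trim the list in place and return how many messages were removed."""
    if len(messages) <= threshold:
        return 0
    removed = len(messages) - keep
    # Delete from the front so the newest messages survive, and mutate
    # in place so other holders of the same list see the trimmed history.
    del messages[:removed]
    return removed
```

Trimming down to half the threshold, rather than to just under it, means a trim runs only once per 5,000 new messages instead of on every append.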
yangdx
387d817fc2
Remove trailing colons from queue names in function wrappers
2025-09-06 00:53:05 +08:00
yangdx
de972f6222
Rename method for clarity and improve code readability
...
- Rename _process_entity_relation_graph to _process_extract_entities
2025-09-04 11:48:31 +08:00
Tong Da
dc7ce98c7e
Add search interface to lightrag.
2025-09-01 02:40:40 +08:00
yangdx
1a015a7015
Add queue_name parameter to priority_limit_async_func_call for better logging
...
• Add queue_name parameter to decorator
• Update all log messages with queue names
• Pass specific names for LLM and embedding
2025-08-31 23:47:22 +08:00
yangdx
925e631a9a
refac: Add robust timeout handling for LLM requests
2025-08-29 13:50:35 +08:00
yangdx
ff0a18e08c
Unify SUMMARY_LANGUAGE and ENTITY_TYPES implementation method
2025-08-27 12:23:22 +08:00
Thibo Rosemplatt
c3aabfc251
Merge branch 'main' into entityTypesServerSupport
2025-08-26 21:48:20 +02:00
yangdx
d3623cc9ae
fix: resolve infinite loop risk in _handle_entity_relation_summary
...
- Ensure oversized descriptions are force-merged with subsequent ones
- Add len(current_list) <= 2 termination condition to guarantee convergence
- Implement token-based truncation in _summarize_descriptions to prevent overflow
2025-08-26 21:58:31 +08:00
yangdx
6bcfe696ee
feat: add output length recommendation and description type to LLM summary
...
- Add SUMMARY_LENGTH_RECOMMENDED parameter (600 tokens)
- Optimize prompt template for LLM summary
2025-08-26 14:41:12 +08:00
yangdx
cb0fe38b9a
Fix linting
2025-08-26 02:22:34 +08:00
yangdx
de2daf6565
refac: Rename summary_max_tokens to summary_context_size and add comprehensive parameter validation for summary configuration
...
- Update algorithm logic in operate.py for better token management
- Fix health endpoint to use correct parameter names
2025-08-26 01:35:50 +08:00
yangdx
0b1b264a5d
refactor: optimize graph lock scope in document deletion
...
- Move dependency analysis outside graph database lock
- Add persistence call before lock release to prevent dirty reads
2025-08-25 17:46:32 +08:00
Thibo Rosemplatt
d054ec5d00
Added entity_types as a user-defined variable (via .env)
2025-08-23 20:16:11 +02:00
yangdx
bf43e1b8c1
fix: Resolve default rerank config problem when env var missing
...
- Read config from selected_rerank_func when env var missing
- Make api_key optional for rerank function
- Add response format validation with proper error handling
- Update Cohere rerank default to official API endpoint
2025-08-23 01:07:59 +08:00
yangdx
0e67ead8fa
Rename MAX_TOKENS to SUMMARY_MAX_TOKENS for clarity
2025-08-21 10:15:20 +08:00
yangdx
9b7ed84e05
Improve document deletion error handling and message consistency
...
- Standardize deletion log messages
- Add try-catch for file operations
- Improve enqueued file error handling
2025-08-20 11:01:24 +08:00
yangdx
485c4b7de7
Change document deletion warnings to info level logging
2025-08-20 03:28:42 +08:00
yangdx
806081645f
Refactor text cleaning to use sanitize_text_for_encoding consistently
...
• Replace clean_text with sanitize_text
• Remove deprecated clean_text function
• Add whitespace trimming to sanitizer
• Improve UTF-8 encoding safety
• Consolidate text cleaning logic
2025-08-19 19:20:01 +08:00
yangdx
e38df464ea
Ensure front-end file type uploads are synchronized with back-end
2025-08-19 15:10:13 +08:00
yangdx
1c4d6fde58
Change log level from info to debug for document storage message
2025-08-18 20:04:29 +08:00
yangdx
377f1a022e
fix: reset PROCESSING/FAILED docs to PENDING at the beginning of the document processing pipeline
...
- Reset documents with PROCESSING/FAILED status to PENDING when they pass consistency checks
- Update doc_status storage and clear error messages/metadata on reset
2025-08-18 00:49:52 +08:00
yangdx
add8b07a21
Improve logging messages for document processing clarity
2025-08-18 00:22:04 +08:00
yangdx
1941df9cf6
Simplify warning message format for document deletion
2025-08-17 13:30:55 +08:00
yangdx
3e4214cef3
Standardize document deletion warning messages for consistency
2025-08-17 09:35:46 +08:00
yangdx
cceb46b320
fix: subdirectories are no longer processed during file scans
...
• Change rglob to glob for file scanning
• Simplify error logging messages
2025-08-16 23:46:33 +08:00
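The switch from `rglob` to `glob` in the commit above changes whether subdirectories are scanned. A minimal sketch of the difference using the standard `pathlib` API follows; both helper names are hypothetical and do not reflect the project's actual scanning code.

```python
from pathlib import Path
from typing import List


def list_top_level_files(input_dir: Path) -> List[Path]:
    # glob("*") matches only direct children, so files inside
    # subdirectories are not returned.
    return sorted(p for p in input_dir.glob("*") if p.is_file())


def list_all_files(input_dir: Path) -> List[Path]:
    # rglob("*") walks the whole tree, including nested directories.
    return sorted(p for p in input_dir.rglob("*") if p.is_file())
```

For a directory containing `a.txt` and `sub/b.txt`, the first helper returns only `a.txt`, while the second returns both files.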
yangdx
f5b0c3d38c
feat: Record file extraction error status in document pipeline
...
- Add apipeline_enqueue_error_documents function to LightRAG class for recording file processing errors in doc_status storage
- Enhance pipeline_enqueue_file with detailed error handling for all file processing stages:
  * File access errors (permissions, not found)
  * UTF-8 encoding errors
  * Format-specific processing errors (PDF, DOCX, PPTX, XLSX)
  * Content validation errors
  * Unsupported file type errors
This implementation ensures all file extraction failures are properly tracked and recorded in the doc_status storage system, providing better visibility into document processing issues and enabling improved error monitoring and debugging capabilities.
2025-08-16 23:08:52 +08:00
yangdx
ca4c18baaa
Preserve failed documents during data consistency validation for manual review
2025-08-16 22:29:46 +08:00
yangdx
e1310c5262
Optimize document processing pipeline by removing duplicate step
2025-08-16 17:23:01 +08:00
yangdx
5591ef3ac8
Fix document filtering logic and improve logging for ignored docs
2025-08-16 17:22:08 +08:00