LightRAG/lightrag
yangdx f5b0c3d38c feat: Recording file extraction error status to document pipeline
- Add apipeline_enqueue_error_documents function to LightRAG class for recording file processing errors in doc_status storage
- Enhance pipeline_enqueue_file with detailed error handling for all file processing stages:
  * File access errors (permissions, not found)
  * UTF-8 encoding errors
  * Format-specific processing errors (PDF, DOCX, PPTX, XLSX)
  * Content validation errors
  * Unsupported file type errors

With this change, every file extraction failure is recorded in the doc_status storage, which makes document processing problems visible and easier to monitor and debug.
2025-08-16 23:08:52 +08:00
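
The staged error handling described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `InMemoryDocStatus`, `record_error`, `extract_text`, the stage labels, and `SUPPORTED_SUFFIXES` are all hypothetical names chosen for the example, and the format-specific stage (PDF, DOCX, PPTX, XLSX) is left out to keep it dependency-free.

```python
import asyncio
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, Optional

# Assumption: only plain-text formats are handled in this sketch.
SUPPORTED_SUFFIXES = {".txt", ".md"}


@dataclass
class InMemoryDocStatus:
    """Hypothetical stand-in for a doc_status storage backend."""

    records: Dict[str, Dict[str, str]] = field(default_factory=dict)

    async def record_error(self, file_path: str, stage: str, detail: str) -> None:
        # Store one failure record per file, keyed by its path.
        self.records[file_path] = {
            "status": "failed",
            "stage": stage,
            "error_msg": detail,
        }


async def extract_text(path: Path, doc_status: InMemoryDocStatus) -> Optional[str]:
    """Return the file's text, or None after recording why extraction failed."""
    # Stage 1: file access (not found, permissions, path is a directory, ...).
    try:
        raw = path.read_bytes()
    except OSError as exc:
        await doc_status.record_error(str(path), "file_access", repr(exc))
        return None

    # Stage 2: unsupported file type.
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        await doc_status.record_error(
            str(path), "unsupported_type", f"unsupported suffix {path.suffix!r}"
        )
        return None

    # Stage 3: UTF-8 decoding.
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        await doc_status.record_error(str(path), "utf8_decoding", repr(exc))
        return None

    # Stage 4: content validation (reject empty or whitespace-only files).
    if not text.strip():
        await doc_status.record_error(str(path), "content_validation", "empty content")
        return None

    return text
```

Each stage returns early after writing a failure record, so a caller only needs to check for `None` and can later inspect `doc_status.records` to see which stage rejected a given file.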
api feat: Recording file extraction error status to document pipeline 2025-08-16 23:08:52 +08:00
kg Update Neo4j database naming in env.example 2025-08-15 19:14:38 +08:00
llm Fix linting 2025-08-09 08:41:41 +08:00
tools
__init__.py Bump core version to 1.4.7 and api version to 0198 2025-08-04 10:55:41 +08:00
base.py feat: KG related chunks selection by vector similarity 2025-08-13 18:16:42 +08:00
constants.py Change KG chunk selection default to VECTOR 2025-08-13 23:10:42 +08:00
exceptions.py
lightrag.py feat: Recording file extraction error status to document pipeline 2025-08-16 23:08:52 +08:00
llm.py
namespace.py Refac: Add workspace information to all logger output for all storage type 2025-08-12 01:19:09 +08:00
operate.py Add get_vectors_by_ids method and filter out vector data from query results 2025-08-15 16:32:26 +08:00
prompt.py Unify entity extraction prompt between passes 2025-07-27 23:06:55 +08:00
rerank.py Fix: rename rerank parameter from top_k to top_n 2025-07-20 00:26:27 +08:00
types.py
utils.py Add chunk tracking system to monitor chunk sources and frequencies 2025-08-14 22:58:26 +08:00
utils_graph.py Fix GRAPH_FIELD_SEP import typo 2025-06-29 01:28:39 +05:00