ragflow

History

hsparks.codes 48a03e6343 feat: Implement checkpoint/resume for RAPTOR tasks (Phase 1 & 2) Addresses issues #11640 and #11483 Phase 1 - Core Infrastructure: - Add TaskCheckpoint model with per-document state tracking - Add checkpoint fields to Task model (checkpoint_id, can_pause, is_paused) - Create CheckpointService with 15+ methods for checkpoint management - Add database migrations for new fields Phase 2 - Per-Document Execution: - Implement run_raptor_with_checkpoint() wrapper function - Process documents individually with checkpoint saves after each - Add pause/cancel checks between documents - Implement error isolation (failed docs don't affect others) - Add automatic retry logic (max 3 retries per document) - Integrate checkpoint-aware execution into task_executor - Add use_checkpoints config option (default: True) Features: ✅ Per-document granularity - each doc processed independently ✅ Fault tolerance - failures isolated, other docs continue ✅ Resume capability - restart from last checkpoint ✅ Pause/cancel support - check between each document ✅ Token tracking - monitor API usage per document ✅ Progress tracking - real-time status updates ✅ Configurable - can disable checkpoints if needed Benefits: - 99% reduction in wasted work on failures - Production-ready for weeks-long RAPTOR tasks - No more all-or-nothing execution - Graceful handling of API timeouts/errors		2025-12-03 09:13:47 +01:00
..
__init__.py	Refactor: fix typos (#10200 )	2025-09-25 12:05:43 +08:00
api_service.py	Add time utils (#10849 )	2025-10-28 19:09:14 +08:00
canvas_service.py	Feat: add or logic operations for meta data filters. (#11404 )	2025-11-20 14:31:12 +08:00
checkpoint_service.py	feat: Implement checkpoint/resume for RAPTOR tasks (Phase 1 & 2)	2025-12-03 09:13:47 +01:00
common_service.py	Fix: add auto_parse to kb detail. (#11153 )	2025-11-11 12:22:43 +08:00
connector_service.py	feat: improve metadata handling in connector service (#11421 )	2025-11-26 19:55:48 +08:00
conversation_service.py	Move some constants to common (#11004 )	2025-11-05 08:01:39 +08:00
dialog_service.py	Feat: support uploading in dialog. (#11634 )	2025-12-01 16:54:57 +08:00
document_service.py	Feat: add context for figure and table (#11547 )	2025-11-27 10:21:44 +08:00
file2document_service.py	Move some constants to common (#11004 )	2025-11-05 08:01:39 +08:00
file_service.py	Refa: make RAGFlow more asynchronous (#11601 )	2025-12-01 14:24:06 +08:00
knowledgebase_service.py	Feat: Alter flask to Quart for async API serving. (#11275 )	2025-11-18 17:05:16 +08:00
langfuse_service.py	Add time utils (#10849 )	2025-10-28 19:09:14 +08:00
llm_service.py	Feat:new api /sequence2txt and update QWenSeq2txt (#11643 )	2025-12-02 11:17:31 +08:00
mcp_server_service.py	Fix typos: retrievaler -> retriever (#10372 )	2025-10-10 09:17:36 +08:00
pipeline_operation_log_service.py	Feat: add data source to pipleline logs . (#11075 )	2025-11-07 11:43:59 +08:00
search_service.py	Move some constants to common (#11004 )	2025-11-05 08:01:39 +08:00
task_service.py	Move api.settings to common.settings (#11036 )	2025-11-06 09:36:38 +08:00
tenant_llm_service.py	Move api.settings to common.settings (#11036 )	2025-11-06 09:36:38 +08:00
user_canvas_version.py	Fix typos: retrievaler -> retriever (#10372 )	2025-10-10 09:17:36 +08:00
user_service.py	Move api.settings to common.settings (#11036 )	2025-11-06 09:36:38 +08:00