- Add ProgressTracker class for real-time progress parsing from mineru stdout
- Implement intelligent stall-based timeout (5min no progress triggers warning)
- Add vlm-direct backend mode to directly connect to vLLM server
- Improve API mode with streaming response, HTTP session reuse, and auto-retry
- Add heartbeat updates during long-running operations
- Support dynamic timeout based on page count

This addresses the deadlock issue where tasks hang indefinitely when using MinerU API backend, even after the API returns 200 OK.

| Name | | |
|---|---|---|
| resume | ||
| __init__.py | ||
| docling_parser.py | ||
| docx_parser.py | ||
| excel_parser.py | ||
| figure_parser.py | ||
| html_parser.py | ||
| json_parser.py | ||
| markdown_parser.py | ||
| mineru_parser.py | ||
| pdf_parser.py | ||
| ppt_parser.py | ||
| tcadp_parser.py | ||
| txt_parser.py | ||
| utils.py | ||
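
The stall-based timeout and page-count-based timeout mentioned in the commit message above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in `mineru_parser.py`: `StallTracker`, `dynamic_timeout`, the `done/total` progress pattern, and the base and per-page constants are all hypothetical. Only the 5-minute stall window comes from the commit message.

```python
import re
import time

# Hypothetical progress pattern ("3/10"); the real mineru stdout format may differ.
PROGRESS_RE = re.compile(r"(\d+)\s*/\s*(\d+)")


class StallTracker:
    """Remembers when progress last advanced and reports stalls."""

    def __init__(self, stall_seconds=300.0, clock=time.monotonic):
        # 300 s matches the "5min no progress" window from the commit message.
        self.stall_seconds = stall_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.done = 0
        self.total = 0
        self.last_progress_at = clock()

    def feed(self, line):
        """Parse one stdout line; return True only if progress advanced."""
        match = PROGRESS_RE.search(line)
        if not match:
            return False
        done, total = int(match.group(1)), int(match.group(2))
        # Same total and no higher count means nothing new happened.
        if done <= self.done and total == self.total:
            return False
        self.done, self.total = done, total
        self.last_progress_at = self.clock()
        return True

    def is_stalled(self):
        """True once stall_seconds have passed with no progress."""
        return self.clock() - self.last_progress_at >= self.stall_seconds


def dynamic_timeout(page_count, base_seconds=600.0, per_page_seconds=30.0):
    """Assumed linear rule: a fixed base plus an allowance per page."""
    return base_seconds + per_page_seconds * max(page_count, 0)
```

A caller would feed each stdout line to `feed()` and poll `is_stalled()` from its read loop, logging a warning or terminating the subprocess when it returns true. Measuring time since the last progress update, rather than total elapsed time, lets long documents finish while still catching jobs that have stopped advancing.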