ragflow/deepdoc/parser
Debug Doctor 3e19044dee
Feat: add OCR's muti-gpus and parallel processing support (#5972)
### What problem does this PR solve?

Add multi-GPU and parallel processing support for OCR.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

@yuzhichang I've tried to address the review comments in #5697. OCR jobs
can now run on both CPU and GPU. (By the way, I've run into a
“Generate embedding error” issue, #5954, which might be caused by my
outdated GPUs; I'm not sure.) Please review and let me know your suggestions.

GPU:

![gpu_ocr](https://github.com/user-attachments/assets/0ee2ecfb-a665-4e50-8bc7-15941b9cd80e)

![smi](https://github.com/user-attachments/assets/a2312f8c-cf24-443d-bf89-bec50503546d)

CPU:

![cpu_ocr](https://github.com/user-attachments/assets/1ba6bb0b-94df-41ea-be79-790096da4bf1)
2025-03-17 11:58:40 +08:00
..
resume Fix:when start with source code not in docker env report 'UnicodeDec… (#5802) 2025-03-10 11:22:06 +08:00
__init__.py Update comments (#4569) 2025-01-21 20:52:28 +08:00
docx_parser.py Update comments (#4569) 2025-01-21 20:52:28 +08:00
excel_parser.py Feat: add CSV file parsing support (#5989) 2025-03-12 19:20:50 +08:00
html_parser.py Update comments (#4569) 2025-01-21 20:52:28 +08:00
json_parser.py Update comments (#4569) 2025-01-21 20:52:28 +08:00
markdown_parser.py Feat:Optimize the table extraction logic in the Markdown parser: (#5663) 2025-03-07 17:02:35 +08:00
pdf_parser.py Feat: add OCR's muti-gpus and parallel processing support (#5972) 2025-03-17 11:58:40 +08:00
ppt_parser.py refactor: no need to inherit in python3 clean the code (#5659) 2025-03-05 18:03:53 +08:00
txt_parser.py Fix: delimiter issue. (#5720) 2025-03-06 17:51:22 +08:00
utils.py Update comments (#4569) 2025-01-21 20:52:28 +08:00