ragflow/deepdoc/vision
XIANG LI f631073ac2
Fix OCR GPU provider mem limit handling (#10407)
### What problem does this PR solve?

- Running DeepDoc OCR on large PDFs inside the GPU docker-compose setup would intermittently fail with `[ONNXRuntimeError] ... p2o.Clip.6 ... Available memory of 0 is smaller than requested bytes ...`
- Root cause: `load_model()` in `deepdoc/vision/ocr.py` passed `device_id=None` through unchanged. The comparison `torch.cuda.device_count() > device_id` then raised a `TypeError`, the helper returned `False`, and ONNXRuntime quietly fell back to `CPUExecutionProvider` with the hard-coded 512 MB limit, which then triggered the allocator failure.
- Environment where this reproduces: Windows 11, AMD 5900X, 64 GB RAM, RTX 3090 (24 GB), `docker-compose-gpu.yml` from upstream, default DeepDoc + GraphRAG parser settings, ingesting a heavy PDF such as `《内科学》(第10版).pdf` (*Internal Medicine*, 10th edition; ~180 MB).
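
The failure mode described in the root cause can be reproduced without a GPU. The sketch below is a hypothetical stand-in, not the actual helper from `ocr.py`: the function name and the `device_count` parameter (which takes the place of `torch.cuda.device_count()`) are assumptions made for illustration. It shows how comparing an integer with `None` raises a `TypeError` that a broad `except` turns into a silent `False`.

```python
def cuda_is_available_sketch(device_id, device_count=1):
    """Hypothetical stand-in for the pre-fix availability check.

    device_count replaces torch.cuda.device_count() so the example
    runs on machines without CUDA.
    """
    try:
        # With device_id=None this comparison raises TypeError in Python 3.
        if device_count > device_id:
            return True
    except Exception:
        # The error is swallowed, so the caller only sees "no GPU".
        return False
    return False


print(cuda_is_available_sketch(None))  # False: TypeError was swallowed
print(cuda_is_available_sketch(0))     # True: device 0 exists
```

Because the exception never surfaces, the only visible symptom is the later allocator error on the fallback provider.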

Fixes:

- Normalize `device_id` to 0 when it is `None` before calling any CUDA APIs, so the GPU path is considered available.
- Allow configuring the CUDA provider’s memory cap via `OCR_GPU_MEM_LIMIT_MB` (default 2048 MB) and expose `OCR_ARENA_EXTEND_STRATEGY`; the calculated byte limit is logged to confirm the effective settings.
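
A minimal sketch of the two fixes above, under stated assumptions: the function name `resolve_gpu_settings` and its `env` parameter are hypothetical, the fallback value for `OCR_ARENA_EXTEND_STRATEGY` is assumed to be `kNextPowerOfTwo` (the value in the example log), and one MB is taken as 1024 * 1024 bytes. The dictionary keys are the ONNX Runtime CUDA provider options `device_id`, `gpu_mem_limit`, and `arena_extend_strategy`. This is not the code from the PR.

```python
import os


def resolve_gpu_settings(device_id=None, env=None):
    """Hypothetical helper: build CUDA provider options from the environment."""
    env = os.environ if env is None else env

    # Fix 1: treat a missing device id as device 0 before any comparison.
    if device_id is None:
        device_id = 0

    # Fix 2: read the memory cap in MB (default 2048) and convert to bytes.
    limit_mb = int(env.get("OCR_GPU_MEM_LIMIT_MB", "2048"))
    # Assumed fallback; the source does not state the default strategy.
    strategy = env.get("OCR_ARENA_EXTEND_STRATEGY", "kNextPowerOfTwo")

    return {
        "device_id": device_id,
        "gpu_mem_limit": limit_mb * 1024 * 1024,
        "arena_extend_strategy": strategy,
    }


options = resolve_gpu_settings(None, env={})
print(options["device_id"], options["gpu_mem_limit"])  # 0 2147483648
```

With ONNX Runtime installed, a dictionary like this could be passed as `providers=[("CUDAExecutionProvider", options)]` when constructing an `onnxruntime.InferenceSession`.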

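After the change, `ragflow_server.log` shows, for example, `load_model ... uses GPU (device 0, gpu_mem_limit=21474836480, arena_strategy=kNextPowerOfTwo)`, and the same document finishes OCR without allocator errors. (A limit of 21474836480 bytes corresponds to `OCR_GPU_MEM_LIMIT_MB=20480`, not the 2048 MB default.)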

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-10-10 11:03:12 +08:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | Feat: add support for the Ascend layout recognizer (#10105) | 2025-09-16 09:51:15 +08:00 |
| `layout_recognizer.py` | Feat: add support for the Ascend layout recognizer (#10105) | 2025-09-16 09:51:15 +08:00 |
| `ocr.py` | Fix OCR GPU provider mem limit handling (#10407) | 2025-10-10 11:03:12 +08:00 |
| `operators.py` | refactor: no need to inherit in python3 clean the code (#5659) | 2025-03-05 18:03:53 +08:00 |
| `postprocess.py` | refactor: no need to inherit in python3 clean the code (#5659) | 2025-03-05 18:03:53 +08:00 |
| `recognizer.py` | Fix: judge not empty before delete (#10099) | 2025-09-15 17:49:52 +08:00 |
| `seeit.py` | Fix typo (#9766) | 2025-08-27 18:56:40 +08:00 |
| `t_ocr.py` | Refa: PARALLEL_DEVICES is a static parameter. (#6168) | 2025-03-17 16:49:54 +08:00 |
| `t_recognizer.py` | Fix typo in code (#8327) | 2025-06-18 09:41:09 +08:00 |
| `table_structure_recognizer.py` | Feat: add support for the Ascend table structure recognizer (#10110) | 2025-09-16 13:57:06 +08:00 |