### What problem does this PR solve?

Add MinerU parser. #3945, #8092.

- Set `MINERU_EXECUTABLE` to the MinerU executable path; defaults to `mineru`.
- Set `MINERU_DELETE_OUTPUT=0` to preserve MinerU's output; the default is `1`, which deletes temporary output.
- Set `MINERU_OUTPUT_DIR` to choose the MinerU output directory (uses the temporary directory if unset).

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
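The three environment variables above could be resolved roughly as follows. This is a minimal sketch, not the code from `mineru_parser.py`: the names `MinerUSettings`, `load_mineru_settings`, `prepare_output_dir`, and `cleanup_output_dir` are hypothetical. It also assumes that only the literal value `0` disables deletion, and it does not invoke MinerU itself.

```python
import os
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class MinerUSettings:
    """Hypothetical container for the three documented variables."""
    executable: str
    delete_output: bool
    output_dir: Optional[Path]


def load_mineru_settings(env: Mapping[str, str] = os.environ) -> MinerUSettings:
    """Read the MINERU_* variables, applying the documented defaults."""
    raw_dir = env.get("MINERU_OUTPUT_DIR", "").strip()
    return MinerUSettings(
        # Documented default: the bare command name `mineru`.
        executable=env.get("MINERU_EXECUTABLE", "mineru"),
        # Assumption: only the literal "0" preserves the output.
        delete_output=env.get("MINERU_DELETE_OUTPUT", "1").strip() != "0",
        # None means "fall back to a temporary directory".
        output_dir=Path(raw_dir) if raw_dir else None,
    )


def prepare_output_dir(settings: MinerUSettings) -> Path:
    """Return the directory MinerU output should be written to."""
    if settings.output_dir is not None:
        settings.output_dir.mkdir(parents=True, exist_ok=True)
        return settings.output_dir
    return Path(tempfile.mkdtemp(prefix="mineru_"))


def cleanup_output_dir(path: Path, settings: MinerUSettings) -> None:
    """Remove the output directory unless the user asked to keep it.

    Assumption: the source does not say whether a user-chosen
    MINERU_OUTPUT_DIR is also deleted; this sketch treats both cases alike.
    """
    if settings.delete_output:
        shutil.rmtree(path, ignore_errors=True)
```

With no variables set, `load_mineru_settings({})` yields the executable `mineru`, deletion enabled, and no explicit output directory, so `prepare_output_dir` falls back to a fresh temporary directory.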
| Name | | |
|---|---|---|
| resume | ||
| __init__.py | ||
| docx_parser.py | ||
| excel_parser.py | ||
| figure_parser.py | ||
| html_parser.py | ||
| json_parser.py | ||
| markdown_parser.py | ||
| mineru_parser.py | ||
| pdf_parser.py | ||
| ppt_parser.py | ||
| txt_parser.py | ||
| utils.py | ||