ragflow/rag/app
Yongteng Lei 5200711441
Feat: add support for multi-column PDF parsing (#10475)
### What problem does this PR solve?

Add support for multi-column PDF parsing. See #9878 and #9919.

Two-column sample:
<img width="1885" height="1020" alt="image"
src="https://github.com/user-attachments/assets/0270c028-2db8-4ca6-a4b7-cd5830882d28"
/>

Three-column sample:
<img width="1881" height="992" alt="image"
src="https://github.com/user-attachments/assets/9ee88844-d5b1-4927-9e4e-3bd810d6e03a"
/>

Single-column sample:
<img width="1883" height="1042" alt="image"
src="https://github.com/user-attachments/assets/e93d3d18-43c3-4067-b5fa-e454ed0ab093"
/>
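
For readers unfamiliar with the problem, the sketch below illustrates one simple way to recover reading order on a multi-column page: group text boxes into columns by looking for horizontal gaps, then read each column top to bottom, left to right. It is a minimal illustration under stated assumptions, not the implementation in this PR. The `Box` type, the `assign_columns` and `reading_order` helpers, and the `min_gap` threshold are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """A hypothetical text box; coordinates are in page units."""
    x0: float   # left edge
    x1: float   # right edge
    top: float  # distance from the top of the page
    text: str


def assign_columns(boxes: List[Box], min_gap: float = 10.0) -> List[List[Box]]:
    """Group boxes into columns, left to right.

    A new column starts whenever a box's left edge lies more than
    `min_gap` to the right of everything seen so far. `min_gap` is an
    assumed tuning parameter, not a value taken from the PR.
    """
    columns: List[List[Box]] = []
    right_edge = 0.0  # right-most x1 in the current column
    for box in sorted(boxes, key=lambda b: b.x0):
        if columns and box.x0 <= right_edge + min_gap:
            columns[-1].append(box)
            right_edge = max(right_edge, box.x1)
        else:
            columns.append([box])
            right_edge = box.x1
    return columns


def reading_order(boxes: List[Box], min_gap: float = 10.0) -> List[str]:
    """Return box texts column by column, each column top to bottom."""
    ordered: List[str] = []
    for column in assign_columns(boxes, min_gap):
        ordered.extend(b.text for b in sorted(column, key=lambda b: b.top))
    return ordered


# Two-column page: a naive top-to-bottom sort would interleave the columns.
page = [
    Box(310, 550, 100, "C"),
    Box(50, 290, 100, "A"),
    Box(50, 290, 200, "B"),
    Box(310, 550, 200, "D"),
]
print(reading_order(page))
```

A known limitation of this sketch: a full-width element such as a title bridges the gutter and merges all columns into one, so a real parser would need to treat page-spanning boxes separately.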



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-10-11 18:46:09 +08:00
__init__.py
audio.py
book.py
email.py Feat: Use data pipeline to visualize the parsing configuration of the knowledge base (#10423) 2025-10-09 12:36:19 +08:00
laws.py Add tree_merge for law parsers, significantly outperforming hierarchical_merge (#10202) 2025-09-22 16:33:21 +08:00
manual.py
naive.py Feat: add support for multi-column PDF parsing (#10475) 2025-10-11 18:46:09 +08:00
one.py
paper.py
picture.py
presentation.py
qa.py
resume.py
table.py
tag.py Fix typos: retrievaler -> retriever (#10372) 2025-10-10 09:17:36 +08:00