ragflow/rag/app
Yongteng Lei 382458ace7
Feat: advanced markdown parsing (#9607)
### What problem does this PR solve?

Use AST-based parsing to handle Markdown more accurately, so that
components are not cut off at chunk boundaries. Related: #9564
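
To illustrate the idea, here is a minimal sketch of block-aware chunking. It is not the code from this PR: a simple line scanner stands in for a real Markdown AST parser, and the names `split_blocks`, `pack_blocks`, and `max_chars` are hypothetical. It assumes the "components" to protect are top-level blocks such as fenced code blocks and tables.

```python
from typing import List

FENCE = "`" * 3  # fenced code block delimiter


def split_blocks(markdown: str) -> List[str]:
    """Split Markdown into top-level blocks.

    Fenced code blocks are kept whole even when they contain blank lines.
    Consecutive non-blank lines (paragraphs, tables, lists) form one block.
    """
    blocks: List[str] = []
    current: List[str] = []
    in_fence = False

    def flush() -> None:
        if current:
            blocks.append("\n".join(current))
            current.clear()

    for line in markdown.splitlines():
        is_fence = line.lstrip().startswith(FENCE)
        if in_fence:
            current.append(line)
            if is_fence:  # closing fence ends the code block
                flush()
                in_fence = False
        elif is_fence:  # opening fence starts a new block
            flush()
            current.append(line)
            in_fence = True
        elif not line.strip():  # blank line separates blocks
            flush()
        else:
            current.append(line)
    flush()
    return blocks


def pack_blocks(blocks: List[str], max_chars: int) -> List[str]:
    """Greedily pack whole blocks into chunks of at most max_chars.

    A block is never split: one that exceeds max_chars on its own
    becomes an oversized chunk instead of being cut.
    """
    chunks: List[str] = []
    current = ""
    for block in blocks:
        candidate = block if not current else current + "\n\n" + block
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Calling `pack_blocks(split_blocks(text), max_chars=512)` yields chunks whose boundaries always fall between blocks. The trade-off in this sketch is that a single large block can exceed the size limit.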

<img width="1746" height="993" alt="image"
src="https://github.com/user-attachments/assets/4aaf4bf6-5714-4d48-a9cf-864f59633f7f"
/>

<img width="1739" height="982" alt="image"
src="https://github.com/user-attachments/assets/dc00233f-7a55-434f-bbb7-74ce7f57a6cf"
/>

<img width="559" height="100" alt="image"
src="https://github.com/user-attachments/assets/4a556b5b-d9c6-4544-a486-8ac342bd504e"
/>


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-21 09:36:18 +08:00

| File | Last commit | Date |
|---|---|---|
| __init__.py | Update comments (#4569) | 2025-01-21 20:52:28 +08:00 |
| audio.py | Refa: OpenAI whisper-1 (#9552) | 2025-08-19 16:41:18 +08:00 |
| book.py | Feat: Redesign and refactor agent module (#9113) | 2025-07-30 19:41:09 +08:00 |
| email.py | Feat: Redesign and refactor agent module (#9113) | 2025-07-30 19:41:09 +08:00 |
| laws.py | Feat: Redesign and refactor agent module (#9113) | 2025-07-30 19:41:09 +08:00 |
| manual.py | Feat: Redesign and refactor agent module (#9113) | 2025-07-30 19:41:09 +08:00 |
| naive.py | Feat: advanced markdown parsing (#9607) | 2025-08-21 09:36:18 +08:00 |
| one.py | Feat: Redesign and refactor agent module (#9113) | 2025-07-30 19:41:09 +08:00 |
| paper.py | Feat: Redesign and refactor agent module (#9113) | 2025-07-30 19:41:09 +08:00 |
| picture.py | Feat: add image preview to retrieval test. (#7610) | 2025-05-13 14:30:36 +08:00 |
| presentation.py | Fix: PlainParser using fix in presentation (#9239) | 2025-08-05 17:48:18 +08:00 |
| qa.py | Fix: Solve the OOM issue when passing large PDF files while using QA chunking method. (#8464) | 2025-06-25 10:25:45 +08:00 |
| resume.py | Update comments (#4569) | 2025-01-21 20:52:28 +08:00 |
| table.py | Support the case of one cell split by multiple columns. (#9225) | 2025-08-11 17:17:56 +08:00 |
| tag.py | Error storing tag in Redis (#7541) | 2025-05-09 10:17:09 +08:00 |