ragflow/deepdoc/parser
Yuhao Tsui 7b6a5ffaff
Fix: page_chars attribute does not exist in some formats of PDF (#3796)
### What problem does this PR solve?

In #3335, someone suggested upgrading to pdfplumber==0.11.1, but that
did not solve the problem.
The actual trigger is the special formatting in some PDFs, which leaves
the `page_chars` attribute unset.
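
For illustration only, here is a minimal sketch of the kind of defensive guard that avoids this class of error. It is not the actual patch: `ParserSketch`, `load_chars`, and `chars_on_page` are hypothetical names, and only the attribute name `page_chars` comes from this PR's title.

```python
class ParserSketch:
    """Hypothetical stand-in for a PDF parser; not the real RAGFlow class."""

    def load_chars(self, pages):
        # Assumption: each page is an iterable of per-character dicts.
        self.page_chars = [list(page) for page in pages]

    def chars_on_page(self, page_index):
        # page_chars may never have been set if loading failed or was
        # skipped for an unusually formatted PDF, so read it defensively.
        page_chars = getattr(self, "page_chars", None)
        if not page_chars or not 0 <= page_index < len(page_chars):
            return []
        return page_chars[page_index]
```

With this guard, a caller gets an empty list instead of an `AttributeError` when the attribute is missing.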

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-03 11:08:06 +08:00
..
resume Edit chunk shall update instead of insert it (#3709) 2024-11-28 13:00:38 +08:00
__init__.py add support for eml file parser (#1768) 2024-08-06 16:42:14 +08:00
docx_parser.py Edit chunk shall update instead of insert it (#3709) 2024-11-28 13:00:38 +08:00
excel_parser.py Update readme and add license (#1018) 2024-06-01 16:24:10 +08:00
html_parser.py search between multiple indices for team function (#3079) 2024-10-29 13:19:01 +08:00
json_parser.py Introduced beartype (#3460) 2024-11-18 17:38:17 +08:00
markdown_parser.py Support table for markdown file in general parser (#1278) 2024-06-27 14:38:35 +08:00
pdf_parser.py Fix: page_chars attribute does not exist in some formats of PDF (#3796) 2024-12-03 11:08:06 +08:00
ppt_parser.py Format file format from Windows/dos to Unix (#1949) 2024-08-15 09:17:36 +08:00
txt_parser.py fix bug about fetching knowledge graph (#3394) 2024-11-14 12:29:15 +08:00
utils.py rename get_txt to get_text (#2649) 2024-09-29 12:47:09 +08:00