fix: Remove custom pdf handling and rely on filetype library (#1694)
<!-- .github/pull_request_template.md -->

## Description
Remove custom PDF handling and let filetype handle PDF documents

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the issue/feature**
- [ ] My code follows the project's coding standards and style guidelines
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.
commit 76396d5d27
1 changed file with 0 additions and 47 deletions
@@ -58,53 +58,6 @@ txt_file_type = TxtFileType()
 filetype.add_type(txt_file_type)
 
 
-class CustomPdfMatcher(filetype.Type):
-    """
-    Match PDF file types based on MIME type and extension.
-
-    Public methods:
-    - match
-
-    Instance variables:
-    - MIME: The MIME type of the PDF.
-    - EXTENSION: The file extension of the PDF.
-    """
-
-    MIME = "application/pdf"
-    EXTENSION = "pdf"
-
-    def __init__(self):
-        super(CustomPdfMatcher, self).__init__(
-            mime=CustomPdfMatcher.MIME, extension=CustomPdfMatcher.EXTENSION
-        )
-
-    def match(self, buf):
-        """
-        Determine if the provided buffer is a PDF file.
-
-        This method checks for the presence of the PDF signature in the buffer.
-
-        Raises:
-        - TypeError: If the buffer is not of bytes type.
-
-        Parameters:
-        -----------
-
-        - buf: The buffer containing the data to be checked.
-
-        Returns:
-        --------
-
-        Returns True if the buffer contains a PDF signature, otherwise returns False.
-        """
-        return b"PDF-" in buf
-
-
-custom_pdf_matcher = CustomPdfMatcher()
-
-filetype.add_type(custom_pdf_matcher)
-
-
 def guess_file_type(file: BinaryIO) -> filetype.Type:
     """
     Guess the file type from the given binary file stream.
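
The commit message does not say what went wrong with the custom matcher. One plausible issue, offered here as an assumption rather than the author's stated reason: the removed `match` method returns true for any buffer that contains `PDF-` anywhere, so non-PDF content that merely mentions that string would be classified as a PDF. The sketch below uses only the standard library. `substring_match`, `prefix_match`, and the sample buffers are hypothetical names for illustration, and `prefix_match` is not the filetype library's actual implementation.

```python
# Leading bytes of a PDF header such as "%PDF-1.7".
PDF_MAGIC = b"%PDF"


def substring_match(buf: bytes) -> bool:
    """Same test as the removed CustomPdfMatcher.match: b"PDF-" anywhere in buf."""
    return b"PDF-" in buf


def prefix_match(buf: bytes) -> bool:
    """Hypothetical stricter check: buf must start with the PDF magic bytes."""
    return buf.startswith(PDF_MAGIC)


# Illustrative buffers (assumed, not taken from the commit).
pdf_header = b"%PDF-1.7\n1 0 obj\n"
plain_text = b"Release notes: see PDF-export settings for details.\n"

print(substring_match(pdf_header), prefix_match(pdf_header))
print(substring_match(plain_text), prefix_match(plain_text))
```

Both checks accept the real PDF header, but only the substring check also accepts the plain-text buffer, which is the kind of false positive a signature check anchored at the start of the file avoids.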