- Fixed crop() to extract the original tags from the text instead of reconstructing them
- Added MinerU-specific logic in manual.py to handle space- or tab-separated tags
- Removed redundant import re that caused UnboundLocalError
- Ensured correct bbox coordinates for native images, fallback images, and page selection
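
For context on the UnboundLocalError item, here is a minimal sketch of how a redundant function-local import can trigger that error in Python. The function names are hypothetical and this is not the actual code from manual.py; it only illustrates the general mechanism, assuming the removed import sat inside a function that used re earlier in its body.

```python
import re


def find_digits_broken(text):
    # Hypothetical example: the local import further down makes `re` a local
    # name for the whole function body, so this earlier use fails.
    match = re.search(r"\d+", text)
    import re  # redundant: `re` is already imported at module level
    return match.group(0) if match else None


def find_digits_fixed(text):
    # With the local import removed, `re` resolves to the module-level import.
    match = re.search(r"\d+", text)
    return match.group(0) if match else None
```

Calling find_digits_broken raises UnboundLocalError on its first line, while find_digits_fixed runs normally.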