Update XLSX extraction documentation to reflect current implementation

yangdx 2025-11-19 04:26:41 +08:00
parent 0244699d81
commit 87de2b3e9e


@@ -1063,7 +1063,7 @@ def _extract_xlsx(file_bytes: bytes) -> str:
 - Special characters (tabs, newlines, backslashes) are escaped to prevent structure corruption
 - Column alignment is preserved across all rows to maintain tabular structure
 - Empty rows are preserved as blank lines to maintain row structure
-- Two-pass processing: determines max column width, then extracts with consistent alignment
+- Uses sheet.max_column to determine column width efficiently
 Args:
 file_bytes: XLSX file content as bytes