### What problem does this PR solve?
Support uploading encrypted files to object storage.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: virgilwong <hyhvirgil@gmail.com>
### What problem does this PR solve?
Fix: presentation parsing #11920
Fix: Embeddin encode exception handling
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Asure-OpenAI resource not found. #11750
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Refactor metadata filter.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Displaying the file option in the webhook's request body #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
1. Sampling optimization: reduce from 20 to 12 images when exceeding threshold
2. Native image width normalization: re-crop page-width strips for consistent stitching
- Preserves original native images for MinIO storage
- Uses normalized versions only for thumbnail stitching
3. Low fallback threshold: stitch full page screenshots when 3 fallback images
- Deduplicates and limits to max 3 pages
- Provides better context for sparse thumbnails
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR introduces a new Docs Generator agent component for producing
downloadable PDF, DOCX, or TXT files from Markdown content generated
within a RAGFlow workflow.
### **Key Features**
**Backend**
- New component: DocsGenerator (agent/component/docs_generator.py)
-
- Markdown → PDF/DOCX/TXT conversion
-
- Supports tables, lists, code blocks, headings, and rich formatting
-
- Configurable document style (fonts, margins, colors, page size,
orientation)
-
- Optional header logo and footer with page numbers/timestamps
-
**Frontend**
- New configuration UI for the Docs Generator
-
- Download button integrated into the chat interface
-
- Output wired to the Message component
-
- Full i18n support
**Documentation**
Added component guide:
docs/guides/agent/agent_component_reference/docs_generator.md
**Usage**
Add the Docs Generator to a workflow, connect Markdown output from an
upstream component, configure metadata/style, and feed its output into
the Message component. Users will see a document download button
directly in the chat.
**Contributor Note**
We have been following RAGFlow since more than a year and half now and
have worked extensively on personalizing the framework and integrating
it into several of our internal systems. Over the past year and a half,
we have built multiple platforms that rely on RAGFlow as a core
component, which has given us a strong appreciation for how flexible and
powerful the project is.
We also previously contributed the full Italian translation, and we were
glad to see it accepted. This new Docs Generator component was created
for our own production needs, and we believe that it may be useful for
many others in the community as well.
We want to sincerely thank the entire RAGFlow team for the remarkable
work you have done and continue to do. If there are opportunities to
contribute further, we would be glad to help whenever we have time
available. It would be a pleasure to support the project in any way we
can.
If appropriate, we would be glad to be listed among the project’s
contributors, but in any case we look forward to continuing to support
and contribute to the project.
PentaFrame Development Team
---------
Co-authored-by: PentaFrame <info@pentaframe.it>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Correct metadata update behavior. #11912
When update `value` is omitted, the corresponding keys are updated to
`"value"` regardless of their current values.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Flatten the request schema of the webhook #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
## Overview
This PR adds support for **Single Bucket Mode** in RAGFlow, allowing
users to configure MinIO/S3 to use a single bucket with a directory
structure instead of creating multiple buckets per Knowledge Base and
user folder.
## Problem Statement
The current implementation creates one bucket per Knowledge Base and one
bucket per user folder, which can be problematic when:
- Cloud providers charge per bucket
- IAM policies restrict bucket creation
- Organizations want centralized data management in a single bucket
## Solution
Added a `prefix_path` configuration option to the MinIO connector that
enables:
- Using a single bucket with directory-based organization
- Backward compatibility with existing multi-bucket deployments
- Support for MinIO, AWS S3, and other S3-compatible storage backends
## Changes
- **`rag/utils/minio_conn.py`**: Enhanced MinIO connector to support
single bucket mode with prefix paths
- **`conf/service_conf.yaml`**: Added new configuration options
(`bucket` and `prefix_path`)
- **`docker/service_conf.yaml.template`**: Updated template with single
bucket configuration examples
- **`docker/.env.single-bucket-example`**: Added example environment
variables for single bucket setup
- **`docs/single-bucket-mode.md`**: Comprehensive documentation covering
usage, migration, and troubleshooting
## Configuration Example
```yaml
minio:
user: "access-key"
password: "secret-key"
host: "minio.example.com:443"
bucket: "ragflow-bucket" # Single bucket name
prefix_path: "ragflow" # Optional prefix path
```
## Backward Compatibility
✅ Fully backward compatible - existing deployments continue to work
without any changes
- If `bucket` is not configured, uses default multi-bucket behavior
- If `bucket` is configured without `prefix_path`, uses bucket root
- If both are configured, uses `bucket/prefix_path/` structure
## Testing
- Tested with MinIO (local and cloud)
- Verified backward compatibility with existing multi-bucket mode
- Validated IAM policy restrictions work correctly
## Documentation
Included comprehensive documentation in `docs/single-bucket-mode.md`
covering:
- Configuration examples
- Migration guide from multi-bucket to single-bucket mode
- IAM policy examples
- Troubleshooting guide
---
**Related Issue**: Addresses use cases where bucket creation is
restricted or costly
### What problem does this PR solve?
Feat: Add mineru as a model manufacturer to the system. #10621
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: balibabu <assassin_cike@163.com>
### What problem does this PR solve?
Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
change:
async issue and sensitive logging
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Retrieval metadata filtering adds semi-automatic mode, and users can
manually check the metadata key that participates in LLM to generate
filter conditions.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR fixes a startup crash in the data_sync_0 service caused by an
incorrect asyncio.run call. The main coroutine was being passed as a
function reference instead of being invoked, which raised:
`ValueError: a coroutine was expected, got <function main ...>
`
What I changed
- Updated the entrypoint in sync_data_source.py to correctly invoke the
coroutine with `asyncio.run(main())`.
Testing
- No tested.
Related Issue
Fixes https://github.com/infiniflow/ragflow/issues/11878
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
MinerU new version supports for the new backend
vlm-mlx-engine,https://github.com/opendatalab/MinerU .
### Type of change
- [ x ] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add metadata condition in document list.
Add metadata bulk update.
Add metadata summary.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Fix RuntimeError when calling mindmap endpoint by converting
`gen_mindmap()` to async function and using `await` instead of
`asyncio.run()`.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
- Fixed crop() to extract original tags from text instead of reconstructing
- Added MinerU-specific logic in manual.py to handle space/tab separated tags
- Removed redundant import re that caused UnboundLocalError
- Ensures correct bbox coordinates for native images, fallback images, and page selection
- Changed fallback image generation to page-width strips (full horizontal, bbox vertical)
- Implemented smart crop() with native+fallback mixing and deduplication
- Added thresholds: max 10 images, total height <2000px
- Established native_img_map for table/image/equation priority
- Removed 120px padding logic that caused super-long stitched thumbnails
This fixes the issue where chunk thumbnails were either missing or excessively long due to:
1. MinerU not providing images for pure text blocks
2. Official crop() adding 120px padding and stitching across pages
3. Manual.py merging multiple sections into one chunk
The new approach:
- Priority 1: Use MinerU's native high-quality images (tables/equations)
- Priority 2: Use page-width fallback strips (consistent width for stitching)
- Priority 3: Use full page as last resort
- Deduplicates identical bboxes during stitching
- Limits output to reasonable dimensions for UX
### What problem does this PR solve?
Feat: Add configuration for webhook to the begin node. #10427
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Enhance OBConnection.search for better performance. Main changes:
1. Use string type of vector array in distance func for better parsing
performance.
2. Manually set max_connections as pool size instead of using default
value.
3. Set 'fulltext_search_columns' when starting.
4. Cache the results of the table existence check (we will never drop
the table).
5. Remove unused 'group_results' logic.
6. Add the `USE_FULLTEXT_FIRST_FUSION_SEARCH` flag, and the
corresponding fusion search SQL when it's false.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Fix:Modify the name of the Overlapped percent field
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Manage and display memory datasets.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The docker version(24.0.7) installed in the executor manager image is
incompatible with the latest stable docker (29.1.3). The minmum api
v29.1.3 can use is 1.4.4 api version, but 24.0.7 uses api version 1.4.3.
### Type of change
- [X] Other (please describe):
This could break things for people who still have an old docker
installed on their system. A better approach could be a setting to share
- Critical bug fix: imgs list was not initialized before use (line 439)
- Without this fix, NameError would occur when cache miss triggers fallback
- Discovered during reliability audit of MinerU image generation fix
- Add _img_path_cache dict to cache line_tag -> img_path mapping
- Populate cache in _generate_missing_images for fallback text block images
- Refactor crop() to check cache first, return cached image directly
- Fallback to single-position cropping to avoid super-tall merged images
- Fix text_types to use both string literals and enums for compatibility
- Add bbox clamping to prevent cropping errors
### What problem does this PR solve?
Fix: duplicate output by async_chat_streamly
Refact: revert manual modification
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Treat MinerU as an OCR model.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
Fix: The variables in the message node are not displaying correctly.
#11839
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)