Commit graph

4717 commits

Author SHA1 Message Date
Jin Hai
7ca3e11566
Update dataset config and retrieval testing (#11958)
### What problem does this PR solve?

1. Refactor the order of the dataset config items.
2. Refactor the text of retrieval test.
3. Fix typos.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-12-15 19:56:28 +08:00
lenghanz
a2e080c2d3
feat: display name instead of key in user fillup form submission (#11931)
### What problem does this PR solve?

- Change the message format from 'key: value' to 'name: value' when user
submits the fillup form in agent chat.
- This resolves #11865
- I think this change makes sense, better aligning the form and the
replied message.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-15 19:12:01 +08:00
Yongteng Lei
ad6f7fd4b0
Fix: pipeline ignore MinerU backend config and vllm module is missing (#11955)
### What problem does this PR solve?

Fix the pipeline ignoring the MinerU backend config and the missing vllm module.
#11944, #11947.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-15 18:03:34 +08:00
Stephen Hu
2a0f835ffe
Refactor: Improve the logic to calculate embedding total token count (#11943)
### What problem does this PR solve?

Improve the logic for calculating the total embedding token count.

### Type of change

- [x] Refactoring
2025-12-15 11:33:57 +08:00
Yongteng Lei
13d8241eee
Doc: executor manager updated docker version (#11946)
### What problem does this PR solve?

Add documentation for #11806.

### Type of change

- [x] Documentation Update
2025-12-15 11:13:51 +08:00
balibabu
1ddd11f045
Feat: Set the return value of the webhook to a string. #10427 (#11945)
### What problem does this PR solve?

Feat: Set the return value of the webhook to a string. #10427

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-15 11:09:08 +08:00
YngvarHuang
81eb03d230
Support uploading encrypted files to object storage (#11837) (#11838)
### What problem does this PR solve?

Support uploading encrypted files to object storage.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: virgilwong <hyhvirgil@gmail.com>
2025-12-15 09:45:18 +08:00
Magicbook1108
7d23c3aed0
Fix: presentation parsing & Embedding encode exception handling (#11933)
### What problem does this PR solve?

Fix: presentation parsing #11920
Fix: Embedding encode exception handling
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-13 11:37:42 +08:00
Yongteng Lei
6be0338aa0
Fix: Azure-OpenAI resource not found (#11934)
### What problem does this PR solve?

Azure-OpenAI resource not found. #11750


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-13 11:32:46 +08:00
Kevin Hu
44dec89f1f
Fix: aspose-slide issue. (#11935)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-12 20:16:18 +08:00
Yongteng Lei
2b260901df
Fix: raptor don't have attribute chat (#11936)
### What problem does this PR solve?

Raptor doesn't have the attribute `chat`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-12 20:08:18 +08:00
Magicbook1108
948bc93786
Feat: Add GPT-5.2 & pro (#11929)
### What problem does this PR solve?

Feat: Add GPT-5.2 & pro

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-12 17:35:08 +08:00
Yongteng Lei
0f0fb53256
Refa: refactor metadata filter (#11907)
### What problem does this PR solve?

Refactor metadata filter.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-12 17:12:38 +08:00
balibabu
0fcb1680fd
Feat: Displaying the file option in the webhook's request body #10427 (#11928)
### What problem does this PR solve?

Feat: Displaying the file option in the webhook's request body #10427

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-12 16:16:34 +08:00
Magicbook1108
50715ba332
Fix: forget-reset password (#11927)
### What problem does this PR solve?

Fix: forget-reset password

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-12 16:16:17 +08:00
PentaFDevs
f9510edbbc
Feature/docs generator (#11858)
### Type of change

- [x] New Feature (non-breaking change which adds functionality)


### What problem does this PR solve?

This PR introduces a new Docs Generator agent component for producing
downloadable PDF, DOCX, or TXT files from Markdown content generated
within a RAGFlow workflow.

### **Key Features**

**Backend**

- New component: DocsGenerator (agent/component/docs_generator.py)
- Markdown → PDF/DOCX/TXT conversion
- Supports tables, lists, code blocks, headings, and rich formatting
- Configurable document style (fonts, margins, colors, page size,
orientation)
- Optional header logo and footer with page numbers/timestamps

**Frontend**

- New configuration UI for the Docs Generator
- Download button integrated into the chat interface
- Output wired to the Message component
- Full i18n support

**Documentation**

Added component guide:
docs/guides/agent/agent_component_reference/docs_generator.md

**Usage**

Add the Docs Generator to a workflow, connect Markdown output from an
upstream component, configure metadata/style, and feed its output into
the Message component. Users will see a document download button
directly in the chat.

**Contributor Note**

We have been following RAGFlow for more than a year and a half now and
have worked extensively on personalizing the framework and integrating
it into several of our internal systems. Over the past year and a half,
we have built multiple platforms that rely on RAGFlow as a core
component, which has given us a strong appreciation for how flexible and
powerful the project is.

We also previously contributed the full Italian translation, and we were
glad to see it accepted. This new Docs Generator component was created
for our own production needs, and we believe that it may be useful for
many others in the community as well.

We want to sincerely thank the entire RAGFlow team for the remarkable
work you have done and continue to do. If there are opportunities to
contribute further, we would be glad to help whenever we have time
available. It would be a pleasure to support the project in any way we
can.

If appropriate, we would be glad to be listed among the project’s
contributors, but in any case we look forward to continuing to support
and contribute to the project.

PentaFrame Development Team

---------

Co-authored-by: PentaFrame <info@pentaframe.it>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-12 14:59:43 +08:00
Yongteng Lei
6560388f2b
Fix: correct metadata update behavior (#11919)
### What problem does this PR solve?

Correct metadata update behavior. #11912

When update `value` is omitted, the corresponding keys are updated to
`"value"` regardless of their current values.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-12 12:50:17 +08:00
writinwaters
e37aea5f81
Docs: How to use restful API to update or delete metadata (#11912)
### What problem does this PR solve?



### Type of change

- [x] Documentation Update
2025-12-12 12:04:47 +08:00
Magicbook1108
7db9045b74
Feat: Add box connector (#11845)
### What problem does this PR solve?

Feat: Add box connector

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-12 10:23:40 +08:00
balibabu
a6bd765a02
Feat: Flatten the request schema of the webhook #10427 (#11917)
### What problem does this PR solve?

Feat: Flatten the request schema of the webhook #10427

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-12 09:59:54 +08:00
Andrea Bugeja
74afb8d710
feat: Add Single Bucket Mode for MinIO/S3 (#11416)
## Overview

This PR adds support for **Single Bucket Mode** in RAGFlow, allowing
users to configure MinIO/S3 to use a single bucket with a directory
structure instead of creating multiple buckets per Knowledge Base and
user folder.

## Problem Statement

The current implementation creates one bucket per Knowledge Base and one
bucket per user folder, which can be problematic when:
- Cloud providers charge per bucket
- IAM policies restrict bucket creation
- Organizations want centralized data management in a single bucket

## Solution

Added a `prefix_path` configuration option to the MinIO connector that
enables:
- Using a single bucket with directory-based organization
- Backward compatibility with existing multi-bucket deployments
- Support for MinIO, AWS S3, and other S3-compatible storage backends

## Changes

- **`rag/utils/minio_conn.py`**: Enhanced MinIO connector to support
single bucket mode with prefix paths
- **`conf/service_conf.yaml`**: Added new configuration options
(`bucket` and `prefix_path`)
- **`docker/service_conf.yaml.template`**: Updated template with single
bucket configuration examples
- **`docker/.env.single-bucket-example`**: Added example environment
variables for single bucket setup
- **`docs/single-bucket-mode.md`**: Comprehensive documentation covering
usage, migration, and troubleshooting

## Configuration Example

```yaml
minio:
  user: "access-key"
  password: "secret-key"
  host: "minio.example.com:443"
  bucket: "ragflow-bucket"      # Single bucket name
  prefix_path: "ragflow"         # Optional prefix path
```

## Backward Compatibility

Fully backward compatible: existing deployments continue to work
without any changes.
- If `bucket` is not configured, uses default multi-bucket behavior
- If `bucket` is configured without `prefix_path`, uses bucket root
- If both are configured, uses `bucket/prefix_path/` structure
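
The three rules above can be sketched as a small resolver. This is only an illustration: `resolve_location` and its parameters are hypothetical names, not the code in `rag/utils/minio_conn.py`, and it assumes that the former per-Knowledge-Base bucket name becomes a directory inside the single bucket.

```python
def resolve_location(logical_bucket, object_name, bucket=None, prefix_path=None):
    """Map a logical bucket/object pair to a physical (bucket, key) pair.

    Hypothetical helper for illustration; the real connector may differ.
    """
    if not bucket:
        # Multi-bucket mode (default): the logical bucket is a real bucket.
        return logical_bucket, object_name
    # Single-bucket mode: the logical bucket becomes a directory, optionally
    # nested under prefix_path. Empty or missing parts are skipped.
    parts = [p.strip("/") for p in (prefix_path, logical_bucket, object_name) if p]
    return bucket, "/".join(parts)


print(resolve_location("kb1", "a.pdf"))
print(resolve_location("kb1", "a.pdf", bucket="ragflow-bucket"))
print(resolve_location("kb1", "a.pdf", bucket="ragflow-bucket", prefix_path="ragflow"))
```

Leaving `bucket` unset keeps the original behavior, which is what makes the option backward compatible in this sketch.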

## Testing

- Tested with MinIO (local and cloud)
- Verified backward compatibility with existing multi-bucket mode
- Validated IAM policy restrictions work correctly

## Documentation

Included comprehensive documentation in `docs/single-bucket-mode.md`
covering:
- Configuration examples
- Migration guide from multi-bucket to single-bucket mode
- IAM policy examples
- Troubleshooting guide

---

**Related Issue**: Addresses use cases where bucket creation is
restricted or costly
2025-12-11 19:22:47 +08:00
Kevin Hu
ea4a5cd665
Fix: tokenizer issue. (#11902)
#11786
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-11 17:38:17 +08:00
balibabu
22a51a3868
Feat: Add mineru as a model manufacturer to the system. #10621 (#11903)
### What problem does this PR solve?

Feat: Add mineru as a model manufacturer to the system. #10621

### Type of change


- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: balibabu <assassin_cike@163.com>
2025-12-11 17:37:10 +08:00
Yongteng Lei
e9710b7aa9
Refa: treat MinerU as an OCR model 2 (#11905)
### What problem does this PR solve?

Treat MinerU as an OCR model 2. #11903

### Type of change

- [x] Refactoring
2025-12-11 17:33:12 +08:00
TeslaZY
bd0eff2954
Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code (#11898)
### What problem does this PR solve?

Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-11 13:55:01 +08:00
buua436
e3cfe8e848
Fix:async issue and sensitive logging (#11895)
### What problem does this PR solve?

change:
async issue and sensitive logging

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-11 13:54:47 +08:00
TeslaZY
c610bb605a
Added semi-automatic mode to the metadata filter (#11886)
### What problem does this PR solve?

Retrieval metadata filtering adds a semi-automatic mode in which users
manually select the metadata keys that the LLM uses to generate
filter conditions.
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-11 10:45:21 +08:00
David López Carrascal
a6afb7dfe2
Fix data_sync startup crash by properly invoking async main (#11879)
### What problem does this PR solve?

This PR fixes a startup crash in the data_sync_0 service caused by an
incorrect asyncio.run call. The main coroutine was being passed as a
function reference instead of being invoked, which raised:

`ValueError: a coroutine was expected, got <function main ...>`

What I changed

- Updated the entrypoint in sync_data_source.py to correctly invoke the
coroutine with `asyncio.run(main())`.
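
A minimal reproduction of the difference, using a stand-in `main` rather than the real entrypoint in sync_data_source.py. The exact exception type for the broken call can vary between Python versions, so the sketch catches both `ValueError` and `TypeError`.

```python
import asyncio


async def main():
    # Stand-in for the service's real entrypoint coroutine (assumption).
    return "started"


error = None
try:
    # Bug: passes the function object itself, not a coroutine object.
    asyncio.run(main)
except (ValueError, TypeError) as exc:
    error = exc

# Fix: call main() so asyncio.run() receives a coroutine object.
result = asyncio.run(main())
print(type(error).__name__, result)
```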

Testing
- Not tested.

Related Issue
Fixes https://github.com/infiniflow/ragflow/issues/11878

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-11 10:09:16 +08:00
TeslaZY
7b96113d4c
MinerU supports for the new backend vlm-mlx-engine (#11864)
### What problem does this PR solve?

The new version of MinerU supports the new backend vlm-mlx-engine;
see https://github.com/opendatalab/MinerU.

### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2025-12-11 09:59:38 +08:00
Yongteng Lei
8370bc61b7
Feat: enhance metadata operation (#11874)
### What problem does this PR solve?

Add metadata condition in document list.
Add metadata bulk update.
Add metadata summary.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-12-11 09:59:15 +08:00
N0bodycan
74eb894453
Fix RuntimeError: asyncio.run() cannot be called from a running event loop when calling mindmap endpoint. (#11880)
### What problem does this PR solve?

Fix RuntimeError when calling mindmap endpoint by converting
`gen_mindmap()` to async function and using `await` instead of
`asyncio.run()`.
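
The failure mode and the fix can be reproduced in isolation. The sketch below uses a stand-in `gen_mindmap` and two hypothetical handlers; it is not the endpoint code from this PR.

```python
import asyncio


async def gen_mindmap():
    # Stand-in for the real mind-map generation coroutine (assumption).
    await asyncio.sleep(0)
    return {"id": "root", "children": []}


async def handler_broken():
    # Already inside a running event loop, so asyncio.run() refuses to start.
    coro = gen_mindmap()
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"error: {exc}"


async def handler_fixed():
    # Inside a coroutine, await the call instead of nesting asyncio.run().
    return await gen_mindmap()


broken = asyncio.run(handler_broken())
fixed = asyncio.run(handler_fixed())
print(broken)
print(fixed)
```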

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-11 09:47:44 +08:00
balibabu
34d29d7e8b
Feat: Add configuration for webhook to the begin node. #10427 (#11875)
### What problem does this PR solve?

Feat: Add configuration for webhook to the begin node. #10427

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-10 19:13:57 +08:00
He Wang
badf33e3b9
feat: enhance OBConnection.search (#11876)
### What problem does this PR solve?

Enhance OBConnection.search for better performance. Main changes:
1. Use string type of vector array in distance func for better parsing
performance.
2. Manually set max_connections as pool size instead of using default
value.
3. Set 'fulltext_search_columns' when starting.
4. Cache the results of the table existence check (we will never drop
the table).
5. Remove unused 'group_results' logic.
6. Add the `USE_FULLTEXT_FIRST_FUSION_SEARCH` flag, and the
corresponding fusion search SQL when it's false.

### Type of change
- [x] Performance Improvement
2025-12-10 19:13:37 +08:00
buua436
3cb72377d7
Refa:remove sensitive information (#11873)
### What problem does this PR solve?

change:
remove sensitive information

### Type of change

- [x] Refactoring
2025-12-10 19:08:45 +08:00
buua436
ab4b62031f
Fix:csv parse in Table (#11870)
### What problem does this PR solve?

change:
csv parse in Table

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-10 16:44:06 +08:00
chanx
80f3ccf1ac
Fix:Modify the name of the Overlapped percent field (#11866)
### What problem does this PR solve?

Fix:Modify the name of the Overlapped percent field

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-10 13:38:24 +08:00
Lynn
a1164b9c89
Feat/memory (#11812)
### What problem does this PR solve?

Manage and display memory datasets.

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-12-10 13:34:08 +08:00
Russell Valentine
fd7e55b23d
executor_manager updated docker version (#11806)
### What problem does this PR solve?

The Docker version (24.0.7) installed in the executor manager image is
incompatible with the latest stable Docker (29.1.3). The minimum API
version that 29.1.3 can use is 1.44, but 24.0.7 uses API version 1.43.

### Type of change

- [X] Other (please describe):

This could break things for people who still have an old docker
installed on their system. A better approach could be a setting to share
2025-12-10 11:08:11 +08:00
Zhichang Yu
f128a1fa9e
Bump python to >=3.12 (#11846)
### What problem does this PR solve?

Bump python to >=3.12

### Type of change

- [x] Refactoring
2025-12-09 19:55:25 +08:00
buua436
65a5a56d95
Refa:replace trio with asyncio (#11831)
### What problem does this PR solve?

change:
replace trio with asyncio

### Type of change
- [x] Refactoring
2025-12-09 19:23:14 +08:00
Magicbook1108
ca2d6f3301
Fix: duplicate output by async_chat_streamly (#11842)
### What problem does this PR solve?

Fix: duplicate output by async_chat_streamly
Refact: revert manual modification

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-12-09 19:21:52 +08:00
Yongteng Lei
a94b3b9df2
Refa: treat MinerU as an OCR model (#11849)
### What problem does this PR solve?

 Treat MinerU as an OCR model.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
2025-12-09 18:54:14 +08:00
balibabu
30377319d8
Fix: The variables in the message node are not displaying correctly. #11839 (#11841)
### What problem does this PR solve?

Fix: The variables in the message node are not displaying correctly.
#11839

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-09 17:59:49 +08:00
PentaFDevs
07dca37ef0
feat: add Italian language translation support (#11844)
### What problem does this PR solve?

- Add complete Italian translation file with all UI sections
- Register Italian in LanguageAbbreviation enum and language maps
- Configure Italian translation in i18n config
- Add Italiano to language selector dropdown


### Type of change

- [x] Other (please describe):

## What
Added complete Italian language translation support to RAGFlow

## Changes
- Added comprehensive Italian translation file
(`ragflow/web/src/locales/it.ts`) with all UI sections
(1239 lines)
- Registered Italian in `LanguageAbbreviation` enum and all language
maps
- Configured Italian translation in i18n configuration
- Added "Italiano" to language selector dropdown

## Impact
- Italian users can now use RAGFlow in their native language
- All major UI components are translated including:
  - Login/registration screens
  - Knowledge base management
  - Chat interface
  - Settings and configuration
  - Admin console
  - Error messages and notifications

## Testing
- Verified all translation keys are present
- Confirmed language selector shows "Italiano" correctly
- Tested that no translation keys are missing
- All UI sections properly translated

Co-authored-by: PentaFrame <info@pentaframe.it>
2025-12-09 17:59:21 +08:00
changkeke
036b29f084
Docs: Enhance API reference for file management (#11827)
### What problem does this PR solve?

The SDK documentation is lacking in file management sections.

### Type of change

- [x] Documentation Update
2025-12-09 17:30:53 +08:00
N0bodycan
9863862348
fix: prevent redundant retries in async_chat_streamly upon success (#11832)
## What changes were proposed in this pull request?
Added a return statement after the successful completion of the async
for loop in async_chat_streamly.

## Why are the changes needed?
Previously, the code lacked a break/return mechanism inside the try
block. This caused the retry loop (for attempt in range...) to continue
executing even after the LLM response was successfully generated and
yielded, resulting in duplicate requests (up to max_retries times).
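
A simplified sketch of the pattern, assuming a retry loop wrapped around an async generator. `fake_stream`, `chat_streamly`, and the `calls` counter are hypothetical stand-ins, not the project's `async_chat_streamly`.

```python
import asyncio

calls = {"count": 0}


async def fake_stream():
    # Stand-in for the LLM streaming call; counts how often it is requested.
    calls["count"] += 1
    for chunk in ("Hel", "lo"):
        yield chunk


async def chat_streamly(max_retries=3):
    for attempt in range(max_retries):
        try:
            async for chunk in fake_stream():
                yield chunk
            # Without this return, the loop would issue the request again
            # on every remaining attempt even though the stream succeeded.
            return
        except Exception:
            if attempt == max_retries - 1:
                raise


async def collect():
    return [chunk async for chunk in chat_streamly()]


chunks = asyncio.run(collect())
print(chunks, calls["count"])
```

With the `return` removed, the same run would yield every chunk `max_retries` times and make `max_retries` requests.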

## Does this PR introduce any user-facing change?
No (it fixes an internal logic bug).
2025-12-09 17:14:30 +08:00
Zhichang Yu
bb6022477e
Bump infinity to v0.6.11. Requires python>=3.11 (#11814)
### What problem does this PR solve?

Bump infinity to v0.6.11. Requires python>=3.11

### Type of change

- [x] Refactoring
2025-12-09 16:23:37 +08:00
chanx
28bc87c5e2
Feature: Memory interface integration testing (#11833)
### What problem does this PR solve?

Feature: Memory interface integration testing

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-09 14:52:58 +08:00
Yongteng Lei
c51e6b2a58
Refa: migrate CV model chat to Async (#11828)
### What problem does this PR solve?

Migrate CV model chat to Async. #11750

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-12-09 13:08:37 +08:00
Stephen Hu
481192300d
Fix:[ERROR][Exception]: list index out of range (#11826)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/11821

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-09 09:58:34 +08:00