Commit graph

25 commits

Author SHA1 Message Date
Lei Zhang
7499608a8b
feat: add Redis username support (#11608)
### What problem does this PR solve?

Support for Redis 6+ ACL authentication (username)

close #11606 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-12-01 11:26:20 +08:00
aidan
420c97199a
Feat: Add TCADP parser for PPTX and spreadsheet document types. (#11041)
### What problem does this PR solve?

- Added TCADP Parser configuration fields to PDF, PPT, and spreadsheet
parsing forms
- Implemented support for setting table result type (Markdown/HTML) and
Markdown image response type (URL/Text)
- Updated TCADP Parser to handle return format settings from
configuration or parameters
- Enhanced frontend to dynamically show TCADP options based on selected
parsing method
- Modified backend to pass format parameters when calling TCADP API
- Optimized form default value logic for TCADP configuration items
- Updated multilingual resource files for new configuration options

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-20 10:08:42 +08:00
He Wang
38234aca53
feat: add OceanBase doc engine (#11228)
### What problem does this PR solve?

Add OceanBase doc engine. Close #5350

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-20 10:00:14 +08:00
aidan
33a189f620
Feat: add TCADP Parser (#10775)
### What problem does this PR solve?

This PR adds a new TCADP (Tencent Cloud Advanced Document Processing)
parser to RAGFlow, enabling users to leverage Tencent Cloud's document
parsing capabilities for more accurate and structured document
processing. The implementation includes:
New TCADP Parser: A complete implementation of Tencent Cloud's document
parsing API without SDK dependency
Configuration Support: Added configuration options in service_conf.yaml
for Tencent Cloud API credentials
Frontend Integration: Updated UI components to support the new TCADP
parser option
Error Handling: Comprehensive error handling and retry mechanisms for
API calls
Result Processing: Support for both SSE streaming and JSON response
formats from Tencent Cloud API

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-10-27 15:14:58 +08:00
Zhichang Yu
73144e278b
Don't release full image (#10654)
### What problem does this PR solve?

Introduced gpu profile in .env
Added Dockerfile_tei
fix datrie
Removed LIGHTEN flag

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-10-23 23:02:27 +08:00
Jin Hai
d11b1628a1
Feat: add admin CLI and admin service (#10186)
### What problem does this PR solve?

Introduce new feature: RAGFlow system admin service and CLI

### Introduction

Admin Service is a dedicated management component designed to monitor,
maintain, and administrate the RAGFlow system. It provides comprehensive
tools for ensuring system stability, performing operational tasks, and
managing users and permissions efficiently.

The service offers monitoring of critical components, including the
RAGFlow server, Task Executor processes, and dependent services such as
MySQL, Infinity / Elasticsearch, Redis, and MinIO. It automatically
checks their health status, resource usage, and uptime, and performs
restarts in case of failures to minimize downtime.

For user and system management, it supports listing, creating,
modifying, and deleting users and their associated resources like
knowledge bases and Agents.

Built with scalability and reliability in mind, the Admin Service
ensures smooth system operation and simplifies maintenance workflows.

It consists of a server-side Service and a command-line client (CLI),
both implemented in Python. User commands are parsed using the Lark
parsing toolkit.

- **Admin Service**: A backend service that interfaces with the RAGFlow
system to execute administrative operations and monitor its status.
- **Admin CLI**: A command-line interface that allows users to connect
to the Admin Service and issue commands for system management.

### Starting the Admin Service

1. Before start Admin Service, please make sure RAGFlow system is
already started.

2.  Run the service script:
    ```bash
    python admin/admin_server.py
    ```
The service will start and listen for incoming connections from the CLI
on the configured port.

### Using the Admin CLI

1.  Ensure the Admin Service is running.
2.  Launch the CLI client:
    ```bash
    python admin/admin_client.py -h 0.0.0.0 -p 9381
## Supported Commands
Commands are case-insensitive and must be terminated with a semicolon
(`;`).
### Service Management Commands
-  [x] `LIST SERVICES;`
    -   Lists all available services within the RAGFlow system.
-  [ ] `SHOW SERVICE <id>;`
- Shows detailed status information for the service identified by
`<id>`.
-  [ ] `STARTUP SERVICE <id>;`
    -   Attempts to start the service identified by `<id>`.
-  [ ] `SHUTDOWN SERVICE <id>;`
- Attempts to gracefully shut down the service identified by `<id>`.
-  [ ] `RESTART SERVICE <id>;`
    -   Attempts to restart the service identified by `<id>`.
### User Management Commands
-  [x] `LIST USERS;`
    -   Lists all users known to the system.
-  [ ] `SHOW USER '<username>';`
- Shows details and permissions for the specified user. The username
must be enclosed in single or double quotes.
-  [ ] `DROP USER '<username>';`
    -   Removes the specified user from the system. Use with caution.
-  [ ] `ALTER USER PASSWORD '<username>' '<new_password>';`
    -   Changes the password for the specified user.
### Data and Agent Commands
-  [ ] `LIST DATASETS OF '<username>';`
    -   Lists the datasets associated with the specified user.
-  [ ] `LIST AGENTS OF '<username>';`
    -   Lists the agents associated with the specified user.
### Meta-Commands
Meta-commands are prefixed with a backslash (`\`).
-   `\?` or `\help`
    -   Shows help information for the available commands.
-   `\q` or `\quit`
    -   Exits the CLI application.
## Examples
```commandline
admin> list users;
+-------------------------------+------------------------+-----------+-------------+
| create_date                   | email                  | is_active | nickname    |
+-------------------------------+------------------------+-----------+-------------+
| Fri, 22 Nov 2024 16:03:41 GMT | jeffery@infiniflow.org | 1         | Jeffery     |
| Fri, 22 Nov 2024 16:10:55 GMT | aya@infiniflow.org     | 1         | Waterdancer |
+-------------------------------+------------------------+-----------+-------------+
admin> list services;
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+
| extra                                                                                     | host      | id | name          | port  | service_type   |
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+
| {}                                                                                        | 0.0.0.0   | 0  | ragflow_0     | 9380  | ragflow_server |
| {'meta_type': 'mysql', 'password': 'infini_rag_flow', 'username': 'root'}                 | localhost | 1  | mysql         | 5455  | meta_data      |
| {'password': 'infini_rag_flow', 'store_type': 'minio', 'user': 'rag_flow'}                | localhost | 2  | minio         | 9000  | file_store     |
| {'password': 'infini_rag_flow', 'retrieval_type': 'elasticsearch', 'username': 'elastic'} | localhost | 3  | elasticsearch | 1200  | retrieval      |
| {'db_name': 'default_db', 'retrieval_type': 'infinity'}                                   | localhost | 4  | infinity      | 23817 | retrieval      |
| {'database': 1, 'mq_type': 'redis', 'password': 'infini_rag_flow'}                        | localhost | 5  | redis         | 6379  | message_queue  |
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+
```

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: jinhai <haijin.chn@gmail.com>
2025-09-22 10:37:49 +08:00
Liu An
fc95d113c3
Feat(config): Update service config template new defaults (#10029)
### What problem does this PR solve?

- Update default LLM configuration with BAAI and model details #9404
- Add SMTP configuration section #9479
- Add OpenDAL storage configuration option #8232

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-10 16:39:26 +08:00
He Wang
4fc9e42e74
fix: add missing env vars and default values of service_conf.yaml (#9289)
### What problem does this PR solve?

Add missing env var `MYSQL_MAX_PACKET` to service_conf.yaml.template,
and add default values to opendal config to fix npe.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-07 10:41:05 +08:00
griffith-h
8b7dbb349e fix: update service_conf.yaml.template (#8863)
fix: modify the connection ports of minio and redis in
service_conf.yaml.template

### What problem does this PR solve?

If you modify the external ports of minio and redis in the .env file, it
will also affect the connection ports inside the container in the
service_conf.yaml.template file, which is unreasonable.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-07-16 15:31:57 +08:00
ruansheng
de8ba7298c
RAGFlow service_conf using .env variable (#8454)
### What problem does this PR solve?
Fix: when using external components, it is impossible to specify the
port, because the variables in the `docker/.env` variable were not
referenced by `docker/service_conf.yaml.template`.

382d2d0373/docker/.env (L85)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-25 15:24:37 +08:00
yongzhen.wang
311e20599f
fix: error opensearch env key (#8329)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-18 16:41:25 +08:00
Chaoxi Weng
205974c359
Docs: Improve oauth configuration documentation and examples (#7675)
### What problem does this PR solve?

Improve oauth configuration documentation and examples.

- Related pull requests: 
  - #7379
  - #7553
  - #7587
- Related issues:
  -  #3495
### Type of change

- [x] Documentation Update
2025-05-16 14:17:39 +08:00
Chaoxi Weng
a8542508b7
Refa: Deprecate /github_callback in favor of /oauth/callback/<channel> for GitHub OAuth integration (#7587)
### What problem does this PR solve?

Deprecate `/github_callback` route in favor of
`/oauth/callback/<channel>` for GitHub OAuth integration:

- Added GitHub OAuth support in the authentication module
- Introduced `GithubOAuthClient` with methods to fetch and normalize
user info
  - Updated `CLIENT_TYPES` to include GitHub OAuth client
- Deprecated `/github_callback` route and suggested using the generic
`/oauth/callback/<channel>` route

---
- Related pull requests: 
  - #7379
  - #7553 

### Usage

- [Create a GitHub OAuth
App](https://github.com/settings/applications/new) to obtain the
`client_id` and `client_secret`, configure the authorization callback
url: `https://your-app.com/v1/user/oauth/callback/github`
- Edit `service_conf.yaml.template`:
  ```yaml
  # ...
  oauth:
    github:
      type: "github"
      icon: "github"
      display_name: "Github"
      client_id: "your_client_id"
      client_secret: "your_client_secret"
      redirect_uri: "https://your-app.com/v1/user/oauth/callback/github"
  # ...
  ```

### Type of change

- [x] Documentation Update
- [x] Refactoring (non-breaking change)
2025-05-15 14:39:37 +08:00
liu an
6bd7d572ec
Perf: Increase database connection pool size (#7559)
### What problem does this PR solve?

1. The MySQL instance is configured with max_connections=1000,
but our connection pool was limited to max_connections: 100.
This mismatch caused connection pool exhaustion during performance
testing.

2.  Increase stale_timeout to resolve #6548

### Type of change

- [x] Performance Improvement
2025-05-09 17:52:03 +08:00
Chaoxi Weng
e349635a3d
Feat: Add /login/channels route and improve auth logic for frontend third-party login integration (#7521)
### What problem does this PR solve?

Add `/login/channels` route and improve auth logic to support frontend
integration with third-party login providers:

- Add `/login/channels` route to provide authentication channel list
with `display_name` and `icon`
- Optimize user info parsing logic by prioritizing `avatar_url` and
falling back to `picture`
- Simplify OIDC token validation by removing unnecessary `kid` checks
- Ensure `client_id` is safely cast to string during `audience`
validation
- Fix typo

---
- Related pull request: #7379 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-05-08 10:23:19 +08:00
Chaoxi Weng
3a43043c8a
Feat: Add support for OAuth2 and OpenID Connect (OIDC) authentication (#7379)
### What problem does this PR solve?

Add support for OAuth2 and OpenID Connect (OIDC) authentication,
allowing OAuth/OIDC authentication using the specified routes:
- `/login/<channel>`: Initiates the OAuth flow for the specified channel
- `/oauth/callback/<channel>`: Handles the OAuth callback after
successful authentication

The callback URL should be configured in your OAuth provider as:
```
https://your-app.com/oauth/callback/<channel>
```

For detailed instructions on configuring **service_conf.yaml.template**,
see: `./api/apps/auth/README.md#usage`.

- Related issues
#3495  

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-04-28 16:15:52 +08:00
pyyuhao
c8c3b756b0
Feat: Adds OpenSearch2.19.1 as the vector_database support (#7140)
### What problem does this PR solve?

This PR adds the support for latest OpenSearch2.19.1 as the store engine
& search engine option for RAGFlow.

### Main Benefit

1. OpenSearch2.19.1 is licensed under the [Apache v2.0 License] which is
much better than Elasticsearch
2. For search, OpenSearch2.19.1 supports full-text
search、vector_search、hybrid_search those are similar with Elasticsearch
on schema
3. For store, OpenSearch2.19.1 stores text、vector those are quite
simliar with Elasticsearch on schema

### Changes

- Support opensearch_python_connetor. I make a lot of adaptions since
the schema and api/method between ES and Opensearch differs in many
ways(especially the knn_search has a significant gap) :
rag/utils/opensearch_coon.py
- Support static config adaptions by changing:
conf/service_conf.yaml、api/settings.py、rag/settings.py
- Supprt some store&search schema changes between OpenSearch and ES:
conf/os_mapping.json
- Support OpenSearch python sdk : pyproject.toml
- Support docker config for OpenSearch2.19.1 :
docker/.env、docker/docker-compose-base.yml、docker/service_conf.yaml.template

### How to use
- I didn't change the priority that ES as the default doc/search engine.
Only if in docker/.env , we set DOC_ENGINE=${DOC_ENGINE:-opensearch}, it
will work.


### Others
Our team tested a lot of docs in our environment by using OpenSearch as
the vector database ,it works very well.
All the conifg for OpenSearch is necessary.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2025-04-24 16:03:31 +08:00
RedBookOfMemory
e2b66628f4
Feat: extend S3 storage compatibility and add knowledge base ID prefix (#6355)
### What problem does this PR solve?

- Added support for S3-compatible protocols.
- Enabled the use of knowledge base ID as a file prefix when storing
files in S3.
- Updated docker/README.md to include detailed S3 and OSS configuration
instructions.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-31 16:09:43 +08:00
The Wind
85924e898e
Fix: enhance aliyun oss access with adding prefix path (#5475)
### What problem does this PR solve?

Enhance aliyun oss access with adding prefix path.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-28 15:00:00 +08:00
hy89
651422127c
Feat: Accessing Alibaba Cloud OSS with Amazon S3 SDK (#5438)
Accessing Alibaba Cloud OSS with Amazon S3 SDK
2025-02-27 17:02:42 +08:00
Omar Leonardo Sanchez Granados
a0b461a18e
Add configuration to choose default llm models (#5245)
### What problem does this PR solve?

This pull request includes changes to the `api/settings.py` and
`docker/service_conf.yaml.template` files to add support for default
models in the LLM configuration (specially for LIGHTEN builds). The most
important changes include adding default model configurations and
updating the initialization settings to use these defaults.

For example:
With this configuration Bedrock will be enable by default with claude
and titan embeddings.

```
user_default_llm:
  factory: 'Bedrock'
  api_key: '{}' 
  base_url: ''
  default_models:
    chat_model: 'anthropic.claude-3-5-sonnet-20240620-v1:0'
    embedding_model: 'amazon.titan-embed-text-v2:0'
    rerank_model: ''
    asr_model: ''
    image2text_model: ''
```


### Type of change

- [X] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-02-24 10:13:39 +08:00
Kenny Dizi
bad764bcda
Improve storage engine (#4341)
### What problem does this PR solve?

- Bring `STORAGE_IMPL` back in `rag/svr/cache_file_svr.py`
- Simplify storage connection when working with AWS S3

### Type of change

- [x] Refactoring
2025-01-06 12:06:24 +08:00
Zhichang Yu
08ead81dde
Bump infinity to v0.5.0-dev5 (#3520)
### What problem does this PR solve?

Bump infinity to v0.5.0-dev5

### Type of change

- [x] Refactoring
2024-11-25 11:53:58 +08:00
Zhichang Yu
9d395ab74e
Added doc for switching elasticsearch to infinity (#3370)
### What problem does this PR solve?

Added doc for switching elasticsearch to infinity

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2024-11-14 00:08:55 +08:00
Guido Schmutz
0c95a3382b
Dynamically create the service_conf.yaml file by replacing environment variables from .env (#3341)
### What problem does this PR solve?

This pull request implements the feature mentioned in #3322. 

Instead of manually having to edit the `service_conf.yaml` file when
changes have been made to `.env` and mapping it into the docker
container at runtime, a template file is used and the values replaced by
the environment variables from the `.env` file when the container is
started.
 

### Type of change

- [X] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2024-11-12 22:56:53 +08:00