ragflow/common
sjIlll 1777620ea5
fix: set default embedding model for TEI profile in Docker deployment (#11824)
## What's changed
fix: unify embedding model fallback logic for both TEI and non-TEI
Docker deployments

> This fix targets **Docker / `docker-compose` deployments**, ensuring a
valid default embedding model is always set—regardless of the compose
profile used.

## Changes

| Scenario | New Behavior |
|----------|--------------|
| **Non-`tei-` profile** (e.g., default deployment) | `EMBEDDING_MDL` is now correctly initialized from `EMBEDDING_CFG` (derived from `user_default_llm`), ensuring custom defaults like `bge-m3@Ollama` are properly applied to new tenants. |
| **`tei-` profile** (`COMPOSE_PROFILES` contains `tei-`) | Still respects the `TEI_MODEL` environment variable. If unset, falls back to `EMBEDDING_CFG`. Only when both are empty does it use the built-in default (`BAAI/bge-small-en-v1.5`), preventing an empty embedding model. |

## Why This Change?

- **In non-TEI mode**: The previous logic would reset `EMBEDDING_MDL` to
an empty string, causing pre-configured defaults (e.g., `bge-m3@Ollama`
in the Docker image) to be ignored—leading to tenant initialization
failures or silent misconfigurations.
- **In TEI mode**: Users need the ability to override the model via
`TEI_MODEL`, but without a safe fallback, missing configuration could
break the system. The new logic adopts a **“config-first,
env-var-override”** strategy for robustness in containerized
environments.

## Implementation

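- Updated the assignment logic for `EMBEDDING_MDL` in
`rag/common/settings.py` to follow a unified resolution order: start from
`EMBEDDING_CFG`, let `TEI_MODEL` override it when a `tei-` profile is
active, and use the built-in default only if the result is still empty:

EMBEDDING_CFG → TEI_MODEL (if tei- profile active) → built-in default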
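
The resolution order described above can be sketched roughly as follows. This is an illustration only, not the code from `settings.py`: the function name `resolve_embedding_model`, the `env` parameter, and the assumption that `EMBEDDING_CFG` is a dict with a `"model"` key are all hypothetical. Only `COMPOSE_PROFILES`, `TEI_MODEL`, and the built-in default name come from the description in this commit message.

```python
import os

# Built-in default named in the commit message.
BUILTIN_DEFAULT = "BAAI/bge-small-en-v1.5"


def resolve_embedding_model(embedding_cfg, env=None):
    """Hypothetical helper mirroring the behavior table in this commit message.

    embedding_cfg: assumed to be a dict with an optional "model" key.
    env: mapping of environment variables (defaults to os.environ).
    """
    env = os.environ if env is None else env

    # Config first: start from whatever the configuration provides.
    model = (embedding_cfg or {}).get("model") or ""

    # Env-var override: only when a tei- compose profile is active.
    if "tei-" in env.get("COMPOSE_PROFILES", ""):
        # TEI_MODEL wins if set; otherwise keep the configured model;
        # use the built-in default only when both are empty.
        model = env.get("TEI_MODEL") or model or BUILTIN_DEFAULT

    return model


if __name__ == "__main__":
    cfg = {"model": "bge-m3@Ollama"}
    print(resolve_embedding_model(cfg, {"COMPOSE_PROFILES": ""}))
    print(resolve_embedding_model({}, {"COMPOSE_PROFILES": "tei-gpu"}))
    print(resolve_embedding_model(cfg, {"COMPOSE_PROFILES": "tei-gpu", "TEI_MODEL": "my-model"}))
```

Passing the environment as a plain mapping keeps the sketch easy to exercise without touching the real process environment. In this sketch the non-TEI branch returns the configured value unchanged, since the table above only mentions the built-in default for the `tei-` profile.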


## Testing

Verified in Docker deployments:

1. **`COMPOSE_PROFILES=`** (no TEI)  
   → New tenants get `bge-m3@Ollama` as the default embedding model
2. **`COMPOSE_PROFILES=tei-gpu` with no `TEI_MODEL` set**  
   → Falls back to `BAAI/bge-small-en-v1.5`
3. **`COMPOSE_PROFILES=tei-gpu` with `TEI_MODEL=my-model`**  
   → New tenants use `my-model` as the embedding model

Closes #8916 
fix #11522 
fix #11306
2025-12-09 09:38:44 +08:00
..
| Name | Last commit | Date |
|------|-------------|------|
| `data_source` | Fix errors (#11804) | 2025-12-08 12:21:18 +08:00 |
| `__init__.py` | | |
| `config_utils.py` | Move some enumerate type to constants.py (#10998) | 2025-11-04 19:25:25 +08:00 |
| `connection_utils.py` | Fix: refine error msg. (#11380) | 2025-11-19 19:10:45 +08:00 |
| `constants.py` | feat(gcs): Add support for Google Cloud Storage (GCS) integration (#11718) | 2025-12-04 10:44:05 +08:00 |
| `decorator.py` | Move singleton to common directory (#10935) | 2025-11-02 12:24:08 +08:00 |
| `exceptions.py` | Feat: GraphRAG handle cancel gracefully (#11061) | 2025-11-06 16:12:20 +08:00 |
| `file_utils.py` | Refactor file utils (#10970) | 2025-11-03 18:54:55 +08:00 |
| `float_utils.py` | | |
| `http_client.py` | fix: align http client proxy kwarg (#11818) | 2025-12-09 09:35:03 +08:00 |
| `log_utils.py` | Feat: add Jira connector (#11285) | 2025-11-17 09:38:04 +08:00 |
| `mcp_tool_call_conn.py` | Refactor: move mcp connection utilities to common (#11304) | 2025-11-17 15:34:17 +08:00 |
| `misc_utils.py` | Refactor code (#11694) | 2025-12-03 15:15:00 +08:00 |
| `settings.py` | fix: set default embedding model for TEI profile in Docker deployment (#11824) | 2025-12-09 09:38:44 +08:00 |
| `signal_utils.py` | Fix and refactor imports (#11010) | 2025-11-05 11:07:54 +08:00 |
| `string_utils.py` | Fix errors (#11804) | 2025-12-08 12:21:18 +08:00 |
| `time_utils.py` | | |
| `token_utils.py` | fix(llm): handle None response in total_token_count_from_response (#10941) | 2025-11-20 10:04:03 +08:00 |
| `versions.py` | Admin: add 'show version' (#11079) | 2025-11-06 19:24:46 +08:00 |