ragflow/rag
6607changchun 9580e99650
fix: retry embedding with Qwen family models when limits temporarily reached. (#8690)

APIs of Qwen family models are rate limited. When the limit is reached,
the "output" attribute of "resp" is None, which in turn causes a
TypeError when trying to retrieve "embeddings". Since these limits are
usually temporary, I have added a simple retry mechanism to work around
them. In addition, once retry_max is reached, the error is raised early
instead of being hidden behind a "TypeError".

### What problem does this PR solve?

Sometimes Qwen rejects calls due to rate limits, which causes the whole
parsing procedure to stop when creating a knowledge base. In this
situation, resp["output"] is None, and resp["output"]["embeddings"]
raises a TypeError. Since the limits are temporary, I apply a simple
retry mechanism to solve it.
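
The retry idea described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `embed_with_retry`, `call_api`, `delay`, and `make_flaky_api` are hypothetical names, and the response is assumed to behave like a dict whose "output" is None while the rate limit is in effect.

```python
import time


def embed_with_retry(call_api, texts, retry_max=5, delay=1.0):
    """Call `call_api(texts)` until the response has a usable "output".

    `call_api` is a hypothetical stand-in for the real embedding request.
    Raises RuntimeError once `retry_max` attempts have been used up.
    """
    resp = None
    for attempt in range(retry_max):
        resp = call_api(texts)
        output = resp.get("output")
        # A rate-limited response has output=None; indexing it directly
        # would raise TypeError, so check before reading "embeddings".
        if output is not None and "embeddings" in output:
            return output["embeddings"]
        if attempt < retry_max - 1:
            time.sleep(delay)  # wait before the next attempt
    # Surface the real cause instead of an opaque TypeError.
    raise RuntimeError(
        f"embedding failed after {retry_max} attempts; last response: {resp}"
    )


def make_flaky_api(failures):
    """Build a fake API that is rate limited for the first `failures` calls."""
    state = {"calls": 0}

    def api(texts):
        state["calls"] += 1
        if state["calls"] <= failures:
            return {"output": None, "message": "rate limit reached"}
        return {"output": {"embeddings": [[0.0, 0.0, 0.0] for _ in texts]}}

    return api, state


if __name__ == "__main__":
    api, state = make_flaky_api(failures=2)
    vectors = embed_with_retry(api, ["a", "b"], retry_max=5, delay=0)
    print(len(vectors), state["calls"])
```

A fixed delay keeps the sketch short; a backoff that grows between attempts would be a reasonable alternative if the limits take longer to clear.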

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-07-07 12:15:52 +08:00
app Fix: docx parse error. (#8600) 2025-07-01 17:38:11 +08:00
llm fix: retry embedding with Qwen family models when limits temporarily reached. (#8690) 2025-07-07 12:15:52 +08:00
nlp Refa: improve GraphRAG similarity sensitivity to numeric differences (#8479) 2025-06-25 16:20:59 +08:00
prompts Refa: refactor prompts into markdown-style structure using Jinja2 (#8667) 2025-07-04 15:59:41 +08:00
res Perf: ignore concate between rows. (#8507) 2025-06-26 14:55:37 +08:00
svr Fix: The data set created by API call failed to parse after uploading the file. (#8657) 2025-07-04 12:41:28 +08:00
utils Fix: add ES re-connect once request timeout. (#8678) 2025-07-07 09:22:25 +08:00
__init__.py Update comments (#4569) 2025-01-21 20:52:28 +08:00
benchmark.py Refactor embedding batch_size (#3825) 2024-12-03 16:22:39 +08:00
prompt_template.py Refa: refactor prompts into markdown-style structure using Jinja2 (#8667) 2025-07-04 15:59:41 +08:00
prompts.py Fix a small typo in count of used fragments (#8673) 2025-07-04 19:46:31 +08:00
raptor.py Refa: limit embedding concurrency and fix chat_with_tool (#8543) 2025-06-27 19:28:41 +08:00
settings.py Feat: make document parsing and embedding batch sizes configurable via environment variables (#8266) 2025-06-16 13:40:47 +08:00