fix: prevent redundant retries in async_chat_streamly upon success (#11832)
## What changes were proposed in this pull request?

Added a `return` statement after the successful completion of the `async for` loop in `async_chat_streamly`.

## Why are the changes needed?

Previously, the code lacked a break/return mechanism inside the `try` block. This caused the retry loop (`for attempt in range(...)`) to continue executing even after the LLM response was successfully generated and yielded, resulting in duplicate requests (up to `max_retries` times).

## Does this PR introduce any user-facing change?

No (it fixes an internal logic bug).
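Not the RAGFlow implementation, just a minimal runnable sketch of the control flow described above; `fake_llm_call`, `MAX_RETRIES`, and the two generator variants are hypothetical names used only to show why a missing `return` duplicates requests:

```python
import asyncio

MAX_RETRIES = 3   # hypothetical stand-in for self.max_retries
call_count = 0    # counts backend invocations to expose duplicate requests

async def fake_llm_call():
    """Hypothetical streaming backend; not a real LLM client."""
    global call_count
    call_count += 1
    for chunk in ("Hel", "lo"):
        yield chunk

async def chat_streamly_buggy():
    # Pre-fix shape: nothing stops the retry loop after a successful stream,
    # so every remaining attempt re-issues the request.
    for _attempt in range(MAX_RETRIES):
        try:
            async for chunk in fake_llm_call():
                yield chunk
            # missing return -> falls through to the next attempt
        except Exception:
            continue

async def chat_streamly_fixed():
    # Post-fix shape: return as soon as the stream completes successfully.
    for _attempt in range(MAX_RETRIES):
        try:
            async for chunk in fake_llm_call():
                yield chunk
            return
        except Exception:
            continue

async def main():
    global call_count
    print([c async for c in chat_streamly_buggy()], "backend calls:", call_count)
    # ['Hel', 'lo', 'Hel', 'lo', 'Hel', 'lo'] backend calls: 3
    call_count = 0
    print([c async for c in chat_streamly_fixed()], "backend calls:", call_count)
    # ['Hel', 'lo'] backend calls: 1

asyncio.run(main())
```

The only difference between the two generators is the `return` after the inner loop, which is what this commit adds to `async_chat_streamly`.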
This commit is contained in:
parent bb6022477e
commit 9863862348
1 changed file with 3 additions and 2 deletions
@@ -187,6 +187,9 @@ class Base(ABC):
                     ans = delta_ans
                     total_tokens += tol
                     yield ans
+
+                yield total_tokens
+                return
             except Exception as e:
                 e = await self._exceptions_async(e, attempt)
                 if e:
@@ -194,8 +197,6 @@ class Base(ABC):
                     yield total_tokens
                     return
 
-        yield total_tokens
-
     def _length_stop(self, ans):
         if is_chinese([ans]):
             return ans + LENGTH_NOTIFICATION_CN
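For context on the stream shape these hunks produce (text chunks first, then a final integer token count), here is a hypothetical consumer sketch; `stream_like_async_chat_streamly` is a stand-in with the same yield pattern, not the real method:

```python
import asyncio

async def stream_like_async_chat_streamly():
    """Hypothetical stand-in mirroring the patched generator's output shape:
    text chunks first, then the accumulated token count as the last item."""
    total_tokens = 0
    for ans, tol in (("Hel", 2), ("Hello", 3)):
        total_tokens += tol
        yield ans            # mirrors `yield ans` inside the async for loop
    yield total_tokens       # mirrors the newly added `yield total_tokens`
    return                   # mirrors the newly added `return`

async def main():
    chunks, tokens = [], 0
    async for item in stream_like_async_chat_streamly():
        if isinstance(item, int):
            tokens = item        # final item: token usage
        else:
            chunks.append(item)  # text chunk
    print(chunks, tokens)        # -> ['Hel', 'Hello'] 5

asyncio.run(main())
```

Before this fix, a consumer like this could receive the same chunks repeated once per retry attempt, even though the first attempt had already succeeded.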