Adjust OpenAI temperature default and add mitigation guidance
parent 4a97e9f469
commit e644a3e02f
1 changed file with 2 additions and 1 deletion
@@ -175,7 +175,8 @@ LLM_BINDING_API_KEY=your_api_key
# LLM_BINDING=openai
### OpenAI Compatible API Specific Parameters
OPENAI_LLM_TEMPERATURE=0.8
### Increased temperature values may mitigate infinite inference loops in certain LLMs, such as Qwen3-30B.
# OPENAI_LLM_TEMPERATURE=0.9
### Set max_tokens to mitigate endless output from some LLMs (keep it below LLM_TIMEOUT * llm_output_tokens/second, e.g. 9000 = 180s * 50 tokens/s)
### Typically, max_tokens does not include prompt content, though some models, such as Gemini models, are exceptions
### For vLLM/SGLang-deployed models, or most OpenAI-compatible API providers
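
As a rough illustration of the sizing rule in the comments above, here is a minimal Python sketch that derives a max_tokens budget from LLM_TIMEOUT. The helper name `safe_max_tokens` is hypothetical (not part of the committed file); the 180s timeout and 50 tokens/s throughput figures come from the example in the diff itself.

```python
# Minimal sketch: derive a max_tokens budget from LLM_TIMEOUT, following the
# rule in the diff above (max_tokens < LLM_TIMEOUT * output tokens/second).
# The helper name is an assumption; 180s and 50 tok/s match the diff's example.
import os

def safe_max_tokens(tokens_per_second: float = 50.0) -> int:
    """Return a max_tokens value the model can emit within LLM_TIMEOUT."""
    timeout_s = float(os.environ.get("LLM_TIMEOUT", "180"))  # seconds
    return int(timeout_s * tokens_per_second)  # e.g. 180s * 50 tok/s = 9000

print(safe_max_tokens())  # 9000 with the defaults above
```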