Adjust OpenAI temperature default and add mitigation guidance

yangdx 2025-09-17 02:56:05 +08:00
parent 4a97e9f469
commit e644a3e02f

@@ -175,7 +175,8 @@ LLM_BINDING_API_KEY=your_api_key
 # LLM_BINDING=openai
 ### OpenAI Compatible API Specific Parameters
-OPENAI_LLM_TEMPERATURE=0.8
+### Increased temperature values may mitigate infinite inference loops in certain LLMs, such as Qwen3-30B.
+# OPENAI_LLM_TEMPERATURE=0.9
 ### Set max_tokens to mitigate endless output from some LLMs (keep it below LLM_TIMEOUT * llm_output_tokens/second, e.g. 9000 = 180s * 50 tokens/s)
 ### Typically, max_tokens does not include prompt content, though some models, such as Gemini models, are exceptions
 ### For vLLM/SGLang deployed models, or most OpenAI-compatible API providers
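
The max_tokens rule in the comments above is a simple product of the request timeout and the model's output speed. A minimal sketch of that arithmetic follows, using the 180 s timeout and 50 tokens/s throughput from the comment as example figures. The function name `max_tokens_budget` and the `safety` margin parameter are hypothetical and not part of the project; measure your own deployment's throughput before picking a value.

```python
import math


def max_tokens_budget(timeout_s: float, tokens_per_s: float, safety: float = 1.0) -> int:
    """Return an upper bound on max_tokens for a given timeout.

    Hypothetical helper: timeout_s corresponds to LLM_TIMEOUT, tokens_per_s is
    the measured output speed of the model, and safety (0 < safety <= 1) is an
    assumed margin for keeping the configured value below the bound.
    """
    if timeout_s <= 0 or tokens_per_s <= 0:
        raise ValueError("timeout_s and tokens_per_s must be positive")
    if not 0 < safety <= 1:
        raise ValueError("safety must be in (0, 1]")
    return math.floor(timeout_s * tokens_per_s * safety)


# Example figures from the comment: 180 s timeout, 50 tokens/s.
budget = max_tokens_budget(180, 50)
print(budget)  # 9000
```

Because the comment asks for a value below this product, the result is best read as a ceiling; passing a `safety` value under 1 leaves headroom for slower responses.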