Update OpenAI API config docs for max_tokens and max_completion_tokens

• Clarify max_tokens vs max_completion_tokens
• Add Gemini exception note
• Update parameter descriptions
• Add new completion tokens option
This commit is contained in:
yangdx 2025-09-10 16:23:10 +08:00
parent e3ebf45a18
commit 4a21b7f53f


@@ -174,9 +174,12 @@ LLM_BINDING_API_KEY=your_api_key
 # LLM_BINDING_API_KEY=your_api_key
 # LLM_BINDING=openai
-### OpenAI Specific Parameters
-### Set the max_output_tokens to mitigate endless output of some LLM (less than LLM_TIMEOUT * llm_output_tokens/second, i.e. 9000 = 180s * 50 tokens/s)
+### OpenAI Compatible API Specific Parameters
+### Set the max_tokens to mitigate endless output of some LLM (less than LLM_TIMEOUT * llm_output_tokens/second, i.e. 9000 = 180s * 50 tokens/s)
+### Typically, max_tokens does not include prompt content, though some models, such as Gemini-2.5-Flash, are exceptions
+#### OpenAI's new API utilizes max_completion_tokens instead of max_tokens
 # OPENAI_LLM_MAX_TOKENS=9000
+# OPENAI_LLM_MAX_COMPLETION_TOKENS=9000
 ### OpenRouter Specific Parameters
 # OPENAI_LLM_EXTRA_BODY='{"reasoning": {"enabled": false}}'
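The distinction the diff documents: newer OpenAI endpoints reject `max_tokens` in favor of `max_completion_tokens`, while most OpenAI-compatible servers still expect `max_tokens`. A minimal sketch of how the two environment variables above could be mapped onto request kwargs; the helper name `build_llm_kwargs` and its defaulting logic are illustrative assumptions, not code from this repository:

```python
import os

def build_llm_kwargs(use_completion_tokens: bool) -> dict:
    """Pick the output-limit kwarg for an OpenAI-style chat request.

    use_completion_tokens=True targets OpenAI's newer API, which
    replaced max_tokens with max_completion_tokens; False targets
    older OpenAI-compatible endpoints. Defaults mirror the example
    value of 9000 tokens used in the config comments.
    """
    if use_completion_tokens:
        limit = int(os.environ.get("OPENAI_LLM_MAX_COMPLETION_TOKENS", "9000"))
        return {"max_completion_tokens": limit}
    limit = int(os.environ.get("OPENAI_LLM_MAX_TOKENS", "9000"))
    return {"max_tokens": limit}

# The returned dict can be splatted into a chat-completions call,
# e.g. client.chat.completions.create(model=..., messages=..., **kwargs).
```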