Update OpenAI API config docs for max_tokens and max_completion_tokens
- Clarify max_tokens vs max_completion_tokens
- Add Gemini exception note
- Update parameter descriptions
- Add new completion tokens option
This commit is contained in:
parent e3ebf45a18
commit 4a21b7f53f

1 changed file with 5 additions and 2 deletions
@@ -174,9 +174,12 @@ LLM_BINDING_API_KEY=your_api_key
 # LLM_BINDING_API_KEY=your_api_key
 # LLM_BINDING=openai

-### OpenAI Specific Parameters
-### Set the max_output_tokens to mitigate endless output of some LLM (less than LLM_TIMEOUT * llm_output_tokens/second, i.e. 9000 = 180s * 50 tokens/s)
+### OpenAI Compatible API Specific Parameters
+### Set the max_tokens to mitigate endless output of some LLM (less than LLM_TIMEOUT * llm_output_tokens/second, i.e. 9000 = 180s * 50 tokens/s)
+### Typically, max_tokens does not include prompt content, though some models, such as Gemini-2.5-Flash, are exceptions
+#### OpenAI's new API utilizes max_completion_tokens instead of max_tokens
 # OPENAI_LLM_MAX_TOKENS=9000
+# OPENAI_LLM_MAX_COMPLETION_TOKENS=9000

 ### OpenRouter Specific Parameters
 # OPENAI_LLM_EXTRA_BODY='{"reasoning": {"enabled": false}}'
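The max_tokens vs max_completion_tokens split means client code has to pick one parameter name per endpoint: newer OpenAI endpoints reject max_tokens and expect max_completion_tokens, while most OpenAI-compatible servers still take max_tokens. A minimal sketch of that selection, assuming a hypothetical helper (`build_token_limit_kwargs` is not part of this repo) that reads the env vars documented in the diff:

```python
import os

def build_token_limit_kwargs(use_completion_tokens: bool) -> dict:
    """Hypothetical helper: return the token-limit kwarg to merge into a
    chat-completion request, based on which parameter the endpoint accepts.

    Reads the env vars from the config above:
      OPENAI_LLM_MAX_TOKENS             -> legacy max_tokens
      OPENAI_LLM_MAX_COMPLETION_TOKENS  -> newer max_completion_tokens
    Returns {} when the corresponding variable is unset, so the request
    falls back to the server default.
    """
    if use_completion_tokens:
        value = os.environ.get("OPENAI_LLM_MAX_COMPLETION_TOKENS")
        return {"max_completion_tokens": int(value)} if value else {}
    value = os.environ.get("OPENAI_LLM_MAX_TOKENS")
    return {"max_tokens": int(value)} if value else {}

# Mirror the documented default from the diff above.
os.environ["OPENAI_LLM_MAX_TOKENS"] = "9000"
print(build_token_limit_kwargs(use_completion_tokens=False))
```

The kwargs dict would then be splatted into the actual request call; keeping the two env vars separate lets one deployment talk to both old- and new-style endpoints.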
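The sizing comment in the diff above (keep the limit under LLM_TIMEOUT * output tokens/second, e.g. 9000 = 180 s * 50 tokens/s) and the Gemini exception (some models count the prompt against the limit) can be combined into one rule of thumb. A sketch, with a hypothetical `safe_token_limit` helper not present in the repo:

```python
def safe_token_limit(timeout_s: int, tokens_per_s: int,
                     prompt_tokens: int = 0,
                     limit_includes_prompt: bool = False) -> int:
    """Hypothetical sizing helper for the rule documented above.

    Output budget = timeout * expected generation speed, so a response
    cannot outlast LLM_TIMEOUT. For models whose limit also counts the
    prompt (the note names Gemini-2.5-Flash as such an exception), the
    configured limit must additionally cover the prompt tokens, or the
    usable output budget shrinks by the prompt size.
    """
    budget = timeout_s * tokens_per_s
    if limit_includes_prompt:
        budget += prompt_tokens  # limit must cover prompt + output
    return budget

# The worked example from the docs: 9000 = 180 s * 50 tokens/s
print(safe_token_limit(180, 50))
```

For a Gemini-style model with a 2000-token prompt, the same 180 s / 50 tokens/s budget would call for a limit of 11000 rather than 9000.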