Replace hardcoded DEFAULT_MODEL and DEFAULT_SMALL_MODEL constants across all LLM clients with a centralized, configurable provider defaults system.
Key changes:
- Created provider_defaults.py with centralized configuration for all providers
- Added environment variable support for easy customization (e.g., GEMINI_DEFAULT_MODEL)
- Updated all LLM clients to use configurable defaults instead of hardcoded constants
- Made edge operations max_tokens configurable via EXTRACT_EDGES_MAX_TOKENS
- Updated cross-encoder reranker clients to use provider defaults
- Maintained full backward compatibility with existing configurations
This resolves the issue where Gemini's flash-lite model has location constraints in Vertex AI that differ from the regular flash model, and users couldn't easily override these without editing source code.
Environment variables now supported:
- {PROVIDER}_DEFAULT_MODEL
- {PROVIDER}_DEFAULT_SMALL_MODEL
- {PROVIDER}_DEFAULT_MAX_TOKENS
- {PROVIDER}_DEFAULT_TEMPERATURE
- {PROVIDER}_EXTRACT_EDGES_MAX_TOKENS
- EXTRACT_EDGES_MAX_TOKENS (global fallback)
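The lookup order implied by the variables above can be sketched as follows. This is a minimal illustration and not the actual contents of provider_defaults.py: the ProviderDefaults dataclass, the resolve_defaults helper, and the placeholder built-in values are all assumptions made for this example. Only the environment variable names come from this commit.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderDefaults:
    # Hypothetical container; the field names mirror the env var suffixes.
    model: str
    small_model: str
    max_tokens: int
    temperature: float
    extract_edges_max_tokens: int


# Placeholder values for illustration only, not the library's real defaults.
PLACEHOLDER_BUILTIN = ProviderDefaults(
    model="example-model",
    small_model="example-small-model",
    max_tokens=8192,
    temperature=0.0,
    extract_edges_max_tokens=16384,
)


def _env(name, fallback, cast=str):
    """Return the env var cast to the target type, or the fallback if unset or empty."""
    raw = os.environ.get(name)
    if raw is None or raw == "":
        return fallback
    return cast(raw)


def resolve_defaults(provider: str, builtin: ProviderDefaults) -> ProviderDefaults:
    """Resolve defaults: provider-specific env var, then global fallback, then built-in."""
    prefix = provider.upper()
    # The global EXTRACT_EDGES_MAX_TOKENS applies only when the
    # provider-specific variable is not set.
    global_edges = _env(
        "EXTRACT_EDGES_MAX_TOKENS", builtin.extract_edges_max_tokens, int
    )
    return ProviderDefaults(
        model=_env(f"{prefix}_DEFAULT_MODEL", builtin.model),
        small_model=_env(f"{prefix}_DEFAULT_SMALL_MODEL", builtin.small_model),
        max_tokens=_env(f"{prefix}_DEFAULT_MAX_TOKENS", builtin.max_tokens, int),
        temperature=_env(
            f"{prefix}_DEFAULT_TEMPERATURE", builtin.temperature, float
        ),
        extract_edges_max_tokens=_env(
            f"{prefix}_EXTRACT_EDGES_MAX_TOKENS", global_edges, int
        ),
    )
```

Under these assumptions, setting GEMINI_DEFAULT_SMALL_MODEL before calling resolve_defaults("gemini", PLACEHOLDER_BUILTIN) overrides only the small model and leaves every other field at its built-in value.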
Fixes #681
Co-authored-by: Daniel Chalef <danielchalef@users.noreply.github.com>
The Gemini cross_encoder already accepted a custom client.
This change replicates the same input pattern for embedder and llm_client.
Passing your own client lets you use custom API endpoints and other client options, for example:
    cross_encoder=GeminiRerankerClient(
        client=genai.Client(
            api_key=os.environ.get('GOOGLE_GENAI_API_KEY'),
            http_options=types.HttpOptions(api_version='v1alpha')),
        config=LLMConfig(
            model="gemini-2.5-flash-lite-preview-06-17"
        )
    ))
* add support for Gemini 2.5 model thinking budget
* allow adding thinking config to support current and future gemini models
* merge
* improve client; add reranker
* refactor: change type hint for gemini_messages to Any for flexibility
* refactor: update GeminiRerankerClient to use direct relevance scoring and improve ranking logic. Add tests
* fix fixtures
---------
Co-authored-by: realugbun <github.disorder751@passmail.net>
* Bump version from 0.9.0 to 0.9.1 in pyproject.toml and update google-genai dependency to >=0.1.0
* Bump version from 0.9.1 to 0.9.2 in pyproject.toml
* Update google-genai dependency version to >=0.8.0 in pyproject.toml
* lock file
* Update pyproject.toml to version 0.9.3, restructure dependencies, and modify author format. Remove outdated Google API key note from README.md.
* upgrade poetry and ruff
* allow usage of different OpenAI-compatible clients in embedder and encoder
* azure openai
* cross encoder example
---------
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
* implement so
* bug fixes and typing
* inject schema for non-openai clients
* correct datetime format
* remove List keyword
* Refactor node_operations.py to use updated prompt_library functions
* update example