* Add dynamic max_tokens configuration for Anthropic models
Implements model-specific max output token limits for AnthropicClient,
following the same pattern as GeminiClient. This replaces the previous
hardcoded min() cap that was preventing models from using their full
output capacity.
Changes:
- Added ANTHROPIC_MODEL_MAX_TOKENS mapping with limits for all supported
Claude models (ranging from 4K to 65K tokens)
- Implemented _get_max_tokens_for_model() to look up model-specific limits
- Implemented _resolve_max_tokens() with clear precedence rules:
1. Explicit max_tokens parameter
2. Instance max_tokens from initialization
3. Model-specific limit from mapping
4. Default fallback (8192 tokens)
This allows edge_operations.py to request 16384 tokens for edge extraction
without being artificially capped, while cheaper models with lower limits
are still handled correctly.
Resolves TODO in anthropic_client.py:207-208.
* Clarify that max_tokens mapping represents standard limits
Updated comments to state explicitly that ANTHROPIC_MODEL_MAX_TOKENS
represents standard limits without beta headers. This prevents confusion
with extended limits (e.g., Claude 3.7's 128K output with a beta header),
which are not currently implemented in this mapping.