* Update default Anthropic model to claude-haiku-4-5-latest
- Add Claude 4.5 models to AnthropicModel type (claude-sonnet-4-5-latest, claude-sonnet-4-5-20250929, claude-haiku-4-5-latest)
- Change DEFAULT_MODEL from claude-3-7-sonnet-latest to claude-haiku-4-5-latest
- Update test assertions to reflect new default model
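A minimal sketch of the shape of this change (entries beyond the models named above are illustrative):

```python
from typing import Literal

AnthropicModel = Literal[
    'claude-sonnet-4-5-latest',
    'claude-sonnet-4-5-20250929',
    'claude-haiku-4-5-latest',
    'claude-3-7-sonnet-latest',  # previous default, still supported
]

DEFAULT_MODEL: AnthropicModel = 'claude-haiku-4-5-latest'
```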
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add Claude 4.5 models to max_tokens mapping
- Add claude-sonnet-4-5-latest, claude-sonnet-4-5-20250929, and claude-haiku-4-5-latest to ANTHROPIC_MODEL_MAX_TOKENS
- All Claude 4.5 models support 64K (65536) max output tokens
- Based on official Anthropic documentation
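A sketch of the new entries (values per this commit; surrounding entries elided):

```python
ANTHROPIC_MODEL_MAX_TOKENS = {
    # Claude 4.5 models: 64K (65536) max output tokens, standard limits
    'claude-sonnet-4-5-latest': 65536,
    'claude-sonnet-4-5-20250929': 65536,
    'claude-haiku-4-5-latest': 65536,
    # ... entries for older Claude models ...
}
```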
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update uv.lock dependencies
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Add dynamic max_tokens configuration for Anthropic models
Implements model-specific max output token limits for AnthropicClient,
following the same pattern as GeminiClient. This replaces the previous
hardcoded min() cap that was preventing models from using their full
output capacity.
Changes:
- Added ANTHROPIC_MODEL_MAX_TOKENS mapping with limits for all supported
Claude models (ranging from 4K to 64K tokens)
- Implemented _get_max_tokens_for_model() to lookup model-specific limits
- Implemented _resolve_max_tokens() with clear precedence rules:
1. Explicit max_tokens parameter
2. Instance max_tokens from initialization
3. Model-specific limit from mapping
4. Default fallback (8192 tokens)
This allows edge_operations.py to request 16384 tokens for edge extraction
without being artificially capped, while ensuring cheaper models with lower
limits are still properly handled.
Resolves TODO in anthropic_client.py:207-208.
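A minimal sketch of the precedence logic described above (exact signatures in anthropic_client.py may differ; comment numbers match the precedence rules listed):

```python
DEFAULT_MAX_TOKENS = 8192  # rule 4: fallback when no other limit applies


def _get_max_tokens_for_model(self, model: str) -> int:
    # Rules 3-4: model-specific standard limit, else the default fallback
    return ANTHROPIC_MODEL_MAX_TOKENS.get(model, DEFAULT_MAX_TOKENS)


def _resolve_max_tokens(self, max_tokens: int | None, model: str) -> int:
    # Rule 1: explicit max_tokens parameter wins
    if max_tokens is not None:
        return max_tokens
    # Rule 2: instance max_tokens from initialization
    if self.max_tokens is not None:
        return self.max_tokens
    # Rules 3-4: model-specific limit, falling back to the default
    return self._get_max_tokens_for_model(model)
```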
* Clarify that max_tokens mapping represents standard limits
Updated comments to explicitly state that ANTHROPIC_MODEL_MAX_TOKENS
represents standard limits without beta headers. This prevents confusion
about extended limits (e.g., Claude 3.7's 128K with beta header) which
are not currently implemented in this mapping.
* Change the `to_prompt_json()` helper to default to minified JSON (no indentation) instead of 2-space indentation. This reduces token consumption in LLM prompts while preserving all necessary information.
- Changed default `indent` parameter from `2` to `None` in `prompt_helpers.py`
- Updated all prompt modules to remove explicit `indent=2` arguments
- Minor code formatting fixes in LLM clients
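A sketch of the changed default (the helper body is an assumption):

```python
import json
from typing import Any


def to_prompt_json(data: Any, indent: int | None = None) -> str:
    # indent=None makes json.dumps emit minified JSON, so prompts spend
    # no tokens on whitespace; pass indent=2 to restore pretty output
    return json.dumps(data, indent=indent)
```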
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
* Add OpenTelemetry distributed tracing support
- Add tracer abstraction with no-op and OpenTelemetry implementations
- Instrument add_episode and add_episode_bulk with tracing spans
- Instrument LLM client with cache-aware tracing
- Add configurable span name prefix support
- Refactor add_episode methods to improve code quality
- Add OTEL_TRACING.md documentation
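A hypothetical sketch of the tracer abstraction (class and method names are assumptions; `start_as_current_span` is the real OpenTelemetry API):

```python
from abc import ABC, abstractmethod
from contextlib import AbstractContextManager, nullcontext


class NoOpSpan:
    """Span stand-in that records nothing."""

    def set_attribute(self, key, value): ...
    def record_exception(self, exc): ...
    def set_status(self, *args): ...


class Tracer(ABC):
    @abstractmethod
    def start_span(self, name: str) -> AbstractContextManager: ...


class NoOpTracer(Tracer):
    def start_span(self, name: str) -> AbstractContextManager:
        return nullcontext(NoOpSpan())  # yields a NoOpSpan; no data recorded


class OpenTelemetryTracer(Tracer):
    def __init__(self, otel_tracer, span_prefix: str = 'graphiti') -> None:
        self._tracer = otel_tracer  # an opentelemetry.trace.Tracer
        self._prefix = span_prefix  # configurable span name prefix

    def start_span(self, name: str) -> AbstractContextManager:
        return self._tracer.start_as_current_span(f'{self._prefix}.{name}')
```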
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix linting errors in tracing implementation
- Remove unused episodes_by_uuid variable
- Fix tracer type annotations for context manager support
- Replace isinstance tuple with union syntax
- Use contextlib.suppress for exception handling
- Fix import ordering and use AbstractContextManager
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Address PR review feedback on tracing implementation
Critical fixes:
- Remove flawed error span creation in graphiti.py that created orphaned spans
- Restructure LLM client tracing to create span once at start, eliminating code duplication
- Initialize LLM client tracer to NoOpTracer by default to fix type checking
Enhancements:
- Add comprehensive span attributes to add_episode: reference_time, entity/edge type counts, previous episodes count, invalidated edge count, community count
- Optimize isinstance check for better performance
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add prompt name tracking to OpenTelemetry tracing spans
Add prompt_name parameter to all LLM client generate_response() methods
and set it as a span attribute in the llm.generate span. This enables
better observability by identifying which prompt template was used for
each LLM call.
Changes:
- Add prompt_name parameter to LLMClient.generate_response() base method
- Add prompt_name parameter and tracing to OpenAIBaseClient,
AnthropicClient, GeminiClient, and OpenAIGenericClient
- Update all 14 LLM call sites across maintenance operations to include
prompt_name:
- edge_operations.py: 4 calls
- node_operations.py: 6 calls (7 were listed in review, but only 6 are unique)
- temporal_operations.py: 2 calls
- community_operations.py: 2 calls
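A minimal sketch of the extended method (the surrounding parameters and the attribute key are assumptions):

```python
async def generate_response(
    self,
    messages,
    response_model=None,
    max_tokens: int | None = None,
    prompt_name: str | None = None,
):
    with self.tracer.start_span('llm.generate') as span:
        if prompt_name is not None:
            span.set_attribute('prompt.name', prompt_name)
        ...  # existing generation logic
```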
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix exception handling in add_episode to record errors in OpenTelemetry span
Moved try-except block inside the OpenTelemetry span context and added
proper error recording with span.set_status() and span.record_exception().
This ensures exceptions are captured in the distributed trace, matching
the pattern used in add_episode_bulk.
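Roughly the resulting shape (`Status`, `StatusCode`, and `record_exception` are the real OpenTelemetry API; the span name is illustrative):

```python
from opentelemetry import trace
from opentelemetry.trace import Status, StatusCode

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span('add_episode') as span:
    try:
        ...  # episode processing
    except Exception as e:
        # Record the failure on the span before re-raising
        span.set_status(Status(StatusCode.ERROR, str(e)))
        span.record_exception(e)
        raise
```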
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* remove temporary debug logging
* add Anthropic API key to .env.example
* move anthropic int tests to llm_client dir to better match existing test structure
* rename `TestLLMClient` to `MockLLMClient` to eliminate a pytest collection warning
* Fix: use self.max_tokens when max_tokens isn't specified
* Fix: use self.max_tokens in OpenAI clients
* Fix: use self.max_tokens in Anthropic client
* Fix: use self.max_tokens in Gemini client
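The common pattern behind these four fixes, roughly (a hypothetical helper for illustration):

```python
def _effective_max_tokens(self, max_tokens: int | None) -> int:
    # Fall back to the instance-level setting when the caller passes nothing
    return max_tokens if max_tokens is not None else self.max_tokens
```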
* update Anthropic client to use tool calling and add tests
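A hedged sketch of tool-based structured output with the Anthropic SDK (the tool name and prompt variables are hypothetical; `messages.create`, `tools`, and `tool_choice` are the real API):

```python
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model='claude-haiku-4-5-latest',
    max_tokens=8192,
    messages=[{'role': 'user', 'content': user_prompt}],
    tools=[{
        'name': 'return_structured_output',  # hypothetical tool name
        'description': 'Return the answer as structured JSON.',
        'input_schema': response_model.model_json_schema(),
    }],
    # Force the model to answer via the tool, guaranteeing parseable JSON
    tool_choice={'type': 'tool', 'name': 'return_structured_output'},
)

tool_use = next(block for block in response.content if block.type == 'tool_use')
structured = tool_use.input  # dict matching the requested schema
```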
* fix linting errors before creating pull request by defining literal types for Anthropic models
* implement structured output
* bug fixes and typing
* inject schema for non-openai clients
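A sketch of the schema-injection idea (hypothetical helper; `model_json_schema()` is the real Pydantic v2 API):

```python
import json

from pydantic import BaseModel


def inject_schema(prompt: str, response_model: type[BaseModel]) -> str:
    # Clients without native structured output get the JSON schema
    # appended to the prompt and are asked to conform to it
    schema = json.dumps(response_model.model_json_schema())
    return f'{prompt}\n\nRespond with a JSON object matching this schema:\n{schema}'
```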
* correct datetime format
* remove `List` keyword
* Refactor node_operations.py to use updated prompt_library functions
* update example
* Override default max tokens for Anthropic and Groq clients