Commit graph

17 commits

Author SHA1 Message Date
Daniel Chalef
8b7ad6f84c
Update default Anthropic model to claude-haiku-4-5 (#1070)
* Update default Anthropic model to claude-haiku-4-5-latest

- Add Claude 4.5 models to AnthropicModel type (claude-sonnet-4-5-latest, claude-sonnet-4-5-20250929, claude-haiku-4-5-latest)
- Change DEFAULT_MODEL from claude-3-7-sonnet-latest to claude-haiku-4-5-latest
- Update test assertions to reflect new default model

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add Claude 4.5 models to max_tokens mapping

- Add claude-sonnet-4-5-latest, claude-sonnet-4-5-20250929, and claude-haiku-4-5-latest to ANTHROPIC_MODEL_MAX_TOKENS
- All Claude 4.5 models support 64K (65536) max output tokens
- Based on official Anthropic documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update uv.lock dependencies

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-15 09:07:18 -08:00
Matthew Mo
50bcb74502
Add dynamic max_tokens configuration for Anthropic models (#1043)
* Add dynamic max_tokens configuration for Anthropic models

Implements model-specific max output token limits for AnthropicClient,
following the same pattern as GeminiClient. This replaces the previous
hardcoded min() cap that was preventing models from using their full
output capacity.

Changes:
- Added ANTHROPIC_MODEL_MAX_TOKENS mapping with limits for all supported
  Claude models (ranging from 4K to 65K tokens)
- Implemented _get_max_tokens_for_model() to look up model-specific limits
- Implemented _resolve_max_tokens() with clear precedence rules:
  1. Explicit max_tokens parameter
  2. Instance max_tokens from initialization
  3. Model-specific limit from mapping
  4. Default fallback (8192 tokens)

This allows edge_operations.py to request 16384 tokens for edge extraction
without being artificially capped, while ensuring cheaper models with lower
limits are still properly handled.
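
The precedence rules listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from anthropic_client.py: the standalone function `resolve_max_tokens`, its parameter names, and the `example-small-model` entry are hypothetical, and the mapping shows only an illustrative subset.

```python
from typing import Optional

# Illustrative subset only; the real ANTHROPIC_MODEL_MAX_TOKENS mapping is not reproduced here.
ANTHROPIC_MODEL_MAX_TOKENS = {
    "claude-haiku-4-5-latest": 65536,  # 64K, as stated in the #1070 commit message
    "example-small-model": 4096,  # hypothetical entry standing in for a 4K model
}
DEFAULT_MAX_TOKENS = 8192  # default fallback named in the commit message


def resolve_max_tokens(
    requested: Optional[int],
    instance_max_tokens: Optional[int],
    model: str,
) -> int:
    """Return the first limit that is set, following the four-step order."""
    if requested is not None:  # 1. explicit max_tokens parameter
        return requested
    if instance_max_tokens is not None:  # 2. instance value from initialization
        return instance_max_tokens
    if model in ANTHROPIC_MODEL_MAX_TOKENS:  # 3. model-specific limit
        return ANTHROPIC_MODEL_MAX_TOKENS[model]
    return DEFAULT_MAX_TOKENS  # 4. default fallback
```

In this sketch a call that passes 16384 explicitly gets 16384 back regardless of the instance setting, which mirrors the edge extraction case described above.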

Resolves TODO in anthropic_client.py:207-208.

* Clarify that max_tokens mapping represents standard limits

Updated comments to explicitly state that ANTHROPIC_MODEL_MAX_TOKENS
represents standard limits without beta headers. This prevents confusion
about extended limits (e.g., Claude 3.7's 128K with beta header) which
are not currently implemented in this mapping.
2025-11-14 08:34:56 -08:00
Daniel Chalef
196eb2f077
Remove JSON indentation from prompts to reduce token usage (#985)
Changes the `to_prompt_json()` helper to default to minified JSON (no indentation) instead of 2-space indentation. This reduces token consumption in LLM prompts while preserving all of the information.

- Changed default `indent` parameter from `2` to `None` in `prompt_helpers.py`
- Updated all prompt modules to remove explicit `indent=2` arguments
- Minor code formatting fixes in LLM clients
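
The effect of the new default can be sketched with the standard library alone. The function below is a hypothetical stand-in for `to_prompt_json()` that assumes it wraps `json.dumps`; the real helper in `prompt_helpers.py` may pass other options. Character count is used only as a rough proxy for token count.

```python
import json


def to_prompt_json(data, indent=None):
    # Hypothetical stand-in: indent=None keeps the output on a single line.
    return json.dumps(data, indent=indent)


sample = {"name": "Alice", "facts": ["likes tea", "lives in Paris"]}

minified = to_prompt_json(sample)  # new default: no indentation
pretty = to_prompt_json(sample, indent=2)  # previous default: 2-space indentation

# Both forms decode to the same data; only the whitespace differs.
same_content = json.loads(minified) == json.loads(pretty)
saved_chars = len(pretty) - len(minified)
```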

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-06 16:08:43 -07:00
Daniel Chalef
6ad695186a
Add OpenTelemetry distributed tracing support (#982)
* Add OpenTelemetry distributed tracing support

- Add tracer abstraction with no-op and OpenTelemetry implementations
- Instrument add_episode and add_episode_bulk with tracing spans
- Instrument LLM client with cache-aware tracing
- Add configurable span name prefix support
- Refactor add_episode methods to improve code quality
- Add OTEL_TRACING.md documentation
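
A tracer abstraction with a no-op implementation and a configurable span name prefix might look roughly like the sketch below. All class and function names here (`Span`, `RecordingSpan`, `Tracer`, `NoOpTracer`, `InMemoryTracer`, `process_episode`) are hypothetical, and an in-memory tracer stands in for the OpenTelemetry-backed one so the example runs without the `opentelemetry` package.

```python
from abc import ABC, abstractmethod
from contextlib import AbstractContextManager, contextmanager
from typing import Any


class Span:
    """No-op span: accepts attributes and discards them."""

    def set_attribute(self, key: str, value: Any) -> None:
        pass


class RecordingSpan(Span):
    """Span that keeps its attributes in memory for inspection."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.attributes: dict = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


class Tracer(ABC):
    @abstractmethod
    def start_span(self, name: str) -> AbstractContextManager[Span]:
        """Return a context manager that yields a span."""


class NoOpTracer(Tracer):
    @contextmanager
    def start_span(self, name: str):
        yield Span()  # nothing is recorded


class InMemoryTracer(Tracer):
    def __init__(self, span_prefix: str = "graphiti") -> None:
        self.span_prefix = span_prefix  # configurable span name prefix
        self.finished: list = []

    @contextmanager
    def start_span(self, name: str):
        span = RecordingSpan(f"{self.span_prefix}.{name}")
        try:
            yield span
        finally:
            self.finished.append(span)  # stored once the span closes


def process_episode(tracer: Tracer, episode_name: str) -> str:
    # Call sites look identical whichever tracer is injected.
    with tracer.start_span("add_episode") as span:
        span.set_attribute("episode.name", episode_name)
        return episode_name.upper()
```

Because the instrumented code depends only on the `Tracer` interface, passing `NoOpTracer()` leaves behavior unchanged when tracing is not configured.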

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix linting errors in tracing implementation

- Remove unused episodes_by_uuid variable
- Fix tracer type annotations for context manager support
- Replace isinstance tuple with union syntax
- Use contextlib.suppress for exception handling
- Fix import ordering and use AbstractContextManager

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Address PR review feedback on tracing implementation

Critical fixes:
- Remove flawed error span creation in graphiti.py that created orphaned spans
- Restructure LLM client tracing to create span once at start, eliminating code duplication
- Initialize LLM client tracer to NoOpTracer by default to fix type checking

Enhancements:
- Add comprehensive span attributes to add_episode: reference_time, entity/edge type counts, previous episodes count, invalidated edge count, community count
- Optimize isinstance check for better performance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add prompt name tracking to OpenTelemetry tracing spans

Add prompt_name parameter to all LLM client generate_response() methods
and set it as a span attribute in the llm.generate span. This enables
better observability by identifying which prompt template was used for
each LLM call.

Changes:
- Add prompt_name parameter to LLMClient.generate_response() base method
- Add prompt_name parameter and tracing to OpenAIBaseClient,
  AnthropicClient, GeminiClient, and OpenAIGenericClient
- Update all 14 LLM call sites across maintenance operations to include
  prompt_name:
  - edge_operations.py: 4 calls
  - node_operations.py: 6 calls (note: 7 listed but only 6 unique)
  - temporal_operations.py: 2 calls
  - community_operations.py: 2 calls

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix exception handling in add_episode to record errors in OpenTelemetry span

Moved try-except block inside the OpenTelemetry span context and added
proper error recording with span.set_status() and span.record_exception().
This ensures exceptions are captured in the distributed trace, matching
the pattern used in add_episode_bulk.
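
The pattern described above, with the try-except placed inside the span context so the failure is attached to the span before it closes, can be sketched as follows. `FakeSpan`, `start_span`, and `run_in_span` are hypothetical stand-ins so the example runs without OpenTelemetry installed; only the method names `set_status()` and `record_exception()` come from the commit message, and their signatures here are simplified assumptions.

```python
from contextlib import contextmanager


class FakeSpan:
    """Hypothetical stand-in for a tracing span."""

    def __init__(self) -> None:
        self.status = "unset"
        self.description = ""
        self.exceptions = []

    def set_status(self, status: str, description: str = "") -> None:
        self.status = status
        self.description = description

    def record_exception(self, exc: BaseException) -> None:
        self.exceptions.append(exc)


@contextmanager
def start_span(span: FakeSpan):
    # A real tracer would create the span here and end it on exit.
    yield span


def run_in_span(span: FakeSpan, work):
    with start_span(span) as active:
        try:
            return work()
        except Exception as exc:
            # The span is still open here, so the error lands in the trace.
            active.set_status("error", str(exc))
            active.record_exception(exc)
            raise  # re-raise so callers still see the failure
```

If the try-except wrapped the `with` block instead, the span would already be closed by the time the handler ran, and the exception could not be recorded on it.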

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-05 12:26:14 -07:00
Daniel Chalef
513cfbf7b2
Refactor imports (#675)
* Refactor imports

* Fix: Remove duplicate sentence-transformers dependency from dev requirements

* Refactor: Update optional import patterns across various modules for better type checking and error handling

* Update CONTRIBUTING.md

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
2025-07-05 08:57:07 -07:00
Evan Schultz
5baaa6fa8c
Anthropic cleanup (#431)
* remove temporary debug logging

* add anthropic api to .env.example

* move anthropic int tests to llm_client dir to better match existing test structure

* update `TestLLMClient` to `MockLLMClient` to eliminate pytest warning
2025-05-03 09:15:03 -04:00
Preston Rasmussen
2ffc58b3da
small model fix (#432)
* updated dedupe nodes operations

* updates

* Update examples/podcast/podcast_transcript.txt

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* mypy

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
2025-05-02 10:08:25 -04:00
Soichi Sumi
17c177e91a
Use self.max_tokens when max_tokens isn't specified (#382)
* Fix: use self.max_tokens when max_tokens isn't specified

* Fix: use self.max_tokens in OpenAI clients

* Fix: use self.max_tokens in Anthropic client

* Fix: use self.max_tokens in Gemini client
2025-04-21 11:38:09 -04:00
Evan Schultz
113179f674
Anthropic client (#361)
* update Anthropic client to use tool calling and add tests

* fix linting errors before creating pull request by making literal types for anthropic models
2025-04-16 12:35:07 -07:00
Daniel Chalef
4307274967
Add MCP Server (#301)
* experimental

* experimental

* experimental

* wip

* wip

* wip

* wip

* code cleanup

* refactor and cleanup

* fix lint

* remove unneeded mcp dep

* polish
2025-03-24 17:08:19 -07:00
Preston Rasmussen
0f50b74735
Set max tokens by prompt (#255)
* set max tokens

* update generic openai client

* mypy updates

* fix: dockerfile

---------

Co-authored-by: paulpaliychuk <pavlo.paliychuk.ca@gmail.com>
2025-01-24 10:14:49 -05:00
Daniel Chalef
567a8ab74a
Implement OpenAI Structured Output (#225)
* implement so

* bug fixes and typing

* inject schema for non-openai clients

* correct datetime format

* remove List keyword

* Refactor node_operations.py to use updated prompt_library functions

* update example
2024-12-05 07:03:18 -08:00
Pavlo Paliychuk
a7148d6260
feat: Dedicated embedder interface (#159)
* feat: Add Embedder interface and implement openai embedder

* feat: Add voyage ai embedder
2024-09-27 12:47:04 -04:00
Daniel Chalef
14d5ce0b36
Override default max tokens for Anthropic and Groq clients (#143)
* Override default max tokens for Anthropic and Groq clients
2024-09-22 11:33:54 -07:00
Daniel Chalef
6851b1063a
Fix llm client retry (#102)
* Fix llm client retry

* feat: Improve llm client retry error message
2024-09-10 08:15:27 -07:00
Daniel Chalef
895afc7be1
implement diskcache (#39)
* chore: Add romeo runner

* fix: Linter

* wip

* wip dump

* chore: Update romeo parser

* chore: Anthropic model fix

* wip

* allbirds

* allbirds runner

* format

* wip

* wip

* mypy updates

* update

* remove r

* update tests

* format

* wip

* chore: Strategically update the message

* rebase and fix import issues

* Update package imports for graphiti_core in examples and utils

* nits

* chore: Update OpenAI GPT-4o model to gpt-4o-2024-08-06

* implement groq

* improvements & linting

* cleanup and nits

* Refactor package imports for graphiti_core in examples and utils

* Refactor package imports for graphiti_core in examples and utils

* implement diskcache

* remove debug stuff

* log cache hit when debugging only

* Improve LLM config. Fix bugs (#41)

Refactor LLMConfig class to allow None values for model and base_url

* chore: Resolve mc

---------

Co-authored-by: paulpaliychuk <pavlo.paliychuk.ca@gmail.com>
Co-authored-by: prestonrasmussen <prasmuss15@gmail.com>
2024-08-26 13:13:05 -04:00
Pavlo Paliychuk
0ed7739bc0
Controlled example (#37)
* chore: Add romeo runner

* fix: Linter

* dedupe fixes

* wip

* wip dump

* allbirds

* chore: Update romeo parser

* chore: Anthropic model fix

* allbirds runner

* format

* wip

* mypy updates

* update

* remove r

* update tests

* format

* wip

* wip

* wip

* chore: Strategically update the message

* chore: Add romeo runner

* fix: Linter

* wip

* wip dump

* chore: Update romeo parser

* chore: Anthropic model fix

* wip

* allbirds

* allbirds runner

* format

* wip

* wip

* mypy updates

* update

* remove r

* update tests

* format

* wip

* chore: Strategically update the message

* rebase and fix import issues

* Update package imports for graphiti_core in examples and utils

* nits

* chore: Update OpenAI GPT-4o model to gpt-4o-2024-08-06

* implement groq

* improvements & linting

* cleanup and nits

* Refactor package imports for graphiti_core in examples and utils

* Refactor package imports for graphiti_core in examples and utils

* chore: Nuke unused examples

* chore: Nuke unused examples

* chore: Only run type check on graphiti_core

* fix unit tests

* reformat

* unit test

* fix: Unit tests

* test: Add coverage for extract_date_strings_from_edge

* lint

* remove commented code

---------

Co-authored-by: prestonrasmussen <prasmuss15@gmail.com>
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
2024-08-26 10:30:22 -04:00