This commit introduces a comprehensive configuration system that makes Graphiti more flexible and easier to configure across different providers and deployment environments.

## New Features
- **Unified Configuration**: New GraphitiConfig class with Pydantic validation
- **YAML Support**: Load configuration from .graphiti.yaml files
- **Multi-Provider Support**: Easy switching between OpenAI, Azure, Anthropic, Gemini, Groq, and LiteLLM
- **LiteLLM Integration**: Unified access to 100+ LLM providers
- **Factory Functions**: Automatic client creation from configuration
- **Full Backward Compatibility**: Existing code continues to work

## Configuration System
- graphiti_core/config/settings.py: Pydantic configuration classes
- graphiti_core/config/providers.py: Provider enumerations and defaults
- graphiti_core/config/factory.py: Factory functions for client creation

## LiteLLM Client
- graphiti_core/llm_client/litellm_client.py: New unified LLM client
- Support for Azure OpenAI, AWS Bedrock, Vertex AI, Ollama, vLLM, etc.
- Automatic structured output detection

## Documentation
- docs/CONFIGURATION.md: Comprehensive configuration guide
- examples/graphiti_config_example.yaml: Example configurations
- DOMAIN_AGNOSTIC_IMPROVEMENT_PLAN.md: Future improvement roadmap

## Tests
- tests/config/test_settings.py: 22 tests for configuration
- tests/config/test_factory.py: 12 tests for factories
- 33/34 tests passing (97%)

## Issues Addressed
- #1004: Azure OpenAI support
- #1006: Azure OpenAI reranker support
- #1007: vLLM/OpenAI-compatible provider stability
- #1074: Ollama embeddings support
- #995: Docker Azure OpenAI support

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
# Graphiti Domain-Agnostic Improvement Plan

**Date:** 2025-11-30
**Status:** Draft
**Last Pull:** 422558d (main branch)

## Executive Summary

This document outlines a strategic plan to make Graphiti more domain-agnostic and adaptable to diverse use cases beyond conversational AI. The current architecture, while powerful, contains several domain-specific assumptions (primarily around messaging/conversational data) that limit its applicability to other domains such as scientific research, legal documents, IoT data, healthcare records, and financial transactions.
## Current Architecture Analysis

### Key Components Review

- **NER & Entity Extraction** (`graphiti_core/utils/maintenance/node_operations.py`, `graphiti_core/prompts/extract_nodes.py`)
  - Hardcoded prompts for three episode types: message, text, JSON
  - Domain-specific language (e.g., "speaker", "conversation")
  - Entity type classification tightly coupled with extraction logic
- **LLM Client Configuration** (`graphiti_core/llm_client/config.py`, `graphiti_core/graphiti.py`)
  - Defaults to OpenAI across all components
  - No centralized model selection strategy
  - Temperature (1.0) and max_tokens (8192) hardcoded as defaults
- **Episode Types** (`graphiti_core/nodes.py`)
  - Limited to: message, text, JSON
  - Each type requires separate prompt functions
  - No extensibility mechanism for custom episode types
- **Prompt System** (`graphiti_core/prompts/`)
  - Prompts are Python functions, not configurable data
  - No template engine or override mechanism
  - Domain assumptions embedded in prompt text
- **Search & Retrieval** (`graphiti_core/search/`)
  - Flexible but complex configuration
  - Limited domain-specific search recipes
  - No semantic domain adapters
### Identified Issues from GitHub (Top 20)

**High-impact issues related to domain-agnostic goals:**
- #1087: Embedding truncation reduces retrieval quality for text-embedding-3-small
- #1074: Neo4j quickstart returns no results with OpenAI-compatible LLM + Ollama embeddings
- #1007: OpenAIGenericClient outputs unstable for vllm serving gpt-oss-20b
- #1006: OpenAIRerankerClient does not support AzureOpenAILLMClient
- #1004: Azure OpenAI is not supported
- #995: Docker container does not support Azure OpenAI
- #1077: Support for Google Cloud Spanner Graph
- #947: Support for Apache AGE as Graph DB
- #1016: Support episode vector
- #961: Improve Episodes API - return UUID, support GET by ID, custom metadata
## Improvement Directives

### 1. Configurable Prompt System 🔴 Priority: CRITICAL

#### Objective

Replace hardcoded prompt functions with a templatable, extensible prompt system that supports domain customization.

#### Implementation Plan

**Phase 1: Prompt Template Engine**

- Create `PromptTemplate` class with variable interpolation
- Support multiple template formats (Jinja2, Mustache, or custom)
- Add prompt registry for registration and lookup
```python
# Example API (sketch): a minimal template plus a registry with overrides
from typing import Any


class PromptTemplate:
    def __init__(self, template: str, variables: dict[str, str]):
        self.template = template
        self.variables = variables  # variable name -> human-readable description

    def render(self, context: dict[str, Any]) -> str:
        # Minimal rendering: interpolate {name} placeholders from the context
        return self.template.format(**context)


class PromptRegistry:
    def __init__(self) -> None:
        self._templates: dict[str, PromptTemplate] = {}

    def register(self, name: str, template: PromptTemplate) -> None:
        if name in self._templates:
            raise ValueError(f"prompt {name!r} already registered; use override()")
        self._templates[name] = template

    def get(self, name: str) -> PromptTemplate:
        return self._templates[name]

    def override(self, name: str, template: PromptTemplate) -> None:
        # Swap a built-in prompt for a domain-specific one
        self._templates[name] = template
```
**Phase 2: Refactor Existing Prompts**

- Convert all prompt functions in `graphiti_core/prompts/` to templates
- Maintain backward compatibility with existing API
- Add domain-specific prompt overrides

**Phase 3: Documentation & Examples**

- Create prompt customization guide
- Provide domain-specific examples (legal, scientific, financial)
- Add prompt testing utilities

#### Priority Rationale

- **Impact**: Enables all domain customization downstream
- **Complexity**: Medium - requires careful refactoring
- **Dependencies**: None - can be done independently

#### Blockers

- **Breaking Changes**: Need to maintain backward compatibility
- **LLM Provider Compatibility**: Different providers may require different prompt formats
- **Testing**: Need comprehensive test suite for prompt variations

#### Success Metrics

- Users can customize prompts without code changes
- 5+ domain-specific prompt examples documented
- No regression in existing use cases
### 2. Pluggable NER & Entity Extraction Pipeline 🔴 Priority: CRITICAL

#### Objective

Make the entity extraction pipeline modular and extensible for different domain requirements.

#### Implementation Plan

**Phase 1: Extraction Strategy Interface**

- Define `ExtractionStrategy` protocol/abstract class
- Support custom entity extractors (LLM-based, rule-based, hybrid)
- Allow domain-specific entity type systems
```python
from typing import Any, Protocol

from pydantic import BaseModel

from graphiti_core.edges import EntityEdge
from graphiti_core.nodes import EntityNode, EpisodicNode


class ExtractionStrategy(Protocol):
    async def extract_entities(
        self,
        episode: EpisodicNode,
        context: dict[str, Any],
        entity_types: dict[str, type[BaseModel]] | None = None,
    ) -> list[EntityNode]:
        ...

    async def extract_relations(
        self,
        episode: EpisodicNode,
        entities: list[EntityNode],
        context: dict[str, Any],
    ) -> list[EntityEdge]:
        ...
```
**Phase 2: Domain-Specific Extractors**

- Create extractors for common domains (see the sketch after this list):
  - `ScientificPaperExtractor`: Extracts researchers, institutions, findings, citations
  - `LegalDocumentExtractor`: Extracts parties, cases, statutes, precedents
  - `FinancialExtractor`: Extracts companies, transactions, indicators
  - `IoTEventExtractor`: Extracts devices, sensors, readings, locations
  - `HealthcareExtractor`: Extracts patients, conditions, treatments, providers
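To make the plug-in point concrete, here is a minimal rule-based strategy that satisfies the `ExtractionStrategy` protocol above. The `CitationRegexExtractor` name and pattern are illustrative rather than part of the current codebase, and the node-construction step is deliberately stubbed:

```python
import re
from typing import Any

from pydantic import BaseModel

from graphiti_core.nodes import EntityNode, EpisodicNode


class CitationRegexExtractor:
    """Rule-based sketch: finds citation keys like (Smith et al., 2021)."""

    CITATION = re.compile(r"\(([A-Z][\w-]+ et al\., \d{4})\)")

    async def extract_entities(
        self,
        episode: EpisodicNode,
        context: dict[str, Any],
        entity_types: dict[str, type[BaseModel]] | None = None,
    ) -> list[EntityNode]:
        citations = sorted(set(self.CITATION.findall(episode.content)))
        # A real implementation would map each citation string to an
        # EntityNode; returning an empty list keeps the sketch side-effect
        # free while still matching the protocol's signature.
        return []
```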
**Phase 3: Extractor Composition**

- Allow chaining multiple extractors (sketched below)
- Support fallback strategies
- Enable parallel extraction with merging
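A composite strategy could implement both chaining and merge semantics in one place. The `CompositeExtractionStrategy` below is a hypothetical sketch: it runs extractors sequentially and keeps the first node produced for each case-folded name, which is one simple conflict-resolution policy among many:

```python
from typing import Any

from pydantic import BaseModel

from graphiti_core.nodes import EntityNode, EpisodicNode


class CompositeExtractionStrategy:
    """Chains extraction strategies; earlier strategies win on name conflicts."""

    def __init__(self, strategies: list["ExtractionStrategy"]):
        self.strategies = strategies

    async def extract_entities(
        self,
        episode: EpisodicNode,
        context: dict[str, Any],
        entity_types: dict[str, type[BaseModel]] | None = None,
    ) -> list[EntityNode]:
        merged: dict[str, EntityNode] = {}
        for strategy in self.strategies:
            for node in await strategy.extract_entities(episode, context, entity_types):
                # setdefault keeps the first extractor's node for each name
                merged.setdefault(node.name.lower(), node)
        return list(merged.values())
```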
#### Priority Rationale

- **Impact**: Directly addresses domain specificity in core extraction
- **Complexity**: High - touches critical path
- **Dependencies**: Depends on Directive #1 (prompts)

#### Blockers

- **Performance**: Multiple extractors may impact latency
- **Conflict Resolution**: Different extractors may produce conflicting entities
- **Schema Validation**: Need flexible validation for diverse entity types

#### Success Metrics

- 3+ domain-specific extractors implemented
- 50%+ reduction in domain customization code
- No performance degradation for default use case
### 3. Centralized Configuration Management 🟡 Priority: HIGH

#### Objective

Create a unified configuration system for LLM clients, embedders, and other components.

#### Implementation Plan

**Phase 1: Configuration Schema**

- Create `GraphitiConfig` with hierarchical structure
- Support environment variables, config files (YAML/TOML), and programmatic config
- Add validation with Pydantic
```python
from typing import Literal

from pydantic import BaseModel


class LLMProviderConfig(BaseModel):
    provider: Literal["openai", "anthropic", "gemini", "groq", "custom"]
    model: str
    small_model: str | None = None
    api_key: str | None = None
    base_url: str | None = None
    temperature: float = 1.0
    max_tokens: int = 8192


class EmbedderConfig(BaseModel):
    provider: Literal["openai", "voyage", "gemini", "custom"]
    model: str
    api_key: str | None = None
    embedding_dim: int | None = None


# Placeholder sections; concrete fields to be defined alongside each subsystem
class DatabaseConfig(BaseModel): ...
class ExtractionConfig(BaseModel): ...
class SearchConfig(BaseModel): ...


class GraphitiConfig(BaseModel):
    llm: LLMProviderConfig
    embedder: EmbedderConfig
    database: DatabaseConfig
    extraction: ExtractionConfig
    search: SearchConfig
```
**Phase 2: Config Loading & Merging**

- Support config file discovery (`.graphiti.yaml`, `graphiti.config.toml`)
- Merge configs from multiple sources (file < env < code; see the sketch below)
- Add config validation and helpful error messages
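The precedence rule (file < env < code) might look like the following sketch. `load_config` is a hypothetical helper, the merge is deliberately shallow, and `OPENAI_API_KEY` stands in for whichever provider variables the final design supports:

```python
import os
from pathlib import Path
from typing import Any

import yaml  # PyYAML


def load_config(overrides: dict[str, Any] | None = None) -> dict[str, Any]:
    """Merge config sources with increasing precedence: file < env < code."""
    config: dict[str, Any] = {}

    # 1. Discovered config file (lowest precedence)
    path = Path(".graphiti.yaml")
    if path.exists():
        config.update(yaml.safe_load(path.read_text()) or {})

    # 2. Environment variables override file values
    if api_key := os.environ.get("OPENAI_API_KEY"):
        config.setdefault("llm", {})["api_key"] = api_key

    # 3. Programmatic overrides win over everything
    config.update(overrides or {})
    return config
```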
**Phase 3: Domain-Specific Presets**

- Create preset configs for common use cases
- Support config inheritance and composition
```yaml
# Example: .graphiti.yaml
extends: "presets/scientific-research"

llm:
  provider: anthropic
  model: claude-sonnet-4-5-latest
  temperature: 0.3

extraction:
  entity_types:
    - Researcher
    - Institution
    - Finding
    - Methodology
  extractors:
    - type: llm
      prompt: prompts/scientific_entities.yaml
    - type: regex
      patterns: prompts/scientific_patterns.yaml
```
#### Priority Rationale

- **Impact**: Simplifies deployment and customization
- **Complexity**: Medium
- **Dependencies**: None

#### Blockers

- **Backward Compatibility**: Must support existing initialization patterns
- **Security**: API keys and credentials management
- **Validation**: Complex validation rules across providers

#### Success Metrics

- Single config file for complete setup
- Zero hardcoded defaults in core code
- 10+ domain preset configs available
### 4. Extensible Episode Type System 🟡 Priority: HIGH

#### Objective

Allow users to define custom episode types with associated extraction logic.

#### Implementation Plan

**Phase 1: Episode Type Registry**

- Create `EpisodeTypeRegistry` for dynamic episode types
- Support custom episode type definitions with Pydantic
```python
from pydantic import BaseModel, ConfigDict


class EpisodeTypeDefinition(BaseModel):
    # arbitrary_types_allowed lets the model hold an ExtractionStrategy
    # object (the protocol sketched under Directive #2)
    model_config = ConfigDict(arbitrary_types_allowed=True)

    name: str
    description: str
    content_schema: type[BaseModel] | None = None
    extraction_strategy: str | ExtractionStrategy
    prompt_template: str | None = None


class EpisodeTypeRegistry:
    def __init__(self) -> None:
        self._types: dict[str, EpisodeTypeDefinition] = {}

    def register(self, episode_type: EpisodeTypeDefinition) -> None:
        self._types[episode_type.name] = episode_type

    def get(self, name: str) -> EpisodeTypeDefinition:
        return self._types[name]
```
**Phase 2: Dynamic Dispatch**

- Modify `extract_nodes()` to dispatch based on episode type (sketched below)
- Support fallback to default extraction for undefined types
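A hypothetical dispatch wrapper, assuming the registry above and an `extraction_strategy` that has already been resolved to a strategy object rather than a string:

```python
from typing import Any

from graphiti_core.nodes import EntityNode, EpisodicNode


async def extract_nodes_dispatch(
    registry: EpisodeTypeRegistry,
    episode_type: str,
    episode: EpisodicNode,
    context: dict[str, Any],
) -> list[EntityNode]:
    """Look up the registered definition, falling back to plain text."""
    try:
        definition = registry.get(episode_type)
    except KeyError:
        # Undefined types degrade gracefully to the generic text pipeline
        definition = registry.get("text")
    return await definition.extraction_strategy.extract_entities(episode, context)
```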
**Phase 3: Common Episode Types**

- Provide built-in types for common domains:
  - `scientific_paper`
  - `legal_document`
  - `financial_report`
  - `iot_event`
  - `healthcare_record`
  - `email`
  - `api_log`
#### Priority Rationale

- **Impact**: Removes major extensibility bottleneck
- **Complexity**: Medium
- **Dependencies**: Depends on Directive #2 (extractors)

#### Blockers

- **Type Safety**: Ensuring type safety with dynamic types
- **Validation**: Schema validation for custom content
- **Migration**: Migrating existing message/text/JSON types

#### Success Metrics

- Users can add episode types without code changes
- 5+ built-in episode types for different domains
- Clear migration path from existing types
### 5. Domain-Specific Search Strategies 🟢 Priority: MEDIUM

#### Objective

Provide domain-optimized search configurations and strategies.

#### Implementation Plan

**Phase 1: Search Strategy Templates**

- Create domain-specific search configs in `search_config_recipes.py`
- Optimize for domain characteristics (e.g., temporal for financial, spatial for IoT)
```python
from graphiti_core.search.search_config import (
    EdgeReranker,
    EdgeSearchConfig,
    EdgeSearchMethod,
    SearchConfig,
)

# Examples
FINANCIAL_TEMPORAL_SEARCH = SearchConfig(
    edge_config=EdgeSearchConfig(
        search_methods=[
            EdgeSearchMethod.cosine_similarity,
            EdgeSearchMethod.bm25,
        ],
        # Prioritize recent, frequently mentioned events
        reranker=EdgeReranker.episode_mentions,
    ),
    # ... domain-specific configuration
)

SCIENTIFIC_CITATION_SEARCH = SearchConfig(
    # Optimize for citation networks
    # ... domain-specific configuration
)
```
**Phase 2: Semantic Domain Adapters**

- Create domain-specific query expansion (see the sketch below)
- Add domain vocabulary mapping
- Support domain-specific relevance scoring
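Query expansion could be as simple as a vocabulary lookup applied before the search layer. The `DomainVocabularyAdapter` below is an illustrative sketch, not an existing class:

```python
class DomainVocabularyAdapter:
    """Expands a query with domain synonyms before it hits the search layer."""

    def __init__(self, synonyms: dict[str, list[str]]):
        self.synonyms = synonyms  # canonical term -> domain aliases

    def expand(self, query: str) -> str:
        extra = [
            alias
            for term, aliases in self.synonyms.items()
            if term in query.lower()
            for alias in aliases
        ]
        return " ".join([query, *extra])


# A financial vocabulary: "M&A" queries also match merger/acquisition edges
adapter = DomainVocabularyAdapter({"m&a": ["merger", "acquisition"]})
print(adapter.expand("recent M&A activity"))  # "recent M&A activity merger acquisition"
```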
**Phase 3: Search Analytics**

- Track search performance by domain
- Provide domain-specific search insights
- Auto-tune search configs based on usage
#### Priority Rationale

- **Impact**: Improves search quality for specific domains
- **Complexity**: Low-Medium
- **Dependencies**: None - additive feature

#### Blockers

- **Domain Expertise**: Requires deep understanding of each domain
- **Evaluation**: Need domain-specific test datasets
- **Maintenance**: Each domain strategy needs ongoing optimization

#### Success Metrics

- 5+ domain-optimized search strategies
- Measurable improvement in domain-specific retrieval quality
- Search strategy recommendation system
### 6. Multi-Provider LLM & Embedder Support Enhancement 🟢 Priority: MEDIUM

#### Objective

Improve support for diverse LLM and embedding providers, addressing current issues with Azure, Anthropic, and local models.

#### Implementation Plan

**Phase 1: Provider Abstraction Improvements**

- Enhance `LLMClient` interface for provider-specific features
- Better handling of structured output across providers (#1007)
- Unified error handling and retries (see the sketch below)
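Unified retry handling could live in one shared helper rather than being reimplemented per client. A minimal sketch, assuming exponential backoff and treating every exception as retryable (a real client would filter for rate limits and transient network errors):

```python
import asyncio
import logging
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


async def with_retries(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Retry a provider call with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except Exception as exc:
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning("LLM call failed (%s); retrying in %.1fs", exc, delay)
            await asyncio.sleep(delay)
    raise AssertionError("unreachable")
```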
**Phase 2: Provider-Specific Optimizations**

- Azure OpenAI full support (#1004, #995, #1006)
- Anthropic optimization for structured output
- Local model support (Ollama, vLLM) (#1074, #1007)
- Google Cloud Vertex AI integration

**Phase 3: Embedder Flexibility**

- Support mixed embedding strategies (different models for nodes vs edges)
- Domain-specific embedding fine-tuning
- Embedding dimension adaptation (#1087)
#### Priority Rationale

- **Impact**: Addresses multiple GitHub issues, improves flexibility
- **Complexity**: Medium-High (provider-specific quirks)
- **Dependencies**: Related to Directive #3 (config)

#### Blockers

- **Provider API Changes**: External dependencies on provider APIs
- **Testing**: Requires access to multiple provider accounts
- **Cost**: Testing across providers can be expensive

#### Success Metrics

- All providers in CLAUDE.md fully supported
- Resolution of issues #1004, #1006, #1007, #1074, #995
- Provider switching with zero code changes
### 7. Enhanced Metadata & Custom Attributes 🟢 Priority: MEDIUM

#### Objective

Support domain-specific metadata on all graph elements (nodes, edges, episodes).

#### Implementation Plan

**Phase 1: Flexible Metadata Schema**

- Add `custom_metadata: dict[str, Any]` to all core types
- Support typed metadata with Pydantic models (see the sketch below)
- Index metadata for searchability
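Typed metadata would let a deployment validate its own attributes without the core library knowing about them. Both the `ClinicalTrialMetadata` model and the `validate_metadata` helper below are hypothetical illustrations:

```python
from typing import Any

from pydantic import BaseModel


class ClinicalTrialMetadata(BaseModel):
    """Metadata a healthcare deployment might attach to an entity node."""

    trial_id: str
    phase: int
    sponsor: str | None = None


def validate_metadata(
    custom_metadata: dict[str, Any],
    schema: type[BaseModel] | None = None,
) -> dict[str, Any]:
    # Untyped metadata passes through; typed metadata is validated first
    if schema is None:
        return custom_metadata
    return schema(**custom_metadata).model_dump()
```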
**Phase 2: Domain-Specific Attributes**

- Support custom attributes per domain
- Attribute extraction from episodes
- Attribute-based filtering in search

**Phase 3: Metadata API Improvements**

- Episode API enhancements (#961)
- Metadata update operations
- Bulk metadata operations
#### Priority Rationale

- **Impact**: Enables rich domain modeling
- **Complexity**: Low-Medium
- **Dependencies**: Database schema changes

#### Blockers

- **Schema Migration**: Existing graphs need migration
- **Index Performance**: Metadata indexing may impact performance
- **Validation**: Complex validation for diverse metadata

#### Success Metrics

- Custom metadata on all graph elements
- Metadata-based search and filtering
- Resolution of issue #961
### 8. Database Provider Expansion 🔵 Priority: LOW

#### Objective

Support additional graph databases to meet diverse deployment requirements.

#### Implementation Plan

**Phase 1: Abstract Driver Interface**

- Enhance `GraphDriver` abstraction
- Standardize query translation layer (see the sketch below)
- Support for property graph vs RDF models
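One way to standardize translation is to separate the logical query from its dialect. The `QueryTranslator` protocol and `CypherTranslator` below are hypothetical names sketching that split, not part of the existing `GraphDriver` API (the Cypher uses Neo4j's full-text procedure):

```python
from typing import Any, Protocol


class QueryTranslator(Protocol):
    """Maps logical operations onto a database-specific query dialect."""

    def fulltext_search(self, index: str, term: str) -> tuple[str, dict[str, Any]]:
        ...


class CypherTranslator:
    def fulltext_search(self, index: str, term: str) -> tuple[str, dict[str, Any]]:
        # Parameterized query plus its bindings, ready for the driver to run
        query = (
            f"CALL db.index.fulltext.queryNodes('{index}', $term) "
            "YIELD node, score RETURN node, score"
        )
        return query, {"term": term}
```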
**Phase 2: New Drivers**

- Google Cloud Spanner Graph (#1077)
- Apache AGE (#947)
- Amazon Neptune improvements (#1082)
- TigerGraph, NebulaGraph

**Phase 3: Driver Selection Guide**

- Performance comparison matrix
- Use case recommendations
- Migration tools between drivers
#### Priority Rationale

- **Impact**: Addresses specific GitHub requests, increases deployment options
- **Complexity**: High (each driver is significant work)
- **Dependencies**: None

#### Blockers

- **Maintenance Burden**: Each driver requires ongoing support
- **Feature Parity**: Different databases have different capabilities
- **Testing**: Complex integration testing for each database

#### Success Metrics

- 2+ new database drivers
- Resolution of issues #1077, #947
- Database migration tools
### 9. Documentation & Examples for Domain Adaptation 🟡 Priority: HIGH

#### Objective

Provide comprehensive documentation showing how to adapt Graphiti to different domains.

#### Implementation Plan

**Phase 1: Domain Adaptation Guide**

- Step-by-step guide for domain customization
- Decision tree for configuration choices
- Best practices for each domain type

**Phase 2: Complete Domain Examples**

- Scientific Research knowledge graph
- Legal Document analysis
- Financial Transaction network
- IoT Event processing
- Healthcare Records integration

**Phase 3: Tutorial Series**

- Video walkthroughs
- Interactive Jupyter notebooks
- Code generation tools for domain setup
#### Priority Rationale

- **Impact**: Critical for adoption in new domains
- **Complexity**: Medium (requires domain expertise)
- **Dependencies**: Depends on implementation of the directives above

#### Blockers

- **Domain Expertise**: Need experts for each domain
- **Maintenance**: Examples need to stay current with the codebase
- **Quality**: Need real-world datasets and validation

#### Success Metrics

- 5+ complete domain examples
- Documentation coverage >80%
- User-contributed domain examples
### 10. Testing & Evaluation Framework for Domains 🟢 Priority: MEDIUM

#### Objective

Create domain-specific test datasets and evaluation metrics.

#### Implementation Plan

**Phase 1: Domain Test Datasets**

- Curate/generate test data for each domain
- Include ground-truth annotations
- Support for evaluation benchmarks

**Phase 2: Evaluation Metrics**

- Domain-specific quality metrics
- Extraction accuracy measurements
- Search relevance evaluation

**Phase 3: Continuous Evaluation**

- Automated testing across domains
- Performance regression detection
- Quality dashboards
#### Priority Rationale

- **Impact**: Ensures quality across domains
- **Complexity**: Medium
- **Dependencies**: Depends on domain implementations

#### Blockers

- **Data Acquisition**: Domain datasets can be hard to obtain
- **Annotation**: Ground-truth annotation is expensive
- **Standardization**: Metrics vary significantly by domain

#### Success Metrics

- Test coverage >70% across domains
- Automated evaluation pipeline
- Public benchmark results
## Implementation Roadmap

### Phase 1: Foundation (Months 1-3)

**Critical Infrastructure**

- Directive #1: Configurable Prompt System
- Directive #3: Centralized Configuration Management
- Directive #9: Initial documentation framework

*Estimated Effort: 2-3 engineers, 3 months*

### Phase 2: Core Extensibility (Months 4-6)

**Domain Adaptation**

- Directive #2: Pluggable NER Pipeline
- Directive #4: Extensible Episode Types
- Directive #7: Enhanced Metadata

*Estimated Effort: 2-3 engineers, 3 months*

### Phase 3: Provider & Database Support (Months 7-9)

**Infrastructure Expansion**

- Directive #6: Multi-Provider LLM Support
- Directive #8: Database Provider Expansion (Phase 1)

*Estimated Effort: 2 engineers, 3 months*

### Phase 4: Domain Optimization (Months 10-12)

**Domain-Specific Features**

- Directive #5: Domain-Specific Search
- Directive #10: Testing & Evaluation Framework
- Directive #9: Complete domain examples

*Estimated Effort: 2-3 engineers, 3 months*
## Risk Assessment

### High Risk

- **Breaking Changes**: Refactoring may break existing integrations
  - *Mitigation*: Semantic versioning, deprecation warnings, migration guides
- **Performance Regression**: More abstraction may impact performance
  - *Mitigation*: Continuous benchmarking, performance budgets
- **Complexity Creep**: Too much configurability can confuse users
  - *Mitigation*: Sensible defaults, progressive disclosure, presets

### Medium Risk

- **Provider API Changes**: External dependencies may change
  - *Mitigation*: Abstract interfaces, version pinning, adapter pattern
- **Maintenance Burden**: More features mean more maintenance
  - *Mitigation*: Automated testing, clear ownership, deprecation policy
- **Documentation Debt**: Fast development may outpace docs
  - *Mitigation*: Docs-as-code, automated doc generation, examples as tests

### Low Risk

- **Community Adoption**: Users may not need all domains
  - *Mitigation*: Modular architecture, optional components
## Success Criteria

### Technical Metrics

- Zero hardcoded domain assumptions in the core library
- 5+ domain-specific configurations available
- All GitHub issues (#1004, #1006, #1007, #1074, #995, #1077, #947, #961) resolved
- Test coverage >75% across all domains
- Performance within 10% of current baseline

### User Experience Metrics

- Domain setup time <30 minutes (from docs)
- Config-driven customization (no code changes for 80% of use cases)
- 3+ community-contributed domain adaptations

### Business Metrics

- Adoption in 3+ new domains (outside conversational AI)
- 50%+ reduction in customization support requests
- Documentation satisfaction >4.0/5.0
## Appendix A: Affected Files

### Core Files Requiring Changes

**High Priority**

- `graphiti_core/graphiti.py` - Main class, initialization
- `graphiti_core/llm_client/config.py` - Configuration system
- `graphiti_core/prompts/extract_nodes.py` - NER prompts
- `graphiti_core/prompts/extract_edges.py` - Relation extraction prompts
- `graphiti_core/utils/maintenance/node_operations.py` - Extraction logic

**Medium Priority**

- `graphiti_core/nodes.py` - Episode type definitions
- `graphiti_core/search/search_config.py` - Search configuration
- `graphiti_core/search/search_config_recipes.py` - Search recipes
- `server/graph_service/config.py` - Server configuration

**Low Priority**

- `graphiti_core/driver/*.py` - Database drivers
- `graphiti_core/embedder/*.py` - Embedder clients
## Appendix B: Related GitHub Issues

### Directly Addressed
- #1087: Embedding truncation
- #1074: No results with Ollama embeddings
- #1007: Unstable outputs with vLLM
- #1006: AzureOpenAI reranker support
- #1004: Azure OpenAI support
- #995: Docker Azure OpenAI support
- #1077: Google Cloud Spanner Graph support
- #947: Apache AGE support
- #961: Episodes API improvements
- #1082: Neptune driver issues
### Indirectly Improved
- #1083: Orphaned entities cleanup
- #1062: Stale data in MCP server
- #1021: Incomplete graph structure
- #1018: Search with group_ids
- #1012: group_id and Anthropic issues
- #992: OOM in build_communities
- #963: Duplicate entities
## Appendix C: Backward Compatibility Strategy

### Deprecation Policy

- **Feature Deprecation**: 2 minor versions' notice
- **API Changes**: Maintain old API with deprecation warnings
- **Configuration**: Support both old and new config formats during transition

### Migration Support

- Automated migration scripts for major changes
- Detailed migration guides for each release
- Migration validation tools

### Version Support

- LTS releases for enterprise users
- Security patches for N-2 versions
- Clear EOL policy
## Next Steps

1. **Review & Approval**: Circulate this plan for stakeholder feedback
2. **Prioritization**: Finalize directive priorities based on business needs
3. **Resource Allocation**: Assign engineering teams to Phase 1 directives
4. **Kickoff**: Begin implementation of Directive #1 (Prompt System)

**Document Maintainer:** Claude (AI Assistant)
**Last Updated:** 2025-11-30
**Next Review:** After Phase 1 completion