# Graphiti Domain-Agnostic Improvement Plan

**Date**: 2025-11-30
**Status**: Draft
**Last Pull**: `422558d` (main branch)

---

## Executive Summary

This document outlines a strategic plan to make Graphiti more domain-agnostic and adaptable to diverse use cases beyond conversational AI. The current architecture, while powerful, contains several domain-specific assumptions (primarily around messaging/conversational data) that limit its applicability to other domains such as scientific research, legal documents, IoT data, healthcare records, and financial transactions.

---

## Current Architecture Analysis

### Key Components Review

1. **NER & Entity Extraction** (`graphiti_core/utils/maintenance/node_operations.py`, `graphiti_core/prompts/extract_nodes.py`)
   - Hardcoded prompts for three episode types: message, text, JSON
   - Domain-specific language (e.g., "speaker", "conversation")
   - Entity type classification tightly coupled with extraction logic

2. **LLM Client Configuration** (`graphiti_core/llm_client/config.py`, `graphiti_core/graphiti.py`)
   - Defaults to OpenAI across all components
   - No centralized model selection strategy
   - Temperature (1.0) and max_tokens (8192) hardcoded as defaults

3. **Episode Types** (`graphiti_core/nodes.py`)
   - Limited to: message, text, JSON
   - Each type requires separate prompt functions
   - No extensibility mechanism for custom episode types

4. **Prompt System** (`graphiti_core/prompts/`)
   - Prompts are Python functions, not configurable data
   - No template engine or override mechanism
   - Domain assumptions embedded in prompt text

5. **Search & Retrieval** (`graphiti_core/search/`)
   - Flexible but complex configuration
   - Limited domain-specific search recipes
   - No semantic domain adapters

---

## Identified Issues from GitHub (Top 20)

### High-Impact Issues Related to Domain-Agnostic Goals:

1. **#1087**: Embedding truncation reduces retrieval quality for text-embedding-3-small
2. **#1074**: Neo4j quickstart returns no results with OpenAI-compatible LLM + Ollama embeddings
3. **#1007**: OpenAIGenericClient outputs unstable for vllm serving gpt-oss-20b
4. **#1006**: OpenAIRerankerClient does not support AzureOpenAILLMClient
5. **#1004**: Azure OpenAI is not supported
6. **#995**: Docker container does not support Azure OpenAI
7. **#1077**: Support for Google Cloud Spanner Graph
8. **#947**: Support for Apache AGE as Graph DB
9. **#1016**: Support episode vector
10. **#961**: Improve Episodes API - return UUID, support GET by ID, custom metadata

---

## Improvement Directives

### 1. **Configurable Prompt System**

🔴 **Priority: CRITICAL**

#### Objective

Replace hardcoded prompt functions with a templatable, extensible prompt system that supports domain customization.

#### Implementation Plan

**Phase 1: Prompt Template Engine**
- Create `PromptTemplate` class with variable interpolation
- Support multiple template formats (Jinja2, mustache, or custom)
- Add prompt registry for registration and lookup

```python
# Example API
from typing import Any


class PromptTemplate:
    def __init__(self, template: str, variables: dict[str, str]):
        self.template = template
        self.variables = variables

    def render(self, context: dict[str, Any]) -> str:
        # Placeholder rendering via str.format; a real engine (e.g. Jinja2) would plug in here
        return self.template.format(**context)


class PromptRegistry:
    def __init__(self) -> None:
        self._templates: dict[str, PromptTemplate] = {}

    def register(self, name: str, template: PromptTemplate) -> None:
        # Illustrative choice: refuse silent replacement; use override() for that
        if name in self._templates:
            raise ValueError(f'Prompt {name!r} is already registered')
        self._templates[name] = template

    def get(self, name: str) -> PromptTemplate:
        return self._templates[name]

    def override(self, name: str, template: PromptTemplate) -> None:
        self._templates[name] = template
```

**Phase 2: Refactor Existing Prompts**
- Convert all prompt functions in `graphiti_core/prompts/` to templates
- Maintain backward compatibility with existing API
- Add domain-specific prompt overrides

**Phase 3: Documentation & Examples**
- Create prompt customization guide
- Provide domain-specific examples (legal, scientific, financial)
- Add prompt testing utilities

#### Priority Rationale

- **Impact**: Enables all domain customization downstream
- **Complexity**: Medium - requires careful refactoring
- **Dependencies**: None - can be done independently

#### Blockers

- **Breaking Changes**: Need to maintain backward compatibility
- **LLM Provider Compatibility**: Different providers may require different prompt formats
- **Testing**: Need comprehensive test suite for prompt variations

#### Success Metrics

- Users can customize prompts without code changes
- 5+ domain-specific prompt examples documented
- No regression in existing use cases

---

### 2. **Pluggable NER & Entity Extraction Pipeline**

🔴 **Priority: CRITICAL**

#### Objective

Make the entity extraction pipeline modular and extensible for different domain requirements.

#### Implementation Plan

**Phase 1: Extraction Strategy Interface**
- Define `ExtractionStrategy` protocol/abstract class
- Support custom entity extractors (LLM-based, rule-based, hybrid)
- Allow domain-specific entity type systems

```python
from typing import Any, Protocol

from pydantic import BaseModel

from graphiti_core.edges import EntityEdge
from graphiti_core.nodes import EntityNode, EpisodicNode


class ExtractionStrategy(Protocol):
    async def extract_entities(
        self,
        episode: EpisodicNode,
        context: dict[str, Any],
        entity_types: dict[str, type[BaseModel]] | None = None,
    ) -> list[EntityNode]: ...

    async def extract_relations(
        self,
        episode: EpisodicNode,
        entities: list[EntityNode],
        context: dict[str, Any],
    ) -> list[EntityEdge]: ...
```

**Phase 2: Domain-Specific Extractors**
- Create extractors for common domains:
  - `ScientificPaperExtractor`: Extracts researchers, institutions, findings, citations
  - `LegalDocumentExtractor`: Extracts parties, cases, statutes, precedents
  - `FinancialExtractor`: Extracts companies, transactions, indicators
  - `IoTEventExtractor`: Extracts devices, sensors, readings, locations
  - `HealthcareExtractor`: Extracts patients, conditions, treatments, providers

**Phase 3: Extractor Composition**
- Allow chaining multiple extractors
- Support fallback strategies
- Enable parallel extraction with merging

#### Priority Rationale

- **Impact**: Directly addresses domain specificity in core extraction
- **Complexity**: High - touches critical path
- **Dependencies**: Depends on Directive #1 (prompts)

#### Blockers

- **Performance**: Multiple extractors may impact latency
- **Conflict Resolution**: Different extractors may produce conflicting entities
- **Schema Validation**: Need flexible validation for diverse entity types

#### Success Metrics

- 3+ domain-specific extractors implemented
- 50%+ reduction in domain customization code
- No performance degradation for default use case

---

### 3. **Centralized Configuration Management**

🟡 **Priority: HIGH**

#### Objective

Create a unified configuration system for LLM clients, embedders, and other components.
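
As a minimal sketch of what such a unified system could do, the snippet below merges configuration layers using the precedence file < env < code that Phase 2 of this directive proposes. The helpers `deep_merge`, `env_to_config`, and `resolve_config`, and the `GRAPHITI_LLM__MODEL`-style variable naming, are hypothetical and not part of Graphiti today.

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where `override` wins; nested dicts are merged recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def env_to_config(env: dict[str, str], prefix: str = 'GRAPHITI_') -> dict[str, Any]:
    """Map hypothetical variables such as GRAPHITI_LLM__MODEL to {'llm': {'model': ...}}.

    Values stay strings here; a real loader would leave type coercion to validation.
    """
    config: dict[str, Any] = {}
    for name, value in env.items():
        if not name.startswith(prefix):
            continue  # ignore unrelated environment variables
        path = name[len(prefix):].lower().split('__')
        node = config
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return config


def resolve_config(
    file_cfg: dict[str, Any], env: dict[str, str], code_cfg: dict[str, Any]
) -> dict[str, Any]:
    """Apply the layers in increasing precedence: file, then env, then code."""
    return deep_merge(deep_merge(file_cfg, env_to_config(env)), code_cfg)


resolved = resolve_config(
    file_cfg={'llm': {'provider': 'openai', 'model': 'model-a', 'temperature': 1.0}},
    env={'GRAPHITI_LLM__MODEL': 'model-b', 'PATH': '/usr/bin'},
    code_cfg={'llm': {'temperature': 0.3}},
)
```

In this example the file supplies the provider, the environment overrides the model, and the programmatic layer overrides the temperature, so each source only needs to state what it changes.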

#### Implementation Plan

**Phase 1: Configuration Schema**
- Create `GraphitiConfig` with hierarchical structure
- Support environment variables, config files (YAML/TOML), and programmatic config
- Add validation with Pydantic

```python
from typing import Literal

from pydantic import BaseModel


class LLMProviderConfig(BaseModel):
    provider: Literal["openai", "anthropic", "gemini", "groq", "custom"]
    model: str
    small_model: str | None = None
    api_key: str | None = None
    base_url: str | None = None
    temperature: float = 1.0
    max_tokens: int = 8192


class EmbedderConfig(BaseModel):
    provider: Literal["openai", "voyage", "gemini", "custom"]
    model: str
    api_key: str | None = None
    embedding_dim: int | None = None


# DatabaseConfig, ExtractionConfig, and SearchConfig are further
# BaseModel subclasses, omitted here for brevity
class GraphitiConfig(BaseModel):
    llm: LLMProviderConfig
    embedder: EmbedderConfig
    database: DatabaseConfig
    extraction: ExtractionConfig
    search: SearchConfig
```

**Phase 2: Config Loading & Merging**
- Support config file discovery (`.graphiti.yaml`, `graphiti.config.toml`)
- Merge configs from multiple sources (file < env < code)
- Add config validation and helpful error messages

**Phase 3: Domain-Specific Presets**
- Create preset configs for common use cases
- Support config inheritance and composition

```yaml
# Example: .graphiti.yaml
extends: "presets/scientific-research"

llm:
  provider: anthropic
  model: claude-sonnet-4-5-latest
  temperature: 0.3

extraction:
  entity_types:
    - Researcher
    - Institution
    - Finding
    - Methodology
  extractors:
    - type: llm
      prompt: prompts/scientific_entities.yaml
    - type: regex
      patterns: prompts/scientific_patterns.yaml
```

#### Priority Rationale

- **Impact**: Simplifies deployment and customization
- **Complexity**: Medium
- **Dependencies**: None

#### Blockers

- **Backward Compatibility**: Must support existing initialization patterns
- **Security**: API keys and credentials management
- **Validation**: Complex validation rules across providers

#### Success Metrics

- Single config file for complete setup
- Zero hardcoded defaults in core code
- 10+ domain preset configs available

---

### 4. **Extensible Episode Type System**

🟡 **Priority: HIGH**

#### Objective

Allow users to define custom episode types with associated extraction logic.

#### Implementation Plan

**Phase 1: Episode Type Registry**
- Create `EpisodeTypeRegistry` for dynamic episode types
- Support custom episode type definitions with Pydantic

```python
from pydantic import BaseModel


class EpisodeTypeDefinition(BaseModel):
    name: str
    description: str
    content_schema: type[BaseModel] | None = None
    # ExtractionStrategy is the protocol sketched under Directive #2
    extraction_strategy: str | ExtractionStrategy
    prompt_template: str | None = None


class EpisodeTypeRegistry:
    def __init__(self) -> None:
        self._types: dict[str, EpisodeTypeDefinition] = {}

    def register(self, episode_type: EpisodeTypeDefinition) -> None:
        self._types[episode_type.name] = episode_type

    def get(self, name: str) -> EpisodeTypeDefinition:
        return self._types[name]
```

**Phase 2: Dynamic Dispatch**
- Modify `extract_nodes()` to dispatch based on episode type
- Support fallback to default extraction for undefined types

**Phase 3: Common Episode Types**
- Provide built-in types for common domains:
  - `scientific_paper`
  - `legal_document`
  - `financial_report`
  - `iot_event`
  - `healthcare_record`
  - `email`
  - `api_log`

#### Priority Rationale

- **Impact**: Removes major extensibility bottleneck
- **Complexity**: Medium
- **Dependencies**: Depends on Directive #2 (extractors)

#### Blockers

- **Type Safety**: Ensuring type safety with dynamic types
- **Validation**: Schema validation for custom content
- **Migration**: Migrating existing message/text/JSON types

#### Success Metrics

- Users can add episode types without code changes
- 5+ built-in episode types for different domains
- Clear migration path from existing types

---

### 5. **Domain-Specific Search Strategies**

🟢 **Priority: MEDIUM**

#### Objective

Provide domain-optimized search configurations and strategies.
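
To make "domain-optimized" concrete, here is a hedged sketch of one possible temporal strategy for financial data: blending a similarity score with an exponential recency decay. The function names, the `half_life_days` parameter, and the 0.3 recency share are illustrative assumptions, not existing Graphiti search APIs.

```python
from datetime import datetime, timedelta, timezone


def recency_weight(event_time: datetime, now: datetime, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a brand-new event, 0.5 after one half-life."""
    age_days = max((now - event_time).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def temporal_score(
    similarity: float,
    event_time: datetime,
    now: datetime,
    half_life_days: float = 30.0,
    recency_share: float = 0.3,
) -> float:
    """Blend semantic similarity with recency; recency_share is the weight given to recency."""
    recency = recency_weight(event_time, now, half_life_days)
    return (1.0 - recency_share) * similarity + recency_share * recency


def rerank(candidates: list[tuple[str, float, datetime]], now: datetime) -> list[str]:
    """Sort (item, similarity, event_time) candidates best-first by blended score."""
    scored = [(item, temporal_score(sim, when, now)) for item, sim, when in candidates]
    return [item for item, _ in sorted(scored, key=lambda pair: pair[1], reverse=True)]


now = datetime(2025, 11, 30, tzinfo=timezone.utc)
ranking = rerank(
    [
        ('year-old filing', 0.80, now - timedelta(days=365)),
        ('yesterday trade', 0.75, now - timedelta(days=1)),
    ],
    now,
)
```

With these assumed weights the slightly less similar but much more recent item ranks first, which is the behaviour a temporal recipe for financial data would aim for.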

#### Implementation Plan

**Phase 1: Search Strategy Templates**
- Create domain-specific search configs in `search_config_recipes.py`
- Optimize for domain characteristics (e.g., temporal for financial, spatial for IoT)

```python
# Examples
from graphiti_core.search.search_config import (
    EdgeReranker,
    EdgeSearchConfig,
    EdgeSearchMethod,
    SearchConfig,
)

FINANCIAL_TEMPORAL_SEARCH = SearchConfig(
    edge_config=EdgeSearchConfig(
        search_methods=[
            EdgeSearchMethod.cosine_similarity,
            EdgeSearchMethod.bm25,
        ],
        reranker=EdgeReranker.episode_mentions,
    ),
    # Prioritize recent events
    # ... domain-specific configuration
)

SCIENTIFIC_CITATION_SEARCH = SearchConfig(
    # Optimize for citation networks
    # ... domain-specific configuration
)
```

**Phase 2: Semantic Domain Adapters**
- Create domain-specific query expansion
- Add domain vocabulary mapping
- Support domain-specific relevance scoring

**Phase 3: Search Analytics**
- Track search performance by domain
- Provide domain-specific search insights
- Auto-tune search configs based on usage

#### Priority Rationale

- **Impact**: Improves search quality for specific domains
- **Complexity**: Low-Medium
- **Dependencies**: None - additive feature

#### Blockers

- **Domain Expertise**: Requires deep understanding of each domain
- **Evaluation**: Need domain-specific test datasets
- **Maintenance**: Each domain strategy needs ongoing optimization

#### Success Metrics

- 5+ domain-optimized search strategies
- Measurable improvement in domain-specific retrieval quality
- Search strategy recommendation system

---

### 6. **Multi-Provider LLM & Embedder Support Enhancement**

🟢 **Priority: MEDIUM**

#### Objective

Improve support for diverse LLM and embedding providers, addressing current issues with Azure, Anthropic, and local models.

#### Implementation Plan

**Phase 1: Provider Abstraction Improvements**
- Enhance `LLMClient` interface for provider-specific features
- Better handling of structured output across providers (#1007)
- Unified error handling and retries

**Phase 2: Provider-Specific Optimizations**
- Azure OpenAI full support (#1004, #995, #1006)
- Anthropic optimization for structured output
- Local model support (Ollama, vLLM) (#1074, #1007)
- Google Cloud Vertex AI integration

**Phase 3: Embedder Flexibility**
- Support mixed embedding strategies (different models for nodes vs edges)
- Domain-specific embedding fine-tuning
- Embedding dimension adaptation (#1087)

#### Priority Rationale

- **Impact**: Addresses multiple GitHub issues, improves flexibility
- **Complexity**: Medium-High (provider-specific quirks)
- **Dependencies**: Related to Directive #3 (config)

#### Blockers

- **Provider API Changes**: External dependencies on provider APIs
- **Testing**: Requires access to multiple provider accounts
- **Cost**: Testing across providers can be expensive

#### Success Metrics

- All providers in `CLAUDE.md` fully supported
- Resolution of issues #1004, #1006, #1007, #1074, #995
- Provider switching with zero code changes

---

### 7. **Enhanced Metadata & Custom Attributes**

🟢 **Priority: MEDIUM**

#### Objective

Support domain-specific metadata on all graph elements (nodes, edges, episodes).
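
A minimal sketch of what domain-specific metadata with attribute-based filtering might look like, using plain dataclasses. The `custom_metadata` field name follows Phase 1 of this directive; `GraphElement`, `matches`, and `filter_elements` are hypothetical stand-ins for Graphiti's node, edge, and episode types and their search filters.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class GraphElement:
    """Stand-in for a node, edge, or episode carrying free-form domain metadata."""

    uuid: str
    name: str
    custom_metadata: dict[str, Any] = field(default_factory=dict)


def matches(element: GraphElement, filters: dict[str, Any]) -> bool:
    """True when every filter key is present in the metadata with an equal value."""
    return all(
        key in element.custom_metadata and element.custom_metadata[key] == expected
        for key, expected in filters.items()
    )


def filter_elements(elements: list[GraphElement], filters: dict[str, Any]) -> list[GraphElement]:
    """Keep only the elements whose metadata satisfies all filters."""
    return [element for element in elements if matches(element, filters)]


elements = [
    GraphElement('1', 'Contract A', {'jurisdiction': 'EU', 'doc_type': 'contract'}),
    GraphElement('2', 'Ruling B', {'jurisdiction': 'US', 'doc_type': 'ruling'}),
    GraphElement('3', 'Untagged note'),
]
eu_contracts = filter_elements(elements, {'jurisdiction': 'EU', 'doc_type': 'contract'})
```

Equality-only filters keep the sketch short; range or set-membership operators would be natural extensions once metadata is indexed.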

#### Implementation Plan

**Phase 1: Flexible Metadata Schema**
- Add `custom_metadata: dict[str, Any]` to all core types
- Support typed metadata with Pydantic models
- Index metadata for searchability

**Phase 2: Domain-Specific Attributes**
- Support custom attributes per domain
- Attribute extraction from episodes
- Attribute-based filtering in search

**Phase 3: Metadata API Improvements**
- Episode API enhancements (#961)
- Metadata update operations
- Bulk metadata operations

#### Priority Rationale

- **Impact**: Enables rich domain modeling
- **Complexity**: Low-Medium
- **Dependencies**: Database schema changes

#### Blockers

- **Schema Migration**: Existing graphs need migration
- **Index Performance**: Metadata indexing may impact performance
- **Validation**: Complex validation for diverse metadata

#### Success Metrics

- Custom metadata on all graph elements
- Metadata-based search and filtering
- Resolution of issue #961

---

### 8. **Database Provider Expansion**

🔵 **Priority: LOW**

#### Objective

Support additional graph databases to meet diverse deployment requirements.

#### Implementation Plan

**Phase 1: Abstract Driver Interface**
- Enhance `GraphDriver` abstraction
- Standardize query translation layer
- Support for property graph vs RDF models

**Phase 2: New Drivers**
- Google Cloud Spanner Graph (#1077)
- Apache AGE (#947)
- Amazon Neptune improvements (#1082)
- TigerGraph, NebulaGraph

**Phase 3: Driver Selection Guide**
- Performance comparison matrix
- Use case recommendations
- Migration tools between drivers

#### Priority Rationale

- **Impact**: Addresses specific GitHub requests, increases deployment options
- **Complexity**: High (each driver is significant work)
- **Dependencies**: None

#### Blockers

- **Maintenance Burden**: Each driver requires ongoing support
- **Feature Parity**: Different databases have different capabilities
- **Testing**: Complex integration testing for each database

#### Success Metrics

- 2+ new database drivers
- Resolution of issues #1077, #947
- Database migration tools

---

### 9. **Documentation & Examples for Domain Adaptation**

🟡 **Priority: HIGH**

#### Objective

Provide comprehensive documentation showing how to adapt Graphiti to different domains.

#### Implementation Plan

**Phase 1: Domain Adaptation Guide**
- Step-by-step guide for domain customization
- Decision tree for configuration choices
- Best practices for each domain type

**Phase 2: Complete Domain Examples**
- Scientific Research knowledge graph
- Legal Document analysis
- Financial Transaction network
- IoT Event processing
- Healthcare Records integration

**Phase 3: Tutorial Series**
- Video walkthroughs
- Interactive Jupyter notebooks
- Code generation tools for domain setup

#### Priority Rationale

- **Impact**: Critical for adoption in new domains
- **Complexity**: Medium (requires domain expertise)
- **Dependencies**: Depends on implementation of above directives

#### Blockers

- **Domain Expertise**: Need experts for each domain
- **Maintenance**: Examples need to stay current with codebase
- **Quality**: Need real-world datasets and validation

#### Success Metrics

- 5+ complete domain examples
- Documentation coverage >80%
- User-contributed domain examples

---

### 10. **Testing & Evaluation Framework for Domains**

🟢 **Priority: MEDIUM**

#### Objective

Create domain-specific test datasets and evaluation metrics.
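
As a sketch of the simplest extraction-accuracy metric such a framework could report, the snippet below scores extracted entity names against ground-truth annotations using precision, recall, and F1. Exact matching on case-folded names is an assumption made for brevity; a real evaluation would likely need fuzzier entity matching. `ExtractionScore` and `score_extraction` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionScore:
    precision: float
    recall: float
    f1: float


def score_extraction(predicted: list[str], gold: list[str]) -> ExtractionScore:
    """Compare predicted entity names with ground truth using set overlap."""
    # Normalise names so that 'MIT ' and 'mit' count as the same entity
    pred = {name.strip().casefold() for name in predicted}
    truth = {name.strip().casefold() for name in gold}
    true_positives = len(pred & truth)

    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    # F1 is the harmonic mean; define it as 0 when both components are 0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return ExtractionScore(precision, recall, f1)


score = score_extraction(
    predicted=['Alice', 'MIT', 'Bob'],
    gold=['alice', 'mit', 'Stanford', 'Carol'],
)
```

Here two of three predictions are correct (precision 2/3) and two of four gold entities are found (recall 1/2); per-domain dashboards could track these numbers over time to detect regressions.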

#### Implementation Plan

**Phase 1: Domain Test Datasets**
- Curate/generate test data for each domain
- Include ground truth annotations
- Support for evaluation benchmarks

**Phase 2: Evaluation Metrics**
- Domain-specific quality metrics
- Extraction accuracy measurements
- Search relevance evaluation

**Phase 3: Continuous Evaluation**
- Automated testing across domains
- Performance regression detection
- Quality dashboards

#### Priority Rationale

- **Impact**: Ensures quality across domains
- **Complexity**: Medium
- **Dependencies**: Depends on domain implementations

#### Blockers

- **Data Acquisition**: Domain datasets can be hard to obtain
- **Annotation**: Ground truth annotation is expensive
- **Standardization**: Metrics vary significantly by domain

#### Success Metrics

- Test coverage >70% across domains
- Automated evaluation pipeline
- Public benchmark results

---

## Implementation Roadmap

### Phase 1: Foundation (Months 1-3)

**Critical Infrastructure**
- [ ] Directive #1: Configurable Prompt System
- [ ] Directive #3: Centralized Configuration Management
- [ ] Directive #9: Initial documentation framework

**Estimated Effort**: 2-3 engineers, 3 months

### Phase 2: Core Extensibility (Months 4-6)

**Domain Adaptation**
- [ ] Directive #2: Pluggable NER Pipeline
- [ ] Directive #4: Extensible Episode Types
- [ ] Directive #7: Enhanced Metadata

**Estimated Effort**: 2-3 engineers, 3 months

### Phase 3: Provider & Database Support (Months 7-9)

**Infrastructure Expansion**
- [ ] Directive #6: Multi-Provider LLM Support
- [ ] Directive #8: Database Provider Expansion (Phase 1)

**Estimated Effort**: 2 engineers, 3 months

### Phase 4: Domain Optimization (Months 10-12)

**Domain-Specific Features**
- [ ] Directive #5: Domain-Specific Search
- [ ] Directive #10: Testing & Evaluation Framework
- [ ] Directive #9: Complete domain examples

**Estimated Effort**: 2-3 engineers, 3 months

---

## Risk Assessment

### High Risk

1. **Breaking Changes**: Refactoring may break existing integrations
   - *Mitigation*: Semantic versioning, deprecation warnings, migration guides

2. **Performance Regression**: More abstraction may impact performance
   - *Mitigation*: Continuous benchmarking, performance budgets

3. **Complexity Creep**: Too much configurability can confuse users
   - *Mitigation*: Sensible defaults, progressive disclosure, presets

### Medium Risk

1. **Provider API Changes**: External dependencies may change
   - *Mitigation*: Abstract interfaces, version pinning, adapter pattern

2. **Maintenance Burden**: More features = more maintenance
   - *Mitigation*: Automated testing, clear ownership, deprecation policy

3. **Documentation Debt**: Fast development may outpace docs
   - *Mitigation*: Docs-as-code, automated doc generation, examples as tests

### Low Risk

1. **Community Adoption**: Users may not need all domains
   - *Mitigation*: Modular architecture, optional components

---

## Success Criteria

### Technical Metrics

- [ ] Zero hardcoded domain assumptions in core library
- [ ] 5+ domain-specific configurations available
- [ ] All GitHub issues (#1004, #1006, #1007, #1074, #995, #1077, #947, #961) resolved
- [ ] Test coverage >75% across all domains
- [ ] Performance within 10% of current baseline

### User Experience Metrics

- [ ] Domain setup time <30 minutes (from docs)
- [ ] Config-driven customization (no code changes for 80% of use cases)
- [ ] 3+ community-contributed domain adaptations

### Business Metrics

- [ ] Adoption in 3+ new domains (outside conversational AI)
- [ ] 50%+ reduction in customization support requests
- [ ] Documentation satisfaction >4.0/5.0

---

## Appendix A: Affected Files

### Core Files Requiring Changes

**High Priority**
- `graphiti_core/graphiti.py` - Main class, initialization
- `graphiti_core/llm_client/config.py` - Configuration system
- `graphiti_core/prompts/extract_nodes.py` - NER prompts
- `graphiti_core/prompts/extract_edges.py` - Relation extraction prompts
- `graphiti_core/utils/maintenance/node_operations.py` - Extraction logic

**Medium Priority**
- `graphiti_core/nodes.py` - Episode type definitions
- `graphiti_core/search/search_config.py` - Search configuration
- `graphiti_core/search/search_config_recipes.py` - Search recipes
- `server/graph_service/config.py` - Server configuration

**Low Priority**
- `graphiti_core/driver/*.py` - Database drivers
- `graphiti_core/embedder/*.py` - Embedder clients

---

## Appendix B: Related GitHub Issues

### Directly Addressed

- #1087: Embedding truncation
- #1074: No results with Ollama embeddings
- #1007: Unstable outputs with vLLM
- #1006: AzureOpenAI reranker support
- #1004: Azure OpenAI support
- #995: Docker Azure OpenAI support
- #1077: Google Cloud Spanner Graph support
- #947: Apache AGE support
- #961: Episodes API improvements
- #1082: Neptune driver issues

### Indirectly Improved

- #1083: Orphaned entities cleanup
- #1062: Stale data in MCP server
- #1021: Incomplete graph structure
- #1018: Search with group_ids
- #1012: group_id and Anthropic issues
- #992: OOM in build_communities
- #963: Duplicate entities

---

## Appendix C: Backward Compatibility Strategy

### Deprecation Policy

1. **Feature Deprecation**: 2 minor versions notice
2. **API Changes**: Maintain old API with deprecation warnings
3. **Configuration**: Support both old and new config formats during transition

### Migration Support

- Automated migration scripts for major changes
- Detailed migration guides for each release
- Migration validation tools

### Version Support

- LTS releases for enterprise users
- Security patches for N-2 versions
- Clear EOL policy

---

## Next Steps

1. **Review & Approval**: Circulate this plan for stakeholder feedback
2. **Prioritization**: Finalize directive priorities based on business needs
3. **Resource Allocation**: Assign engineering teams to Phase 1 directives
4. **Kickoff**: Begin implementation of Directive #1 (Prompt System)

---

**Document Maintainer**: Claude (AI Assistant)
**Last Updated**: 2025-11-30
**Next Review**: After Phase 1 completion