graphiti/check_source_data.py
Claude 21c0eae78f
Add: Graph exploration tools for general-purpose PKM
Implemented two new MCP tools and enhanced workflow instructions to enable
effective Personal Knowledge Management across all use cases (architecture
decisions, projects, coaching, research, etc.).

## New Tools

1. **get_entity_connections** - Direct graph traversal showing ALL relationships
   - Returns complete connection data for an entity
   - Returns every edge attached to the entity, whereas semantic search returns only ranked matches
   - Essential for pattern detection and exploration
   - Leverages EntityEdge.get_by_node_uuid()

2. **get_entity_timeline** - Chronological episode history
   - Shows ALL episodes mentioning an entity
   - Enables temporal tracking and evolution analysis
   - Critical for understanding how concepts evolved
   - Leverages EpisodicNode.get_by_entity_node_uuid()
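
To illustrate what the two tools return, here is a minimal in-memory sketch. It does not call graphiti_core: the `Edge` and `Episode` classes, the sample data, and both function bodies are hypothetical stand-ins, assumed only for illustration. The real tools delegate to `EntityEdge.get_by_node_uuid()` and `EpisodicNode.get_by_entity_node_uuid()`.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Edge:
    """Hypothetical stand-in for a relationship between two entities."""
    source_uuid: str
    target_uuid: str
    fact: str


@dataclass(frozen=True)
class Episode:
    """Hypothetical stand-in for an episode that mentions entities."""
    name: str
    created_at: datetime
    entity_uuids: tuple[str, ...]


def get_entity_connections(edges: list[Edge], entity_uuid: str) -> list[Edge]:
    """Return every edge touching the entity, in either direction."""
    return [e for e in edges if entity_uuid in (e.source_uuid, e.target_uuid)]


def get_entity_timeline(episodes: list[Episode], entity_uuid: str) -> list[Episode]:
    """Return every episode mentioning the entity, oldest first."""
    mentions = [ep for ep in episodes if entity_uuid in ep.entity_uuids]
    return sorted(mentions, key=lambda ep: ep.created_at)


# Illustrative data only; the uuids and facts are made up.
edges = [
    Edge("e1", "e2", "Project Alpha uses Neo4j"),
    Edge("e3", "e1", "Dana leads Project Alpha"),
    Edge("e2", "e4", "Neo4j speaks Bolt"),
]
episodes = [
    Episode("retro", datetime(2025, 3, 1), ("e1", "e3")),
    Episode("kickoff", datetime(2025, 1, 10), ("e1",)),
    Episode("driver upgrade", datetime(2025, 2, 5), ("e2",)),
]

connections = get_entity_connections(edges, "e1")
timeline = get_entity_timeline(episodes, "e1")
```

Both functions scan the full collection rather than ranking by similarity, which is the property the tools rely on: nothing that touches the entity is left out.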

## Enhanced Workflow Instructions

Updated GRAPHITI_MCP_INSTRUCTIONS with:
- Clear "SEARCH FIRST, THEN ADD" workflow with decision flowcharts
- Tool selection guide (when to use each tool)
- Distinction between graph traversal and semantic search
- Multiple concrete examples across different domains
- Key principles for effective PKM usage
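
The "SEARCH FIRST, THEN ADD" rule can be sketched as follows. This is a simplified illustration, not the server's code: the `search_then_add` helper is hypothetical, a plain dict stands in for the graph, and a normalized exact-name lookup stands in for the semantic search the real workflow would use.

```python
def search_then_add(store: dict[str, list[str]], name: str, note: str) -> str:
    """Attach the note to an existing entity if one is found, else create it."""
    key = name.strip().lower()  # crude stand-in for a real search step
    if key in store:
        # Search hit: connect new information to the existing entity.
        store[key].append(note)
        return "linked"
    # Search miss: only now create a new entity.
    store[key] = [note]
    return "created"


# Illustrative usage with made-up notes.
graph: dict[str, list[str]] = {}
first = search_then_add(graph, "Project Alpha", "kickoff notes")
second = search_then_add(graph, " project alpha ", "retro notes")
```

Here the second call finds the entity created by the first and links to it, so the store ends up with one entity carrying both notes instead of two disconnected ones.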

## Updated add_memory Docstring

Added prominent warning to search before adding:
- Step-by-step workflow guidance
- Emphasizes creating connections rather than isolated nodes
- References new exploration tools

## Benefits

- Prevents disconnected/duplicate entities
- Enables reliable pattern recognition with complete data
- Cost-effective: one graph query replaces multiple semantic searches
- Temporal tracking for evolution analysis
- Works equally well for technical and personal knowledge

## Implementation Details

- No changes to graphiti_core (uses existing features only)
- All new code is in mcp_server/src/graphiti_mcp_server.py
- Backward compatible (adds tools; does not modify existing ones)
- Follows existing MCP tool patterns and conventions
- Passes all lint and syntax checks

Related: DOCS/IMPLEMENTATION-Graph-Exploration-Tools.md
2025-11-15 09:51:11 +00:00


#!/usr/bin/env python3
"""Check what's in the source database."""
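import os

from neo4j import GraphDatabase

NEO4J_URI = 'bolt://192.168.1.25:7687'
NEO4J_USER = 'neo4j'
# Read the password from the environment instead of hard-coding it.
# The variable name NEO4J_PASSWORD is an assumption chosen for this script.
NEO4J_PASSWORD = os.environ['NEO4J_PASSWORD']
SOURCE_DATABASE = 'neo4j'
SOURCE_GROUP_ID = 'lvarming73'

driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))
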
print('=' * 70)
print('Checking Source Database')
print('=' * 70)

with driver.session(database=SOURCE_DATABASE) as session:
    # Check total nodes
    result = session.run(
        """
        MATCH (n {group_id: $group_id})
        RETURN count(n) as total
        """,
        group_id=SOURCE_GROUP_ID,
    )
    total = result.single()['total']
    print(f"\n✓ Total nodes with group_id '{SOURCE_GROUP_ID}': {total}")

    # Check date range
    result = session.run(
        """
        MATCH (n:Episodic {group_id: $group_id})
        WHERE n.created_at IS NOT NULL
        RETURN
            min(n.created_at) as earliest,
            max(n.created_at) as latest,
            count(n) as total
        """,
        group_id=SOURCE_GROUP_ID,
    )
    dates = result.single()
    if dates and dates['total'] > 0:
        print('\n✓ Episodic date range:')
        print(f' Earliest: {dates["earliest"]}')
        print(f' Latest: {dates["latest"]}')
        print(f' Total episodes: {dates["total"]}')
    else:
        print('\n⚠️ No episodic nodes with dates found')

    # Sample episodic nodes by date
    result = session.run(
        """
        MATCH (n:Episodic {group_id: $group_id})
        RETURN n.name as name, n.created_at as created_at
        ORDER BY n.created_at
        LIMIT 10
        """,
        group_id=SOURCE_GROUP_ID,
    )
    print('\n✓ Oldest episodic nodes:')
    for record in result:
        print(f' - {record["name"]}: {record["created_at"]}')

    # Check for other group_ids in neo4j database
    result = session.run("""
        MATCH (n)
        WHERE n.group_id IS NOT NULL
        RETURN DISTINCT n.group_id as group_id, count(n) as count
        ORDER BY count DESC
    """)
    print(f"\n✓ All group_ids in '{SOURCE_DATABASE}' database:")
    for record in result:
        print(f' {record["group_id"]}: {record["count"]} nodes')

driver.close()
print('\n' + '=' * 70)