supmo668 a944871942 feat: Add Gremlin query language support for Neptune Database

Adds experimental support for Apache TinkerPop Gremlin as an alternative
query language for AWS Neptune Database, alongside the existing openCypher
support. This enables users to choose their preferred query language and
opens the door for future support of other Gremlin-compatible databases.

- QueryLanguage enum (CYPHER, GREMLIN) for explicit language selection
- Dual-mode NeptuneDriver supporting both Cypher and Gremlin
- Gremlin query generation functions for common graph operations
- Graceful degradation when gremlinpython is not installed
- 100% backward compatible (defaults to CYPHER)
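
The default-to-Cypher behaviour and the graceful degradation listed above could look roughly like the sketch below. It is an illustration only: the `QueryLanguage` class is a stand-in for the enum added in `graphiti_core/driver/driver.py` (its member values are assumed), and `gremlin_installed` and `resolve_query_language` are hypothetical helpers, not part of the driver.

```python
import importlib.util
from enum import Enum
from typing import Optional


class QueryLanguage(Enum):
    # Stand-in for the enum in graphiti_core/driver/driver.py; values are assumed.
    CYPHER = 'cypher'
    GREMLIN = 'gremlin'


def gremlin_installed() -> bool:
    # The gremlinpython distribution is imported as the `gremlin_python` package.
    return importlib.util.find_spec('gremlin_python') is not None


def resolve_query_language(
    requested: Optional[QueryLanguage] = None,
    gremlin_available: Optional[bool] = None,
) -> QueryLanguage:
    """Hypothetical helper: default to Cypher, fail early if Gremlin is unusable."""
    language = requested or QueryLanguage.CYPHER
    if gremlin_available is None:
        gremlin_available = gremlin_installed()
    if language is QueryLanguage.GREMLIN and not gremlin_available:
        raise ImportError(
            'gremlinpython is required for Gremlin support; '
            'install the neptune extra or use QueryLanguage.CYPHER'
        )
    return language
```

Existing callers that never pass a language keep getting Cypher, and a missing dependency only surfaces when Gremlin is explicitly requested.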

- graphiti_core/driver/driver.py: Added QueryLanguage enum
- graphiti_core/driver/neptune_driver.py: Dual client initialization
  and query routing based on language selection
- graphiti_core/graph_queries.py: 9 new Gremlin query generation functions

- graphiti_core/utils/maintenance/graph_data_operations.py: Updated
  clear_data() to support both query languages

- tests/test_neptune_gremlin_int.py: Comprehensive integration tests
- examples/quickstart/quickstart_neptune_gremlin.py: Usage example
- examples/quickstart/README.md: Updated with Gremlin instructions
- GREMLIN_FEATURE.md: Complete feature documentation

- pyproject.toml: Added gremlinpython>=3.7.0 to neptune extras

```python
from graphiti_core.driver.driver import QueryLanguage
from graphiti_core.driver.neptune_driver import NeptuneDriver

driver = NeptuneDriver(
    host='neptune-db://cluster.amazonaws.com',
    aoss_host='aoss-cluster.amazonaws.com',
    query_language=QueryLanguage.GREMLIN
)
```

- Only Neptune Database supports Gremlin (not Neptune Analytics)
- Fulltext and vector search still use OpenSearch (AOSS) integration
- Complete search_utils.py Gremlin implementation pending (future work)

- ✅ All existing unit tests pass (103/103)
- ✅ New integration tests for Gremlin operations
- ✅ Type checking passes
- ✅ Linting passes

Breaking changes: none. Fully backward compatible.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-05 23:45:59 -08:00

Graphiti Quickstart Example

This example demonstrates the basic functionality of Graphiti, including:

Connecting to a Neo4j, FalkorDB, or Amazon Neptune database
Initializing Graphiti indices and constraints
Adding episodes to the graph
Searching the graph with semantic and keyword matching
Exploring graph-based search with reranking using the top search result's source node UUID
Performing node search using predefined search recipes

Prerequisites

Python 3.9+
OpenAI API key (set as OPENAI_API_KEY environment variable)
For Neo4j:
- Neo4j Desktop installed and running
- A local DBMS created and started in Neo4j Desktop
For FalkorDB:
- FalkorDB server running (see FalkorDB documentation for setup)
For Amazon Neptune:
- Amazon Neptune Database or Neptune Analytics running (see Amazon Neptune documentation for setup)
- OpenSearch Service cluster for fulltext search
- Note: Neptune Database supports both Cypher and Gremlin query languages. Neptune Analytics only supports Cypher.

Setup Instructions

Install the required dependencies:

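pip install graphiti-core

# For Amazon Neptune (the neptune extra includes gremlinpython for Gremlin support)
pip install "graphiti-core[neptune]"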

Set up environment variables:

# Required for LLM and embedding
export OPENAI_API_KEY=your_openai_api_key

# Optional Neo4j connection parameters (defaults shown)
export NEO4J_URI=bolt://localhost:7687
export NEO4J_USER=neo4j
export NEO4J_PASSWORD=password

# Optional FalkorDB connection parameters (defaults shown)
export FALKORDB_URI=falkor://localhost:6379

# Optional Amazon Neptune connection parameters
export NEPTUNE_HOST=your_neptune_host
export NEPTUNE_PORT=your_port_or_8182
export AOSS_HOST=your_aoss_host
export AOSS_PORT=your_port_or_443

# To use a different database, modify the driver constructor in the script
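
As a rough illustration of how a script might consume these variables, the sketch below reads them with the defaults shown above. The function name `load_connection_settings` and the dictionary keys are hypothetical; the quickstart scripts may read the environment differently.

```python
import os


def load_connection_settings(env=None):
    """Hypothetical helper: read the quickstart variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        # Neo4j defaults as listed in the setup instructions
        'neo4j_uri': env.get('NEO4J_URI', 'bolt://localhost:7687'),
        'neo4j_user': env.get('NEO4J_USER', 'neo4j'),
        'neo4j_password': env.get('NEO4J_PASSWORD', 'password'),
        # FalkorDB default
        'falkordb_uri': env.get('FALKORDB_URI', 'falkor://localhost:6379'),
        # Neptune and OpenSearch: hosts have no default, ports fall back to 8182 and 443
        'neptune_host': env.get('NEPTUNE_HOST'),
        'neptune_port': int(env.get('NEPTUNE_PORT', 8182)),
        'aoss_host': env.get('AOSS_HOST'),
        'aoss_port': int(env.get('AOSS_PORT', 443)),
    }
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise without touching the real environment.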

TIP: Use one of the following formats for the Amazon Neptune host string:

For Neptune Database: neptune-db://<cluster endpoint>
For Neptune Analytics: neptune-graph://<graph identifier>
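
Because the host prefix distinguishes the two Neptune services, and only Neptune Database supports Gremlin, a script could validate the combination before creating a driver. The sketch below assumes nothing beyond the two prefixes above; `neptune_flavor` and `supports_gremlin` are hypothetical helpers, not functions from `graphiti_core`.

```python
def neptune_flavor(host: str) -> str:
    """Hypothetical helper: classify a Neptune host string by its scheme prefix."""
    if host.startswith('neptune-db://'):
        return 'database'
    if host.startswith('neptune-graph://'):
        return 'analytics'
    raise ValueError(
        "Neptune host must start with 'neptune-db://' or 'neptune-graph://'"
    )


def supports_gremlin(host: str) -> bool:
    # Only Neptune Database accepts Gremlin; Neptune Analytics is Cypher-only.
    return neptune_flavor(host) == 'database'
```

For example, `supports_gremlin('neptune-graph://g-123')` is false, so a script could fall back to Cypher or raise a clear error instead of failing later at query time.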

Run the example:

# For Neo4j
python quickstart_neo4j.py

# For FalkorDB
python quickstart_falkordb.py

# For Amazon Neptune (using Cypher)
python quickstart_neptune.py

# For Amazon Neptune Database (using Gremlin)
python quickstart_neptune_gremlin.py

Using Gremlin with Neptune Database

Neptune Database supports both openCypher and Gremlin query languages. To use Gremlin:

from graphiti_core.driver.driver import QueryLanguage
from graphiti_core.driver.neptune_driver import NeptuneDriver

driver = NeptuneDriver(
    host='neptune-db://your-cluster.amazonaws.com',
    aoss_host='your-aoss-cluster.amazonaws.com',
    query_language=QueryLanguage.GREMLIN  # Use Gremlin instead of Cypher
)

Important Notes:

Only Neptune Database supports Gremlin. Neptune Analytics does not support Gremlin.
Gremlin support is experimental and focuses on basic graph operations.
Vector similarity and fulltext search still use OpenSearch integration.
The high-level Graphiti API remains the same regardless of query language.

What This Example Demonstrates

Graph Initialization: Setting up the Graphiti indices and constraints in Neo4j, Amazon Neptune, or FalkorDB
Adding Episodes: Adding text content that will be analyzed and converted into knowledge graph nodes and edges
Edge Search Functionality: Performing hybrid searches that combine semantic similarity and BM25 retrieval to find relationships (edges)
Graph-Aware Search: Using the source node UUID from the top search result to rerank additional search results based on graph distance
Node Search Using Recipes: Using predefined search configurations like NODE_HYBRID_SEARCH_RRF to directly search for nodes rather than edges
Result Processing: Understanding the structure of search results including facts, nodes, and temporal metadata
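
The RRF suffix in NODE_HYBRID_SEARCH_RRF refers to reciprocal rank fusion, a common way to merge a semantic ranking with a BM25 ranking. The sketch below is a generic, self-contained version of that idea, not Graphiti's own implementation; the function name and the constant `k=60` are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs; items ranked highly in many lists win."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every item it contains.
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


semantic_hits = ['a', 'b', 'c']  # e.g. ordered by embedding similarity
bm25_hits = ['b', 'c', 'a']      # e.g. ordered by keyword score
fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
```

Here `b` ends up first because it ranks near the top of both lists, while `a` is first in one list but last in the other.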

Next Steps

After running this example, you can:

Modify the episode content to add your own information
Try different search queries to explore the knowledge extraction
Experiment with different center nodes for graph-distance-based reranking
Try other predefined search recipes from graphiti_core.search.search_config_recipes
Explore the more advanced examples in the other directories

Troubleshooting

"Graph not found: default_db" Error

The error Neo.ClientError.Database.DatabaseNotFound: Graph not found: default_db occurs when the driver tries to connect to a database that doesn't exist.

Solution: The Neo4j driver defaults to using neo4j as the database name. If you need to use a different database, modify the driver constructor in the script:

# In quickstart_neo4j.py, change:
driver = Neo4jDriver(uri=neo4j_uri, user=neo4j_user, password=neo4j_password)

# To specify a different database:
driver = Neo4jDriver(uri=neo4j_uri, user=neo4j_user, password=neo4j_password, database="your_db_name")

Understanding the Output

Edge Search Results

The edge search results include EntityEdge objects with:

UUID: Unique identifier for the edge
Fact: The extracted fact from the episode
Valid at/invalid at: Time period during which the fact was true (if available)
Source/target node UUIDs: Connections between entities in the knowledge graph
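
To make the temporal fields concrete, the sketch below checks whether a fact held at a given moment. `EdgeResult` is a simplified stand-in that only mirrors the fields listed above, not the real EntityEdge class, and treating a missing bound as open-ended is an assumption made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EdgeResult:
    # Simplified stand-in for the fields described above (not the real EntityEdge).
    uuid: str
    fact: str
    source_node_uuid: str
    target_node_uuid: str
    valid_at: Optional[datetime] = None
    invalid_at: Optional[datetime] = None


def is_valid_at(edge: EdgeResult, moment: datetime) -> bool:
    """Return True if `moment` falls inside the edge's validity window."""
    if edge.valid_at is not None and moment < edge.valid_at:
        return False  # the fact had not become true yet
    if edge.invalid_at is not None and moment >= edge.invalid_at:
        return False  # the fact had already stopped being true
    return True  # assumption: a missing bound means the window is open on that side


edge = EdgeResult(
    uuid='e1',
    fact='Example fact',
    source_node_uuid='n1',
    target_node_uuid='n2',
    valid_at=datetime(2020, 1, 1, tzinfo=timezone.utc),
    invalid_at=datetime(2022, 1, 1, tzinfo=timezone.utc),
)
```

Using timezone-aware datetimes throughout avoids comparison errors between naive and aware values.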

Node Search Results

The node search results include EntityNode objects with:

UUID: Unique identifier for the node
Name: The name of the entity
Content Summary: A summary of the node's content
Node Labels: The types of the node (e.g., Person, Organization)
Created At: When the node was created
Attributes: Additional properties associated with the node
