remove parallel runtime and build dynamic indexes sequentially
parent 1460172568
commit 29ba336189
4 changed files with 168 additions and 155 deletions
README.md

@@ -31,9 +31,15 @@ Graphiti
 <br />

 > [!TIP]
-> Check out the new [MCP server for Graphiti](mcp_server/README.md)! Give Claude, Cursor, and other MCP clients powerful Knowledge Graph-based memory.
+> Check out the new [MCP server for Graphiti](mcp_server/README.md)! Give Claude, Cursor, and other MCP clients powerful
+> Knowledge Graph-based memory.

-Graphiti is a framework for building and querying temporally-aware knowledge graphs, specifically tailored for AI agents operating in dynamic environments. Unlike traditional retrieval-augmented generation (RAG) methods, Graphiti continuously integrates user interactions, structured and unstructured enterprise data, and external information into a coherent, queryable graph. The framework supports incremental data updates, efficient retrieval, and precise historical queries without requiring complete graph recomputation, making it suitable for developing interactive, context-aware AI applications.
+Graphiti is a framework for building and querying temporally-aware knowledge graphs, specifically tailored for AI agents
+operating in dynamic environments. Unlike traditional retrieval-augmented generation (RAG) methods, Graphiti
+continuously integrates user interactions, structured and unstructured enterprise data, and external information into a
+coherent, queryable graph. The framework supports incremental data updates, efficient retrieval, and precise historical
+queries without requiring complete graph recomputation, making it suitable for developing interactive, context-aware AI
+applications.

 Use Graphiti to:
@@ -49,14 +55,16 @@ Use Graphiti to:

 <br />

-A knowledge graph is a network of interconnected facts, such as _"Kendra loves Adidas shoes."_ Each fact is a "triplet" represented by two entities, or
+A knowledge graph is a network of interconnected facts, such as _"Kendra loves Adidas shoes."_ Each fact is a "triplet"
+represented by two entities, or
 nodes ("Kendra", "Adidas shoes"), and their relationship, or edge ("loves"). Knowledge Graphs have been explored
 extensively for information retrieval. What makes Graphiti unique is its ability to autonomously build a knowledge graph
 while handling changing relationships and maintaining historical context.

 ## Graphiti and Zep's Context Engineering Platform.

-Graphiti powers the core of [Zep](https://www.getzep.com), a turn-key context engineering platform for AI Agents. Zep offers agent memory, Graph RAG for dynamic data, and context retrieval and assembly.
+Graphiti powers the core of [Zep](https://www.getzep.com), a turn-key context engineering platform for AI Agents. Zep
+offers agent memory, Graph RAG for dynamic data, and context retrieval and assembly.

 Using Graphiti, we've demonstrated Zep is
 the [State of the Art in Agent Memory](https://blog.getzep.com/state-of-the-art-agent-memory/).
@@ -71,12 +79,16 @@ We're excited to open-source Graphiti, believing its potential reaches far beyon

 ## Why Graphiti?

-Traditional RAG approaches often rely on batch processing and static data summarization, making them inefficient for frequently changing data. Graphiti addresses these challenges by providing:
+Traditional RAG approaches often rely on batch processing and static data summarization, making them inefficient for
+frequently changing data. Graphiti addresses these challenges by providing:

 - **Real-Time Incremental Updates:** Immediate integration of new data episodes without batch recomputation.
-- **Bi-Temporal Data Model:** Explicit tracking of event occurrence and ingestion times, allowing accurate point-in-time queries.
-- **Efficient Hybrid Retrieval:** Combines semantic embeddings, keyword (BM25), and graph traversal to achieve low-latency queries without reliance on LLM summarization.
-- **Custom Entity Definitions:** Flexible ontology creation and support for developer-defined entities through straightforward Pydantic models.
+- **Bi-Temporal Data Model:** Explicit tracking of event occurrence and ingestion times, allowing accurate point-in-time
+  queries.
+- **Efficient Hybrid Retrieval:** Combines semantic embeddings, keyword (BM25), and graph traversal to achieve
+  low-latency queries without reliance on LLM summarization.
+- **Custom Entity Definitions:** Flexible ontology creation and support for developer-defined entities through
+  straightforward Pydantic models.
 - **Scalability:** Efficiently manages large datasets with parallel processing, suitable for enterprise environments.

 <p align="center">
@@ -86,7 +98,7 @@ Traditional RAG approaches often rely on batch processing and static data summar
 ## Graphiti vs. GraphRAG

 | Aspect                     | GraphRAG                               | Graphiti                                         |
-| -------------------------- | ------------------------------------- | ------------------------------------------------ |
+|----------------------------|---------------------------------------|--------------------------------------------------|
 | **Primary Use**            | Static document summarization         | Dynamic data management                          |
 | **Data Handling**          | Batch-oriented processing             | Continuous, incremental updates                  |
 | **Knowledge Structure**    | Entity clusters & community summaries | Episodic data, semantic entities, communities    |
@@ -98,14 +110,16 @@ Traditional RAG approaches often rely on batch processing and static data summar
 | **Custom Entity Types**    | No                                     | Yes, customizable                                |
 | **Scalability**            | Moderate                               | High, optimized for large datasets               |

-Graphiti is specifically designed to address the challenges of dynamic and frequently updated datasets, making it particularly suitable for applications requiring real-time interaction and precise historical queries.
+Graphiti is specifically designed to address the challenges of dynamic and frequently updated datasets, making it
+particularly suitable for applications requiring real-time interaction and precise historical queries.

 ## Installation

 Requirements:

 - Python 3.10 or higher
-- Neo4j 5.26 / FalkorDB 1.1.2 / Kuzu 0.11.2 / Amazon Neptune Database Cluster or Neptune Analytics Graph + Amazon OpenSearch Serverless collection (serves as the full text search backend)
+- Neo4j 5.26 / FalkorDB 1.1.2 / Kuzu 0.11.2 / Amazon Neptune Database Cluster or Neptune Analytics Graph + Amazon
+  OpenSearch Serverless collection (serves as the full text search backend)
 - OpenAI API key (Graphiti defaults to OpenAI for LLM inference and embedding)

 > [!IMPORTANT]
@@ -194,20 +208,26 @@ pip install graphiti-core[neptune]

 ## Default to Low Concurrency; LLM Provider 429 Rate Limit Errors

-Graphiti's ingestion pipelines are designed for high concurrency. By default, concurrency is set low to avoid LLM Provider 429 Rate Limit Errors. If you find Graphiti slow, please increase concurrency as described below.
+Graphiti's ingestion pipelines are designed for high concurrency. By default, concurrency is set low to avoid LLM
+Provider 429 Rate Limit Errors. If you find Graphiti slow, please increase concurrency as described below.

-Concurrency controlled by the `SEMAPHORE_LIMIT` environment variable. By default, `SEMAPHORE_LIMIT` is set to `10` concurrent operations to help prevent `429` rate limit errors from your LLM provider. If you encounter such errors, try lowering this value.
+Concurrency controlled by the `SEMAPHORE_LIMIT` environment variable. By default, `SEMAPHORE_LIMIT` is set to `10`
+concurrent operations to help prevent `429` rate limit errors from your LLM provider. If you encounter such errors, try
+lowering this value.

-If your LLM provider allows higher throughput, you can increase `SEMAPHORE_LIMIT` to boost episode ingestion performance.
+If your LLM provider allows higher throughput, you can increase `SEMAPHORE_LIMIT` to boost episode ingestion
+performance.

 ## Quick Start

 > [!IMPORTANT]
-> Graphiti defaults to using OpenAI for LLM inference and embedding. Ensure that an `OPENAI_API_KEY` is set in your environment.
+> Graphiti defaults to using OpenAI for LLM inference and embedding. Ensure that an `OPENAI_API_KEY` is set in your
+> environment.
 > Support for Anthropic and Groq LLM inferences is available, too. Other LLM providers may be supported via OpenAI
 > compatible APIs.

-For a complete working example, see the [Quickstart Example](./examples/quickstart/README.md) in the examples directory. The quickstart demonstrates:
+For a complete working example, see the [Quickstart Example](./examples/quickstart/README.md) in the examples directory.
+The quickstart demonstrates:

 1. Connecting to a Neo4j, Amazon Neptune, FalkorDB, or Kuzu database
 2. Initializing Graphiti indices and constraints
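A note on the `SEMAPHORE_LIMIT` hunk above: per the helpers.py hunk later in this commit, the variable is read from the environment at module level (with a default of `20` there), so it must be set before `graphiti_core` is imported. A minimal sketch of raising it; the value `50` is an arbitrary example:

```python
import os

# Set before importing graphiti_core: the library reads SEMAPHORE_LIMIT at import time
# (see the helpers.py hunk below in this commit).
os.environ['SEMAPHORE_LIMIT'] = '50'  # arbitrary example value

from graphiti_core import Graphiti  # noqa: E402

graphiti = Graphiti(...)  # placeholder initialization, as in the README's own examples
```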
@@ -216,11 +236,13 @@ For a complete working example, see the [Quickstart Example](./examples/quicksta
 5. Reranking search results using graph distance
 6. Searching for nodes using predefined search recipes

-The example is fully documented with clear explanations of each functionality and includes a comprehensive README with setup instructions and next steps.
+The example is fully documented with clear explanations of each functionality and includes a comprehensive README with
+setup instructions and next steps.

 ## MCP Server

-The `mcp_server` directory contains a Model Context Protocol (MCP) server implementation for Graphiti. This server allows AI assistants to interact with Graphiti's knowledge graph capabilities through the MCP protocol.
+The `mcp_server` directory contains a Model Context Protocol (MCP) server implementation for Graphiti. This server
+allows AI assistants to interact with Graphiti's knowledge graph capabilities through the MCP protocol.

 Key features of the MCP server include:
@@ -230,7 +252,8 @@ Key features of the MCP server include:
 - Group management for organizing related data
 - Graph maintenance operations

-The MCP server can be deployed using Docker with Neo4j, making it easy to integrate Graphiti into your AI assistant workflows.
+The MCP server can be deployed using Docker with Neo4j, making it easy to integrate Graphiti into your AI assistant
+workflows.

 For detailed setup instructions and usage examples, see the [MCP server README](./mcp_server/README.md).
@@ -253,7 +276,8 @@ Database names are configured directly in the driver constructors:
 - **Neo4j**: Database name defaults to `neo4j` (hardcoded in Neo4jDriver)
 - **FalkorDB**: Database name defaults to `default_db` (hardcoded in FalkorDriver)

-As of v0.17.0, if you need to customize your database configuration, you can instantiate a database driver and pass it to the Graphiti constructor using the `graph_driver` parameter.
+As of v0.17.0, if you need to customize your database configuration, you can instantiate a database driver and pass it
+to the Graphiti constructor using the `graph_driver` parameter.

 #### Neo4j with Custom Database Name
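To make the `graph_driver` pattern above concrete, a minimal sketch: the import path is assumed by analogy with the `neptune_driver` import in the next hunk, and the `database` keyword is an assumption about the Neo4jDriver constructor rather than a confirmed signature.

```python
from graphiti_core import Graphiti
from graphiti_core.driver.neo4j_driver import Neo4jDriver  # path assumed, by analogy with neptune_driver

# Assumed constructor shape; check the driver module for the exact parameters.
driver = Neo4jDriver(
    uri='bolt://localhost:7687',
    user='neo4j',
    password='password',
    database='my_custom_db',  # replaces the hardcoded `neo4j` default described above
)

graphiti = Graphiti(graph_driver=driver)
```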
@@ -313,10 +337,14 @@ from graphiti_core.driver.neptune_driver import NeptuneDriver

 # Create a FalkorDB driver with custom database name
 driver = NeptuneDriver(
-    host=<NEPTUNE ENDPOINT>,
-    aoss_host=<Amazon OpenSearch Serverless Host>,
-    port=<PORT> # Optional, defaults to 8182,
-    aoss_port=<PORT> # Optional, defaults to 443
+    host= < NEPTUNE
+ENDPOINT >,
+aoss_host = < Amazon
+OpenSearch
+Serverless
+Host >,
+port = < PORT > # Optional, defaults to 8182,
+aoss_port = < PORT > # Optional, defaults to 443
 )

 driver = NeptuneDriver(host=neptune_uri, aoss_host=aoss_host, port=neptune_port)
@@ -325,24 +353,20 @@ driver = NeptuneDriver(host=neptune_uri, aoss_host=aoss_host, port=neptune_port)
 graphiti = Graphiti(graph_driver=driver)
 ```

-
-### Performance Configuration
-
-`USE_PARALLEL_RUNTIME` is an optional boolean variable that can be set to true if you wish
-to enable Neo4j's parallel runtime feature for several of our search queries.
-Note that this feature is not supported for Neo4j Community edition or for smaller AuraDB instances,
-as such this feature is off by default.
-
 ## Using Graphiti with Azure OpenAI

-Graphiti supports Azure OpenAI for both LLM inference and embeddings. Azure deployments often require different endpoints for LLM and embedding services, and separate deployments for default and small models.
+Graphiti supports Azure OpenAI for both LLM inference and embeddings. Azure deployments often require different
+endpoints for LLM and embedding services, and separate deployments for default and small models.

 > [!IMPORTANT]
 > **Azure OpenAI v1 API Opt-in Required for Structured Outputs**
 >
-> Graphiti uses structured outputs via the `client.beta.chat.completions.parse()` method, which requires Azure OpenAI deployments to opt into the v1 API. Without this opt-in, you'll encounter 404 Resource not found errors during episode ingestion.
->
-> To enable v1 API support in your Azure OpenAI deployment, follow Microsoft's guide: [Azure OpenAI API version lifecycle](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle?tabs=key#api-evolution).
+> Graphiti uses structured outputs via the `client.beta.chat.completions.parse()` method, which requires Azure OpenAI
+> deployments to opt into the v1 API. Without this opt-in, you'll encounter 404 Resource not found errors during episode
+> ingestion.
+>
+> To enable v1 API support in your Azure OpenAI deployment, follow Microsoft's
+> guide: [Azure OpenAI API version lifecycle](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle?tabs=key#api-evolution).

 ```python
 from openai import AsyncAzureOpenAI
@@ -402,11 +426,13 @@ graphiti = Graphiti(
 # Now you can use Graphiti with Azure OpenAI
 ```

-Make sure to replace the placeholder values with your actual Azure OpenAI credentials and deployment names that match your Azure OpenAI service configuration.
+Make sure to replace the placeholder values with your actual Azure OpenAI credentials and deployment names that match
+your Azure OpenAI service configuration.

 ## Using Graphiti with Google Gemini

-Graphiti supports Google's Gemini models for LLM inference, embeddings, and cross-encoding/reranking. To use Gemini, you'll need to configure the LLM client, embedder, and the cross-encoder with your Google API key.
+Graphiti supports Google's Gemini models for LLM inference, embeddings, and cross-encoding/reranking. To use Gemini,
+you'll need to configure the LLM client, embedder, and the cross-encoder with your Google API key.

 Install Graphiti:
@@ -455,13 +481,17 @@ graphiti = Graphiti(
 # Now you can use Graphiti with Google Gemini for all components
 ```

-The Gemini reranker uses the `gemini-2.5-flash-lite-preview-06-17` model by default, which is optimized for cost-effective and low-latency classification tasks. It uses the same boolean classification approach as the OpenAI reranker, leveraging Gemini's log probabilities feature to rank passage relevance.
+The Gemini reranker uses the `gemini-2.5-flash-lite-preview-06-17` model by default, which is optimized for
+cost-effective and low-latency classification tasks. It uses the same boolean classification approach as the OpenAI
+reranker, leveraging Gemini's log probabilities feature to rank passage relevance.

 ## Using Graphiti with Ollama (Local LLM)

-Graphiti supports Ollama for running local LLMs and embedding models via Ollama's OpenAI-compatible API. This is ideal for privacy-focused applications or when you want to avoid API costs.
+Graphiti supports Ollama for running local LLMs and embedding models via Ollama's OpenAI-compatible API. This is ideal
+for privacy-focused applications or when you want to avoid API costs.

 Install the models:

 ```bash
 ollama pull deepseek-r1:7b # LLM
 ollama pull nomic-embed-text # embeddings
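For context on "OpenAI-compatible API" in the hunk above: Ollama serves an OpenAI-style endpoint on `http://localhost:11434/v1`, so any OpenAI client can talk to it. A minimal sketch with the stock `openai` package; Graphiti's own Ollama wiring is not shown in this hunk, so this only demonstrates the endpoint itself:

```python
import asyncio

from openai import AsyncOpenAI

# Ollama's OpenAI-compatible endpoint; the API key is required by the client but unused.
client = AsyncOpenAI(base_url='http://localhost:11434/v1', api_key='ollama')


async def main() -> None:
    response = await client.chat.completions.create(
        model='deepseek-r1:7b',  # the LLM pulled in the hunk above
        messages=[{'role': 'user', 'content': 'Say hello.'}],
    )
    print(response.choices[0].message.content)


asyncio.run(main())
```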
@@ -514,7 +544,8 @@ Ensure Ollama is running (`ollama serve`) and that you have pulled the models yo

 ## Telemetry

-Graphiti collects anonymous usage statistics to help us understand how the framework is being used and improve it for everyone. We believe transparency is important, so here's exactly what we collect and why.
+Graphiti collects anonymous usage statistics to help us understand how the framework is being used and improve it for
+everyone. We believe transparency is important, so here's exactly what we collect and why.

 ### What We Collect
@@ -524,9 +555,9 @@ When you initialize a Graphiti instance, we collect:
 - **System information**: Operating system, Python version, and system architecture
 - **Graphiti version**: The version you're using
 - **Configuration choices**:
-  - LLM provider type (OpenAI, Azure, Anthropic, etc.)
-  - Database backend (Neo4j, FalkorDB, Kuzu, Amazon Neptune Database or Neptune Analytics)
-  - Embedder provider (OpenAI, Azure, Voyage, etc.)
+    - LLM provider type (OpenAI, Azure, Anthropic, etc.)
+    - Database backend (Neo4j, FalkorDB, Kuzu, Amazon Neptune Database or Neptune Analytics)
+    - Embedder provider (OpenAI, Azure, Voyage, etc.)

 ### What We Don't Collect
@@ -578,10 +609,12 @@ echo 'export GRAPHITI_TELEMETRY_ENABLED=false' >> ~/.zshrc

 ```python
 import os
+
 os.environ['GRAPHITI_TELEMETRY_ENABLED'] = 'false'
+
 # Then initialize Graphiti as usual
 from graphiti_core import Graphiti

 graphiti = Graphiti(...)
 ```
@@ -590,7 +623,8 @@ Telemetry is automatically disabled during test runs (when `pytest` is detected)

 ### Technical Details

 - Telemetry uses PostHog for anonymous analytics collection
-- All telemetry operations are designed to fail silently - they will never interrupt your application or affect Graphiti functionality
+- All telemetry operations are designed to fail silently - they will never interrupt your application or affect Graphiti
+  functionality
 - The anonymous ID is stored locally and is not tied to any personal information

 ## Status and Roadmap
@@ -598,8 +632,8 @@
 Graphiti is under active development. We aim to maintain API stability while working on:

 - [x] Supporting custom graph schemas:
-  - Allow developers to provide their own defined node and edge classes when ingesting episodes
-  - Enable more flexible knowledge representation tailored to specific use cases
+    - Allow developers to provide their own defined node and edge classes when ingesting episodes
+    - Enable more flexible knowledge representation tailored to specific use cases
 - [x] Enhancing retrieval capabilities with more robust and configurable options
 - [x] Graphiti MCP Server
 - [ ] Expanding test coverage to ensure reliability and catch edge cases

graphiti_core/helpers.py
@@ -38,10 +38,6 @@ SEMAPHORE_LIMIT = int(os.getenv('SEMAPHORE_LIMIT', 20))
 MAX_REFLEXION_ITERATIONS = int(os.getenv('MAX_REFLEXION_ITERATIONS', 0))
 DEFAULT_PAGE_LIMIT = 20

-RUNTIME_QUERY: LiteralString = (
-    'CYPHER runtime = parallel parallelRuntimeSupport=all\n' if USE_PARALLEL_RUNTIME else ''
-)
-

 def parse_db_date(input_date: neo4j_time.DateTime | str | None) -> datetime | None:
     if isinstance(input_date, neo4j_time.DateTime):
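For context on the deleted constant: `RUNTIME_QUERY` conditionally prepended a Cypher directive so Neo4j would use its parallel runtime, and the search queries below concatenated it onto their query strings. A minimal sketch of the removed pattern; the exact parsing of `USE_PARALLEL_RUNTIME` is not shown in this hunk, so the env-var handling here is an assumption:

```python
import os

# Assumed env parsing; the hunk only shows the string being built conditionally.
USE_PARALLEL_RUNTIME = os.getenv('USE_PARALLEL_RUNTIME', 'false').lower() == 'true'

RUNTIME_QUERY = (
    'CYPHER runtime = parallel parallelRuntimeSupport=all\n' if USE_PARALLEL_RUNTIME else ''
)

# Before this commit: search queries were assembled as RUNTIME_QUERY + <query body>.
# After it: the prefix is gone, and queries run on Neo4j's default runtime.
query = RUNTIME_QUERY + 'MATCH (n:Entity) RETURN n LIMIT 1'
```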
graphiti_core/search/search_utils.py

@@ -32,7 +32,6 @@ from graphiti_core.graph_queries import (
     get_vector_cosine_func_query,
 )
 from graphiti_core.helpers import (
-    RUNTIME_QUERY,
     lucene_sanitize,
     normalize_l2,
     semaphore_gather,
@@ -211,11 +210,11 @@ async def edge_fulltext_search(
     # Match the edge ids and return the values
     query = (
         """
-    UNWIND $ids as id
-    MATCH (n:Entity)-[e:RELATES_TO]->(m:Entity)
-    WHERE e.group_id IN $group_ids
-    AND id(e)=id
-    """
+        UNWIND $ids as id
+        MATCH (n:Entity)-[e:RELATES_TO]->(m:Entity)
+        WHERE e.group_id IN $group_ids
+        AND id(e)=id
+        """
         + filter_query
         + """
         AND id(e)=id
@@ -320,10 +319,9 @@ async def edge_similarity_search(

     if driver.provider == GraphProvider.NEPTUNE:
         query = (
-            RUNTIME_QUERY
-            + """
-        MATCH (n:Entity)-[e:RELATES_TO]->(m:Entity)
+            """
+            MATCH (n:Entity)-[e:RELATES_TO]->(m:Entity)
             """
             + filter_query
             + """
             RETURN DISTINCT id(e) as id, e.fact_embedding as embedding
@@ -383,8 +381,7 @@ async def edge_similarity_search(
             return []
     else:
         query = (
-            RUNTIME_QUERY
-            + match_query
+            match_query
             + filter_query
             + """
             WITH DISTINCT e, n, m, """
@@ -577,11 +574,11 @@ async def node_fulltext_search(
     # Match the edge ides and return the values
     query = (
         """
-    UNWIND $ids as i
-    MATCH (n:Entity)
-    WHERE n.uuid=i.id
-    RETURN
-    """
+        UNWIND $ids as i
+        MATCH (n:Entity)
+        WHERE n.uuid=i.id
+        RETURN
+        """
         + get_entity_node_return_query(driver.provider)
         + """
         ORDER BY i.score DESC
@@ -658,10 +655,9 @@ async def node_similarity_search(

     if driver.provider == GraphProvider.NEPTUNE:
         query = (
-            RUNTIME_QUERY
-            + """
-        MATCH (n:Entity)
+            """
+            MATCH (n:Entity)
             """
             + filter_query
             + """
             RETURN DISTINCT id(n) as id, n.name_embedding as embedding
@@ -690,11 +686,11 @@ async def node_similarity_search(
     # Match the edge ides and return the values
     query = (
         """
-    UNWIND $ids as i
-    MATCH (n:Entity)
-    WHERE id(n)=i.id
-    RETURN
-    """
+        UNWIND $ids as i
+        MATCH (n:Entity)
+        WHERE id(n)=i.id
+        RETURN
+        """
         + get_entity_node_return_query(driver.provider)
         + """
         ORDER BY i.score DESC
@@ -743,10 +739,9 @@ async def node_similarity_search(

     else:
         query = (
-            RUNTIME_QUERY
-            + """
-        MATCH (n:Entity)
+            """
+            MATCH (n:Entity)
             """
             + filter_query
             + """
             WITH n, """
@@ -1051,10 +1046,9 @@ async def community_similarity_search(

     if driver.provider == GraphProvider.NEPTUNE:
         query = (
-            RUNTIME_QUERY
-            + """
-        MATCH (n:Community)
+            """
+            MATCH (n:Community)
             """
             + group_filter_query
             + """
             RETURN DISTINCT id(n) as id, n.name_embedding as embedding
@@ -1112,10 +1106,9 @@ async def community_similarity_search(
         search_vector_var = f'CAST($search_vector AS FLOAT[{len(search_vector)}])'

         query = (
-            RUNTIME_QUERY
-            + """
-        MATCH (c:Community)
+            """
+            MATCH (c:Community)
             """
             + group_filter_query
             + """
             WITH c,
@@ -1256,11 +1249,10 @@ async def get_relevant_nodes(

     # FIXME: Kuzu currently does not support using variables such as `node.fulltext_query` as an input to FTS, which means `get_relevant_nodes()` won't work with Kuzu as the graph driver.
     query = (
-        RUNTIME_QUERY
-        + """
-    UNWIND $nodes AS node
-    MATCH (n:Entity {group_id: $group_id})
+        """
+        UNWIND $nodes AS node
+        MATCH (n:Entity {group_id: $group_id})
         """
         + filter_query
         + """
         WITH node, n, """
@@ -1304,11 +1296,10 @@ async def get_relevant_nodes(
         )
     else:
         query = (
-            RUNTIME_QUERY
-            + """
-        UNWIND $nodes AS node
-        MATCH (n:Entity {group_id: $group_id})
+            """
+            UNWIND $nodes AS node
+            MATCH (n:Entity {group_id: $group_id})
             """
             + filter_query
             + """
             WITH node, n, """
@@ -1396,11 +1387,10 @@ async def get_relevant_edges(

     if driver.provider == GraphProvider.NEPTUNE:
         query = (
-            RUNTIME_QUERY
-            + """
-        UNWIND $edges AS edge
-        MATCH (n:Entity {uuid: edge.source_node_uuid})-[e:RELATES_TO {group_id: edge.group_id}]-(m:Entity {uuid: edge.target_node_uuid})
+            """
+            UNWIND $edges AS edge
+            MATCH (n:Entity {uuid: edge.source_node_uuid})-[e:RELATES_TO {group_id: edge.group_id}]-(m:Entity {uuid: edge.target_node_uuid})
             """
             + filter_query
             + """
             WITH e, edge
@@ -1469,11 +1459,10 @@ async def get_relevant_edges(
             return []

         query = (
-            RUNTIME_QUERY
-            + """
-        UNWIND $edges AS edge
-        MATCH (n:Entity {uuid: edge.source_node_uuid})-[:RELATES_TO]-(e:RelatesToNode_ {group_id: edge.group_id})-[:RELATES_TO]-(m:Entity {uuid: edge.target_node_uuid})
+            """
+            UNWIND $edges AS edge
+            MATCH (n:Entity {uuid: edge.source_node_uuid})-[:RELATES_TO]-(e:RelatesToNode_ {group_id: edge.group_id})-[:RELATES_TO]-(m:Entity {uuid: edge.target_node_uuid})
             """
             + filter_query
             + """
             WITH e, edge, n, m, """
@@ -1508,11 +1497,10 @@ async def get_relevant_edges(
         )
     else:
         query = (
-            RUNTIME_QUERY
-            + """
-        UNWIND $edges AS edge
-        MATCH (n:Entity {uuid: edge.source_node_uuid})-[e:RELATES_TO {group_id: edge.group_id}]-(m:Entity {uuid: edge.target_node_uuid})
+            """
+            UNWIND $edges AS edge
+            MATCH (n:Entity {uuid: edge.source_node_uuid})-[e:RELATES_TO {group_id: edge.group_id}]-(m:Entity {uuid: edge.target_node_uuid})
             """
             + filter_query
             + """
             WITH e, edge, """
@@ -1584,12 +1572,11 @@ async def get_edge_invalidation_candidates(

     if driver.provider == GraphProvider.NEPTUNE:
         query = (
-            RUNTIME_QUERY
-            + """
-        UNWIND $edges AS edge
-        MATCH (n:Entity)-[e:RELATES_TO {group_id: edge.group_id}]->(m:Entity)
-        WHERE n.uuid IN [edge.source_node_uuid, edge.target_node_uuid] OR m.uuid IN [edge.target_node_uuid, edge.source_node_uuid]
+            """
+            UNWIND $edges AS edge
+            MATCH (n:Entity)-[e:RELATES_TO {group_id: edge.group_id}]->(m:Entity)
+            WHERE n.uuid IN [edge.source_node_uuid, edge.target_node_uuid] OR m.uuid IN [edge.target_node_uuid, edge.source_node_uuid]
             """
             + filter_query
             + """
             WITH e, edge
@@ -1658,12 +1645,11 @@ async def get_edge_invalidation_candidates(
             return []

         query = (
-            RUNTIME_QUERY
-            + """
-        UNWIND $edges AS edge
-        MATCH (n:Entity)-[:RELATES_TO]->(e:RelatesToNode_ {group_id: edge.group_id})-[:RELATES_TO]->(m:Entity)
-        WHERE (n.uuid IN [edge.source_node_uuid, edge.target_node_uuid] OR m.uuid IN [edge.target_node_uuid, edge.source_node_uuid])
+            """
+            UNWIND $edges AS edge
+            MATCH (n:Entity)-[:RELATES_TO]->(e:RelatesToNode_ {group_id: edge.group_id})-[:RELATES_TO]->(m:Entity)
+            WHERE (n.uuid IN [edge.source_node_uuid, edge.target_node_uuid] OR m.uuid IN [edge.target_node_uuid, edge.source_node_uuid])
             """
             + filter_query
             + """
             WITH edge, e, n, m, """
@@ -1698,12 +1684,11 @@ async def get_edge_invalidation_candidates(
         )
     else:
         query = (
-            RUNTIME_QUERY
-            + """
-        UNWIND $edges AS edge
-        MATCH (n:Entity)-[e:RELATES_TO {group_id: edge.group_id}]->(m:Entity)
-        WHERE n.uuid IN [edge.source_node_uuid, edge.target_node_uuid] OR m.uuid IN [edge.target_node_uuid, edge.source_node_uuid]
+            """
+            UNWIND $edges AS edge
+            MATCH (n:Entity)-[e:RELATES_TO {group_id: edge.group_id}]->(m:Entity)
+            WHERE n.uuid IN [edge.source_node_uuid, edge.target_node_uuid] OR m.uuid IN [edge.target_node_uuid, edge.source_node_uuid]
             """
             + filter_query
             + """
             WITH edge, e, """

graphiti_core/utils/maintenance/graph_data_operations.py
@@ -149,9 +149,9 @@ async def retrieve_episodes(

     query: LiteralString = (
         """
-    MATCH (e:Episodic)
-    WHERE e.valid_at <= $reference_time
-    """
+        MATCH (e:Episodic)
+        WHERE e.valid_at <= $reference_time
+        """
         + query_filter
         + """
         RETURN
@@ -180,41 +180,39 @@ async def retrieve_episodes(

 async def build_dynamic_indexes(driver: GraphDriver, group_id: str):
     # Make sure indices exist for this group_id in Neo4j
     if driver.provider == GraphProvider.NEO4J:
-        await semaphore_gather(
-            driver.execute_query(
-                """CREATE FULLTEXT INDEX $episode_content IF NOT EXISTS
-                FOR (e:"""
-                + 'Episodic_'
-                + group_id.replace('-', '')
-                + """) ON EACH [e.content, e.source, e.source_description, e.group_id]""",
-                episode_content='episode_content_' + group_id.replace('-', ''),
-            ),
-            driver.execute_query(
-                """CREATE FULLTEXT INDEX $node_name_and_summary IF NOT EXISTS FOR (n:"""
-                + 'Entity_'
-                + group_id.replace('-', '')
-                + """) ON EACH [n.name, n.summary, n.group_id]""",
-                node_name_and_summary='node_name_and_summary_' + group_id.replace('-', ''),
-            ),
-            driver.execute_query(
-                """CREATE FULLTEXT INDEX $community_name IF NOT EXISTS
-                FOR (n:"""
-                + 'Community_'
-                + group_id.replace('-', '')
-                + """) ON EACH [n.name, n.group_id]""",
-                community_name='Community_' + group_id.replace('-', ''),
-            ),
-            driver.execute_query(
-                """CREATE VECTOR INDEX $group_entity_vector IF NOT EXISTS
-                FOR (n:"""
-                + 'Entity_'
-                + group_id.replace('-', '')
-                + """)
+        await driver.execute_query(
+            """CREATE FULLTEXT INDEX $episode_content IF NOT EXISTS
+            FOR (e:"""
+            + 'Episodic_'
+            + group_id.replace('-', '')
+            + """) ON EACH [e.content, e.source, e.source_description, e.group_id]""",
+            episode_content='episode_content_' + group_id.replace('-', ''),
+        )
+        await driver.execute_query(
+            """CREATE FULLTEXT INDEX $node_name_and_summary IF NOT EXISTS FOR (n:"""
+            + 'Entity_'
+            + group_id.replace('-', '')
+            + """) ON EACH [n.name, n.summary, n.group_id]""",
+            node_name_and_summary='node_name_and_summary_' + group_id.replace('-', ''),
+        )
+        await driver.execute_query(
+            """CREATE FULLTEXT INDEX $community_name IF NOT EXISTS
+            FOR (n:"""
+            + 'Community_'
+            + group_id.replace('-', '')
+            + """) ON EACH [n.name, n.group_id]""",
+            community_name='Community_' + group_id.replace('-', ''),
+        )
+        await driver.execute_query(
+            """CREATE VECTOR INDEX $group_entity_vector IF NOT EXISTS
+            FOR (n:"""
+            + 'Entity_'
+            + group_id.replace('-', '')
+            + """)
             ON n.embedding
             OPTIONS { indexConfig: {
             `vector.dimensions`: 1024,
             `vector.similarity_function`: 'cosine'
             }}""",
-                group_entity_vector='group_entity_vector_' + group_id.replace('-', ''),
-            ),
+            group_entity_vector='group_entity_vector_' + group_id.replace('-', ''),
+        )
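The final hunk is the behavior change the commit title names: index creation moves from a bounded-concurrency `semaphore_gather` to strictly sequential awaits. A minimal sketch of the two shapes; `make_index` is a hypothetical stand-in for `driver.execute_query`, and this `semaphore_gather` is a plain `asyncio.Semaphore` wrapper with the same shape as the helper the old code imported:

```python
import asyncio
from collections.abc import Coroutine
from typing import Any


async def semaphore_gather(*coroutines: Coroutine[Any, Any, Any], limit: int = 10) -> list[Any]:
    # Bounded-concurrency gather: at most `limit` coroutines run at once.
    semaphore = asyncio.Semaphore(limit)

    async def run(coroutine: Coroutine[Any, Any, Any]) -> Any:
        async with semaphore:
            return await coroutine

    return await asyncio.gather(*(run(c) for c in coroutines))


async def make_index(name: str) -> str:
    # Hypothetical stand-in for driver.execute_query('CREATE ... INDEX ...').
    await asyncio.sleep(0.1)
    return name


async def old_shape() -> None:
    # Before this commit: all index-creation statements in flight concurrently.
    await semaphore_gather(make_index('episode_content'), make_index('node_name_and_summary'))


async def new_shape() -> None:
    # After this commit: one statement at a time.
    await make_index('episode_content')
    await make_index('node_name_and_summary')


asyncio.run(old_shape())
asyncio.run(new_shape())
```

Sequential creation trades a little setup latency for never issuing concurrent schema statements against the database; the diff itself does not state the motivation, so that reading is inferred from the commit title only.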