<!-- .github/pull_request_template.md -->
## Description
Resolve the Cypher search issue by encoding the value returned from the
Cypher query into JSON-compatible data with FastAPI's `jsonable_encoder`.
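
The kind of failure this addresses can be illustrated with the standard library alone. The sketch below is not FastAPI's `jsonable_encoder`: `to_jsonable` is a hypothetical, simplified stand-in, and it assumes that raw query results may carry values such as `UUID` or `datetime`, which `json.dumps` rejects until they are converted to plain types.

```python
import json
from datetime import datetime
from typing import Any
from uuid import UUID


def to_jsonable(value: Any) -> Any:
    """Hypothetical stand-in for a JSON pre-encoding step (not FastAPI's implementation)."""
    if isinstance(value, dict):
        return {str(key): to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [to_jsonable(item) for item in value]
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, UUID):
        return str(value)
    return value


# Assumed shape of one raw row; the values are borrowed from the example output in this PR.
raw_row = {
    "id": UUID("87372381-a9fe-5b82-9c92-3f5dbab1bc35"),
    "type": "DocumentChunk",
    "created_at": datetime(2025, 11, 5, 14, 12, 46, 707597),
}

try:
    json.dumps(raw_row)
    raw_is_serializable = True
except TypeError:
    # UUID and datetime are not JSON-native types.
    raw_is_serializable = False

# After the pre-encoding pass, the row serializes without a custom encoder.
encoded_row = to_jsonable(raw_row)
payload = json.dumps(encoded_row)
```

In the actual change, this conversion is delegated to `jsonable_encoder` rather than a hand-written helper.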
## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):
## Screenshots/Videos (if applicable)
Example result of a Cypher search on the Simple example, using the query
`MATCH (src)-[rel]->(nbr) RETURN src, rel`:
```
{
"search_result":[
[
[
{
"_id":{
"offset":0,
"table":0
},
"_label":"Node",
"id":"87372381-a9fe-5b82-9c92-3f5dbab1bc35",
"name":"",
"type":"DocumentChunk",
"created_at":"2025-11-05T14:12:46.707597",
"updated_at":"2025-11-05T14:12:54.801747",
"properties":"{\"created_at\": 1762351945009, \"updated_at\": 1762351945009, \"ontology_valid\": false, \"version\": 1, \"topological_rank\": 0, \"metadata\": {\"index_fields\": [\"text\"]}, \"belongs_to_set\": null, \"text\": \"\\n Natural language processing (NLP) is an interdisciplinary\\n subfield of computer science and information retrieval.\\n \", \"chunk_size\": 48, \"chunk_index\": 0, \"cut_type\": \"paragraph_end\"}"
},
{
"_src":{
"offset":0,
"table":0
},
"_dst":{
"offset":1,
"table":0
},
"_label":"EDGE",
"_id":{
"offset":0,
"table":1
},
"relationship_name":"contains",
"created_at":"2025-11-05T14:12:47.217590",
"updated_at":"2025-11-05T14:12:55.193003",
"properties":"{\"source_node_id\": \"87372381-a9fe-5b82-9c92-3f5dbab1bc35\", \"target_node_id\": \"bc338a39-64d6-549a-acec-da60846dd90d\", \"relationship_name\": \"contains\", \"updated_at\": \"2025-11-05 14:12:54\", \"relationship_type\": \"contains\", \"edge_text\": \"relationship_name: contains; entity_name: natural language processing (nlp); entity_description: An interdisciplinary subfield of computer science and information retrieval concerned with interactions between computers and human (natural) languages.\"}"
}
]
]
],
"dataset_id":"UUID(""af4b1c1c-90fc-59b7-952c-1da9bbde370c"")",
"dataset_name":"main_dataset",
"graphs":"None"
}
```
Relates to https://github.com/topoteretes/cognee/pull/1725
Issue: https://github.com/topoteretes/cognee/issues/1723
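
In the example result above, `properties` is itself a JSON-encoded string rather than a nested object, so a consumer has to decode it a second time. A minimal sketch, using a shortened copy of the node from the example (the variable names are illustrative only):

```python
import json

# Shortened copy of the node from the example result; only a few keys are kept.
node = {
    "id": "87372381-a9fe-5b82-9c92-3f5dbab1bc35",
    "type": "DocumentChunk",
    "properties": '{"chunk_size": 48, "chunk_index": 0, "cut_type": "paragraph_end"}',
}

# The field is a string, so it needs its own json.loads call.
properties = json.loads(node["properties"])
print(properties["chunk_size"], properties["cut_type"])
```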
## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

from typing import Any, Optional
from fastapi.encoders import jsonable_encoder

from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.modules.retrieval.base_retriever import BaseRetriever
from cognee.modules.retrieval.utils.completion import generate_completion
from cognee.modules.retrieval.exceptions import SearchTypeNotSupported, CypherSearchError
from cognee.shared.logging_utils import get_logger

logger = get_logger("CypherSearchRetriever")


class CypherSearchRetriever(BaseRetriever):
    """
    Retriever for handling cypher-based search.

    Public methods include:
    - get_context: Retrieves relevant context using a cypher query.
    - get_completion: Returns the graph connections context.
    """

    def __init__(
        self,
        user_prompt_path: str = "context_for_question.txt",
        system_prompt_path: str = "answer_simple_question.txt",
    ):
        """Initialize retriever with optional custom prompt paths."""
        self.user_prompt_path = user_prompt_path
        self.system_prompt_path = system_prompt_path

    async def get_context(self, query: str) -> Any:
        """
        Retrieves relevant context using a cypher query.

        If any error occurs during execution, logs the error and raises CypherSearchError.

        Parameters:
        -----------

            - query (str): The cypher query used to retrieve context.

        Returns:
        --------

            - Any: The result of the cypher query execution.
        """
        try:
            graph_engine = await get_graph_engine()
            is_empty = await graph_engine.is_empty()

            if is_empty:
                logger.warning("Search attempt on an empty knowledge graph")
                return []

            result = jsonable_encoder(await graph_engine.query(query))
        except Exception as e:
            logger.error("Failed to execute cypher search retrieval: %s", str(e))
            raise CypherSearchError() from e
        return result

    async def get_completion(
        self, query: str, context: Optional[Any] = None, session_id: Optional[str] = None
    ) -> Any:
        """
        Returns the graph connections context.

        If no context is provided, it retrieves the context using the specified query.

        Parameters:
        -----------

            - query (str): The query to retrieve context.
            - context (Optional[Any]): Optional context to use, otherwise fetched using the
              query. (default None)
            - session_id (Optional[str]): Optional session identifier for caching. If None,
              defaults to 'default_session'. (default None)

        Returns:
        --------

            - Any: The context, either provided or retrieved.
        """
        if context is None:
            context = await self.get_context(query)
        return context
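
The control flow of `get_context` (return early on an empty graph, wrap any failure in `CypherSearchError`) can be tried out without a graph database. The sketch below is only an approximation under stated assumptions: `FakeGraphEngine` is a hypothetical in-memory stub, `CypherSearchError` is redefined locally so that cognee is not required, and logging and JSON encoding are left out.

```python
import asyncio
from typing import Any, List


class CypherSearchError(Exception):
    """Local stand-in for cognee's CypherSearchError, so the sketch runs without cognee."""


class FakeGraphEngine:
    """Hypothetical in-memory engine exposing only the two methods get_context calls."""

    def __init__(self, rows: List[Any], fail: bool = False):
        self.rows = rows
        self.fail = fail

    async def is_empty(self) -> bool:
        return not self.rows

    async def query(self, query: str) -> List[Any]:
        if self.fail:
            raise RuntimeError("syntax error in query")
        return self.rows


async def get_context(engine: FakeGraphEngine, query: str) -> Any:
    """Mirrors the control flow of get_context, minus logging and JSON encoding."""
    try:
        if await engine.is_empty():
            return []
        result = await engine.query(query)
    except Exception as e:
        # Chain the original exception so the root cause stays inspectable.
        raise CypherSearchError() from e
    return result


async def demo() -> dict:
    outcomes = {}
    outcomes["empty"] = await get_context(FakeGraphEngine([]), "MATCH (n) RETURN n")
    outcomes["rows"] = await get_context(FakeGraphEngine([["a", "b"]]), "MATCH (n) RETURN n")
    try:
        await get_context(FakeGraphEngine([["a"]], fail=True), "MATCH (n RETURN n")
    except CypherSearchError as error:
        outcomes["cause"] = type(error.__cause__).__name__
    return outcomes


outcomes = asyncio.run(demo())
```

Because the error is raised with `from e`, callers that catch `CypherSearchError` can still reach the underlying exception through `__cause__`.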