Compare commits


10 commits

Author SHA1 Message Date
hajdul88
1ab57707cc Adds separated node and edge sequential extraction 2025-06-03 15:36:17 +02:00
hajdul88
1fc381e51b removes graph metrics from demos 2025-06-03 15:34:54 +02:00
hajdul88
dca58ff97b renames parallell extraction 2025-06-03 15:33:51 +02:00
hajdul88
199e997f93 Adds multi sequential node and edge prompts 2025-06-03 15:33:15 +02:00
hajdul88
a22759e260 Adds KG sequential multi-extraction 2025-06-03 11:49:15 +02:00
hajdul88
d901f0a43a Separating graph prompts 2025-06-03 11:03:16 +02:00
hajdul88
ba0ad38863 saves multiround KG creation 2025-06-02 11:15:49 +02:00
hajdul88
fa5c0b8e75 deleting edge merging prompts 2025-05-29 16:23:53 +02:00
hajdul88
987b03b895 feat: adds multi parallel node extraction 2025-05-29 16:17:59 +02:00
hajdul88
e69ab1fe1d parallel extraction 2025-05-28 17:29:41 +02:00
19 changed files with 510 additions and 24 deletions

@@ -0,0 +1,6 @@
You are an expert in relationship identification and knowledge graph building, focusing on relationships. Your task is to perform a detailed extraction of relationship names from the text.
• Extract all relationship names from explicit phrases, verbs, and implied context that could help form edge triplets.
• Use the potential nodes and reassign them to relationship names if they correspond to a relation, verb, action, or similar.
• Ensure completeness by working in multiple rounds, capturing overlooked connections and refining the nodes list.
• Focus on meaningful entities and relationships, whether directly stated or implied.
• Return two lists: refined nodes and potential relationship names (for forming edges).

@@ -0,0 +1,15 @@
Analyze the following text to identify relationships between entities in the knowledge graph.
Build upon previously extracted edges, ensuring completeness and consistency.
Return all the previously extracted edges **together** with the new ones that you extracted.
This is round {{ round_number }} of {{ total_rounds }}.
**Text:**
{{ text }}
**Previously Extracted Nodes:**
{{ nodes }}
**Relationships Identified in Previous Rounds:**
{{ relationships }}
Extract both explicit and implicit relationships between the nodes, building upon previous findings while ensuring completeness and consistency.
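The placeholders in the template above ({{ round_number }}, {{ total_rounds }}, {{ text }}, {{ nodes }}, {{ relationships }}) are filled in at render time. The repo presumably does this through its `render_prompt` helper with Jinja-style templating; the stand-in substitution below is a minimal illustration only, not the actual implementation:

```python
# Minimal sketch of filling the template placeholders above.
# The real pipeline uses render_prompt; this naive substitution is illustrative.
TEMPLATE = (
    "This is round {{ round_number }} of {{ total_rounds }}.\n"
    "**Text:**\n{{ text }}\n"
    "**Previously Extracted Nodes:**\n{{ nodes }}\n"
    "**Relationships Identified in Previous Rounds:**\n{{ relationships }}"
)


def render(template: str, context: dict) -> str:
    # Replace each "{{ key }}" placeholder with its string value.
    for key, value in context.items():
        template = template.replace("{{ " + key + " }}", str(value))
    return template


prompt = render(
    TEMPLATE,
    {
        "round_number": 2,
        "total_rounds": 3,
        "text": "Ada Lovelace worked with Charles Babbage.",
        "nodes": '["ada lovelace", "charles babbage"]',
        "relationships": '["worked_with"]',
    },
)
print(prompt.splitlines()[0])  # → This is round 2 of 3.
```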

@@ -0,0 +1,22 @@
You are a top-tier edge-extraction algorithm. Every user prompt will contain two clearly marked sections:
<TEXT>
<the source text to analyze>
</TEXT>
and
<ENTITIES>
<Entities with their id, name and description>
</ENTITIES>
# 1. Reference Provided Entities
- Only extract edges between the IDs listed under <ENTITIES>.
- Do not invent new nodes; every edge's subject and object must match one of the provided IDs.
# 2. Relation Identification
- Inspect the TEXT to find explicit or implicit relationships between the provided entities.
- Use snake_case for relation names (e.g. works_for, located_in, married_to).
- Only create an edge when the text clearly signals a connection.
- The two endpoints of an edge cannot be the same entity.
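The rules above (endpoints must be provided IDs, no self-loops, snake_case relation names) are also mechanically checkable on the model's output. A hedged post-validation sketch, with hypothetical edge tuples rather than the repo's actual Edge model:

```python
import re


def validate_edges(edges, entity_ids):
    """Keep only (source, relation, target) tuples that satisfy the rules above.
    Hypothetical helper; the tuple shape is an assumption, not the repo's schema."""
    snake_case = re.compile(r"[a-z]+(_[a-z]+)*")
    valid = []
    for source, relation, target in edges:
        if source not in entity_ids or target not in entity_ids:
            continue  # rule 1: no invented nodes
        if source == target:
            continue  # rule 2: no self-loops
        if not snake_case.fullmatch(relation):
            continue  # rule 2: snake_case relation names only
        valid.append((source, relation, target))
    return valid


edges = [
    ("e1", "works_for", "e2"),
    ("e1", "WorksFor", "e2"),  # rejected: not snake_case
    ("e1", "knows", "e1"),     # rejected: self-loop
    ("e1", "knows", "e9"),     # rejected: unknown entity ID
]
print(validate_edges(edges, {"e1", "e2"}))  # → [('e1', 'works_for', 'e2')]
```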

@@ -0,0 +1,7 @@
<TEXT>
`{{text}}`
</TEXT>
<ENTITIES>
`{{final_nodes}}`
</ENTITIES>

@@ -0,0 +1,41 @@
You are a top-tier algorithm designed for extracting information in structured formats to build a knowledge graph.
**Nodes** represent entities and concepts. They're akin to Wikipedia nodes.
**Edges** represent relationships between concepts. They're akin to Wikipedia links.
You get the text and an already identified knowledge graph (can be empty) in the following format:
<TEXT>
<text to extract the graph from>
</TEXT>
and
<KNOWLEDGEGRAPH>
'nodes': <list of nodes>
'edges': <list of edges>
</KNOWLEDGEGRAPH>
Your task is to extract additional nodes and edges and return the new knowledge graph including the already identified nodes and edges.
The aim is to achieve simplicity and clarity in the knowledge graph.
# 1. Labeling Nodes
**Consistency**: Ensure you use basic or elementary types for node labels.
- For example, when you identify an entity representing a person, always label it as **"Person"**.
- Avoid using more specific terms like "Mathematician" or "Scientist", keep those as "profession" property.
- Don't use too generic terms like "Entity".
**Node IDs**: Never utilize integers as node IDs.
- Node IDs should be names or human-readable identifiers found in the text.
# 2. Handling Numerical Data and Dates
- For example, when you identify an entity representing a date, make sure it has type **"Date"**.
- Extract the date in the format "YYYY-MM-DD".
- If not possible to extract the whole date, extract month or year, or both if available.
- **Property Format**: Properties must be in a key-value format.
- **Quotation Marks**: Never use escaped single or double quotes within property values.
- **Naming Convention**: Use snake_case for relationship names, e.g., `acted_in`.
# 3. Coreference Resolution
- **Maintain Entity Consistency**: When extracting entities, it's vital to ensure consistency.
If an entity, such as "John Doe", is mentioned multiple times in the text but is referred to by different names or pronouns (e.g., "Joe", "he"),
always use the most complete identifier for that entity throughout the knowledge graph. In this example, use "John Doe" as the Person's ID.
Remember, the knowledge graph should be coherent and easily understandable, so maintaining consistency in entity references is crucial.
# 4. Strict Compliance
Adhere to the rules strictly. Non-compliance will result in termination.
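The labeling rules above can be illustrated with a tiny sketch. The field names here are illustrative assumptions, not the repo's actual schema: the point is the convention of a generic label with specifics pushed into properties, and human-readable IDs:

```python
# Hedged sketch of the labeling convention: generic label, specifics as properties,
# human-readable node IDs (never integers). Field names are assumptions.
node = {
    "id": "Carl Friedrich Gauss",  # identifier found in the text
    "label": "Person",             # basic type, not "Mathematician"
    "properties": {"profession": "mathematician"},
}
date_node = {"id": "1777-04-30", "label": "Date"}  # date typed as "Date", YYYY-MM-DD
print(node["label"], date_node["id"])  # → Person 1777-04-30
```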

@@ -0,0 +1,9 @@
<TEXT>
`{{ text }}`
</TEXT>
and
<KNOWLEDGEGRAPH>
`{{ graph }}`
</KNOWLEDGEGRAPH>

@@ -0,0 +1,15 @@
You are an assistant who *merges duplicate entities and their types* in a knowledge graph.
You will receive the list of extracted entities from a text.
Some of these refer to the same real-world entity but differ only in casing, minor typos, or partial information (for example, `"John Doe"` vs `"john_doe"` vs `"John_Doe"`).
There can also be synonyms present in the list.
Entities are duplicates only if they represent the same concept or object, or they are synonyms of each other.
**Task**
Detect duplicates.
Deduplicate them, producing the final list of entities with no remaining duplicates.
- Merge type information among the entities. Duplicated entity types are not allowed.
- Each type must be singular (for example, `skill` instead of `skills`). Also merge synonyms in the case of types.
- Map synonym entity types to the most general type, reducing multiple formats of the same type in a global knowledge graph.
- Filter out entities that represent more than one real-world concept (for example: car, motorbike).
- Return the final list of nodes.
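Casing and underscore variants can be collapsed cheaply in plain code before the model is asked to merge true synonyms, which keeps the LLM call focused on the semantic part of the task. A minimal sketch of that string-level pre-deduplication:

```python
def normalize(name: str) -> str:
    # Collapse casing/underscore variants so "John Doe", "john_doe"
    # and "John_Doe" all map to one canonical key.
    return name.lower().replace("_", " ").strip()


def dedupe(entities):
    seen, merged = set(), []
    for entity in entities:
        key = normalize(entity)
        if key not in seen:  # keep first occurrence of each canonical form
            seen.add(key)
            merged.append(key)
    return merged


print(dedupe(["John Doe", "john_doe", "John_Doe", "Jane Roe"]))
# → ['john doe', 'jane roe']
```

Only the surviving synonym candidates would then be sent to the merge prompt above.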

@@ -0,0 +1,4 @@
<ENTITIES>
`{{nodes_to_deduplicate}}`
</ENTITIES>

@@ -0,0 +1,26 @@
You are a top-tier algorithm designed for extracting information in structured formats to build a knowledge graph.
**Nodes** represent entities and concepts. They're akin to Wikipedia nodes.
The aim is to achieve simplicity and clarity in the knowledge graph.
# 1. Labeling Nodes
**Consistency**: Ensure you use basic or elementary types for node labels.
- For example, when you identify an entity representing a person, always label it as **"Person"**.
- Avoid using more specific terms like "Mathematician" or "Scientist", keep those as "profession" property.
- Don't use too generic terms like "Entity".
**Node IDs**: Never utilize integers as node IDs.
- Node IDs should be names or human-readable identifiers found in the text.
# 2. Handling Numerical Data and Dates
- For example, when you identify an entity representing a date, make sure it has type **"Date"**.
- Allowed formats are "YYYY", "YYYY-MM" or "YYYY-MM-DD". Extract each date in the format in which it appears in the text, and extract each date only once.
- If the date in the text represents a period, extract the start and end date of the period separately.
- If not possible to extract the whole date, extract month or year, or both if available.
- **Property Format**: Properties must be in a key-value format.
- **Quotation Marks**: Never use escaped single or double quotes within property values.
- **Naming Convention**: Use snake_case for relationship names, e.g., `acted_in`.
# 3. Coreference Resolution
- **Maintain Entity Consistency**: When extracting entities, it's vital to ensure consistency.
If an entity, such as "John Doe", is mentioned multiple times in the text but is referred to by different names or pronouns (e.g., "Joe", "he"),
always use the most complete identifier for that entity throughout the knowledge graph. In this example, use "John Doe" as the Person's ID.
Remember, the knowledge graph should be coherent and easily understandable, so maintaining consistency in entity references is crucial.
# 4. Strict Compliance
Adhere to the rules strictly. Non-compliance will result in termination.

@@ -0,0 +1,9 @@
You are an expert in entity extraction and knowledge graph building, focusing on node identification.
Your task is to perform a detailed entity and concept extraction from text to generate a list of potential nodes for a knowledge graph.
• Node IDs should be names or human-readable identifiers found in the text.
• Extract clear, distinct entities and concepts as individual strings.
• Be exhaustive: ensure completeness by capturing all entities, names, nouns, noun parts, and implied or implicit mentions.
• Also extract potential entity type nodes, directly mentioned or implied.
• Avoid duplicates and overly generic terms.
• Consider different perspectives and indirect references.
• Return only a list of unique node strings with all the entities.

@@ -0,0 +1,10 @@
Extract distinct entities and concepts from the following text to expand the knowledge graph.
Build upon previously extracted entities, ensuring completeness and consistency.
Return all the previously extracted entities **together** with the new ones that you extracted.
This is round {{ round_number }} of {{ total_rounds }}.
**Text:**
{{ text }}
**Previously Extracted Entities:**
{{ nodes }}

@@ -1,9 +1,17 @@
 import os
-from typing import Type
+import asyncio
+import json
+from fileinput import filename
+from typing import Type, List, Tuple, Dict, Any, Set
+from langchain_experimental.graph_transformers.llm import system_prompt
 from pydantic import BaseModel
+from streamlit import context
 from cognee.infrastructure.llm.get_llm_client import get_llm_client
 from cognee.infrastructure.llm.prompts import render_prompt
 from cognee.infrastructure.llm.config import get_llm_config
+from cognee.shared.data_models import KnowledgeGraph, NodeList, EdgeList, Node, Edge
 async def extract_content_graph(content: str, response_model: Type[BaseModel]):
@@ -21,10 +29,124 @@ async def extract_content_graph(content: str, response_model: Type[BaseModel]):
     else:
         base_directory = None
-    system_prompt = render_prompt(prompt_path, {}, base_directory=base_directory)
+    system_prompt_graph = render_prompt(prompt_path, {}, base_directory=base_directory)
     content_graph = await llm_client.acreate_structured_output(
-        content, system_prompt, response_model
+        content, system_prompt_graph, response_model
     )
     return content_graph
def dedupe_and_normalize_nodes(nodes: List[Node]) -> List[Node]:
    seen: Set[Tuple[str, str]] = set()
    out: List[Node] = []
    for node in nodes:
        # Normalize casing and underscores before comparing.
        node.name = node.name.lower().replace("_", " ")
        node.type = node.type.lower().replace("_", " ")
        key = (node.name, node.type)
        if key not in seen:
            seen.add(key)
            out.append(node)
    return out
def dedupe_and_normalize_edges(edges: List[Edge]) -> List[Edge]:
    seen: Set[Tuple[str, str, str]] = set()
    out: List[Edge] = []
    for edge in edges:
        edge.relationship_name = edge.relationship_name.lower()
        key = (edge.source_node_id, edge.relationship_name, edge.target_node_id)
        if key not in seen:
            seen.add(key)
            out.append(edge)
    return out


async def extract_content_graph2(
    content: str, response_model: Type[BaseModel], node_rounds: int = 1, edge_rounds: int = 1
):
    llm_client = get_llm_client()

    ###### NODE EXTRACTION
    node_prompt_path = "node_extraction_prompt.txt"
    node_system = render_prompt(node_prompt_path, {})
    node_tasks = [
        llm_client.acreate_structured_output(content, node_system, NodeList)
        for _ in range(node_rounds)
    ]
    node_results = await asyncio.gather(*node_tasks)
    all_nodes: List[Node] = [node for nl in node_results for node in nl.nodes]

    ###### NODE DEDUPLICATION
    all_nodes = dedupe_and_normalize_nodes(all_nodes)
    all_nodes_merged = {
        "nodes_to_deduplicate": json.dumps([n.model_dump() for n in all_nodes], ensure_ascii=False)
    }
    merge_system_prompt = "merge_nodes_system_prompt.txt"
    merge_user_prompt = "merge_nodes_user_prompt.txt"
    merge_system = render_prompt(filename=merge_system_prompt, context={})
    merge_user = render_prompt(filename=merge_user_prompt, context=all_nodes_merged)
    final_nodes_list = await llm_client.acreate_structured_output(
        text_input=merge_user, system_prompt=merge_system, response_model=NodeList
    )

    ###### EDGE EXTRACTION
    edge_system_prompt = "edge_extraction_system_prompt.txt"
    edge_user_prompt = "edge_extraction_user_prompt.txt"
    edge_system = render_prompt(edge_system_prompt, {})
    nodes_for_edge_extraction = {
        "final_nodes": json.dumps(
            [n.model_dump() for n in final_nodes_list.nodes], ensure_ascii=False
        ),
        "text": content,
    }
    edge_user = render_prompt(edge_user_prompt, context=nodes_for_edge_extraction)
    edge_tasks = [
        llm_client.acreate_structured_output(
            text_input=edge_user, system_prompt=edge_system, response_model=EdgeList
        )
        for _ in range(edge_rounds)
    ]
    edge_results = await asyncio.gather(*edge_tasks)
    all_edges: List[Edge] = [edge for el in edge_results for edge in el.edges]

    ###### EDGE DEDUPLICATION
    all_edges = dedupe_and_normalize_edges(all_edges)
    all_edges_merged = {
        "edges_to_deduplicate": json.dumps([e.model_dump() for e in all_edges], ensure_ascii=False)
    }
    merge_system_prompt = "merge_edges_system_prompt.txt"
    merge_user_prompt = "merge_edges_user_prompt.txt"
    merge_system = render_prompt(filename=merge_system_prompt, context={})
    merge_user = render_prompt(filename=merge_user_prompt, context=all_edges_merged)
    final_edges_list = await llm_client.acreate_structured_output(
        text_input=merge_user, system_prompt=merge_system, response_model=EdgeList
    )

    return KnowledgeGraph(nodes=final_nodes_list.nodes, edges=final_edges_list.edges)
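The multi-round parallel pattern used above reduces to a small shape: launch identical extraction rounds concurrently with `asyncio.gather`, then deduplicate the union of their results. A toy sketch with a stubbed extractor standing in for the LLM client:

```python
import asyncio


async def fake_extract(content: str, round_id: int) -> list[str]:
    # Stand-in for one llm_client.acreate_structured_output call;
    # different rounds may return overlapping candidate sets.
    return [f"{content}:{i}" for i in range(round_id + 1)]


async def fan_out(content: str, rounds: int) -> list[str]:
    # Run all rounds concurrently, then merge while preserving first-seen order.
    results = await asyncio.gather(*[fake_extract(content, r) for r in range(rounds)])
    merged, seen = [], set()
    for batch in results:
        for item in batch:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


print(asyncio.run(fan_out("doc", 3)))  # → ['doc:0', 'doc:1', 'doc:2']
```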

@@ -0,0 +1,46 @@
import json
from typing import Type
from pydantic import BaseModel
from cognee.infrastructure.llm.get_llm_client import get_llm_client
from cognee.infrastructure.llm.prompts import render_prompt
from cognee.shared.data_models import KnowledgeGraph


async def extract_content_graph_sequential(
    content: str, response_model: Type[BaseModel], graph_extraction_rounds: int = 2
):
    llm_client = get_llm_client()
    graph_system_prompt_path = "generate_graph_prompt_sequential.txt"
    graph_user_prompt_path = "generate_graph_prompt_sequential_user.txt"
    graph_system = render_prompt(graph_system_prompt_path, {})

    current_nodes = []
    current_edges = []
    knowledge_graph = KnowledgeGraph(nodes=[], edges=[])

    for round_idx in range(graph_extraction_rounds):
        nodes_json = json.dumps([n.model_dump() for n in current_nodes], ensure_ascii=False)
        edges_json = json.dumps([e.model_dump() for e in current_edges], ensure_ascii=False)
        graph_user = render_prompt(
            graph_user_prompt_path,  # TODO: this could use some formatting due to HTML &#34; escape codes.
            {
                "text": content,
                "graph": f"nodes: {nodes_json}, edges: {edges_json}",
            },
        )
        knowledge_graph = await llm_client.acreate_structured_output(
            text_input=graph_user,
            system_prompt=graph_system,
            response_model=response_model,
        )
        current_nodes = knowledge_graph.nodes
        current_edges = knowledge_graph.edges

    return knowledge_graph

@@ -0,0 +1,81 @@
import asyncio
import json
from typing import List, Tuple, Set
from cognee.infrastructure.llm.get_llm_client import get_llm_client
from cognee.infrastructure.llm.prompts import render_prompt
from cognee.shared.data_models import KnowledgeGraph, NodeList, EdgeList, Node, Edge


def dedupe_and_normalize_nodes(nodes: List[Node]) -> List[Node]:
    seen: Set[Tuple[str, str]] = set()
    out: List[Node] = []
    for node in nodes:
        # Normalize casing and underscores before comparing.
        node.name = node.name.lower().replace("_", " ")
        node.type = node.type.lower().replace("_", " ")
        key = (node.name, node.type)
        if key not in seen:
            seen.add(key)
            out.append(node)
    return out


async def extract_content_node_edge_multi_parallel(content: str, node_rounds: int = 1):
    llm_client = get_llm_client()

    ###### NODE EXTRACTION
    node_prompt_path = "node_extraction_prompt.txt"
    node_system = render_prompt(node_prompt_path, {})
    node_tasks = [
        llm_client.acreate_structured_output(content, node_system, NodeList)
        for _ in range(node_rounds)
    ]
    node_results = await asyncio.gather(*node_tasks)
    all_nodes: List[Node] = [node for nl in node_results for node in nl.nodes]

    ###### NODE DEDUPLICATION
    all_nodes = dedupe_and_normalize_nodes(all_nodes)
    all_nodes_merged = {
        "nodes_to_deduplicate": json.dumps([n.model_dump() for n in all_nodes], ensure_ascii=False)
    }
    merge_system_prompt = "merge_nodes_system_prompt.txt"
    merge_user_prompt = "merge_nodes_user_prompt.txt"
    merge_system = render_prompt(filename=merge_system_prompt, context={})
    merge_user = render_prompt(filename=merge_user_prompt, context=all_nodes_merged)
    final_nodes_list = await llm_client.acreate_structured_output(
        text_input=merge_user, system_prompt=merge_system, response_model=NodeList
    )

    ###### EDGE EXTRACTION
    edge_system_prompt = "edge_extraction_system_prompt.txt"
    edge_user_prompt = "edge_extraction_user_prompt.txt"
    edge_system = render_prompt(edge_system_prompt, {})
    nodes_for_edge_extraction = {
        "final_nodes": json.dumps(
            [n.model_dump() for n in final_nodes_list.nodes], ensure_ascii=False
        ),
        "text": content,
    }
    edge_user = render_prompt(edge_user_prompt, context=nodes_for_edge_extraction)
    final_edges_list = await llm_client.acreate_structured_output(
        text_input=edge_user, system_prompt=edge_system, response_model=EdgeList
    )

    return KnowledgeGraph(nodes=final_nodes_list.nodes, edges=final_edges_list.edges)

@@ -0,0 +1,57 @@
import json
from cognee.infrastructure.llm.get_llm_client import get_llm_client
from cognee.infrastructure.llm.prompts import render_prompt
from cognee.shared.data_models import KnowledgeGraph, NodeList, EdgeList


async def extract_content_node_edge_multi_sequential(
    content: str, node_rounds: int = 2, edge_rounds: int = 2
):
    llm_client = get_llm_client()

    current_nodes = NodeList()
    for pass_idx in range(node_rounds):
        nodes_json = json.dumps([n.model_dump() for n in current_nodes.nodes], ensure_ascii=False)
        node_system = render_prompt("node_extraction_prompt_sequential.txt", {})
        node_user = render_prompt(
            "node_extraction_prompt_sequential_user.txt",
            {
                "text": content,
                "nodes": nodes_json,
                "total_rounds": node_rounds,
                "round_number": pass_idx + 1,
            },
        )
        current_nodes = await llm_client.acreate_structured_output(node_user, node_system, NodeList)
    final_nodes = current_nodes
    final_nodes_json = json.dumps([n.model_dump() for n in final_nodes.nodes], ensure_ascii=False)

    current_edges = EdgeList()
    for pass_idx in range(edge_rounds):
        edges_json = json.dumps([e.model_dump() for e in current_edges.edges], ensure_ascii=False)
        edges_system = render_prompt("edge_extraction_prompt_sequential.txt", {})
        edges_user = render_prompt(
            "edge_extraction_prompt_sequential_user.txt",
            {
                "text": content,
                "nodes": final_nodes_json,
                "relationships": edges_json,
                "total_rounds": edge_rounds,
                "round_number": pass_idx + 1,
            },
        )
        current_edges = await llm_client.acreate_structured_output(
            edges_user, edges_system, EdgeList
        )
    final_edges = current_edges

    return KnowledgeGraph(nodes=final_nodes.nodes, edges=final_edges.edges)
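The sequential strategy above is the mirror image of the parallel one: each round receives the accumulated results so far and is asked to return them together with anything new. A toy sketch of that refinement loop, with a stubbed round in place of the structured-output call:

```python
import asyncio


async def fake_round(content: str, previous: list[str], round_number: int) -> list[str]:
    # Stand-in for one structured-output call: the model is prompted to return
    # everything in `previous` plus anything new it finds this round.
    return previous + [f"entity_{round_number}"]


async def refine(content: str, rounds: int) -> list[str]:
    current: list[str] = []
    for round_number in range(1, rounds + 1):
        # Feed the previous round's output back in as context.
        current = await fake_round(content, current, round_number)
    return current


print(asyncio.run(refine("doc", 3)))  # → ['entity_1', 'entity_2', 'entity_3']
```

Unlike the parallel variant, each round here depends on the previous one, so latency grows linearly with the round count.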

@@ -46,9 +46,6 @@ else:
     name: str
     type: str
     description: str
-    properties: Optional[Dict[str, Any]] = Field(
-        None, description="A dictionary of properties associated with the node."
-    )
 class Edge(BaseModel):
     """Edge in a knowledge graph."""
@@ -56,9 +53,16 @@ else:
     source_node_id: str
     target_node_id: str
     relationship_name: str
-    properties: Optional[Dict[str, Any]] = Field(
-        None, description="A dictionary of properties associated with the edge."
-    )
+class NodeList(BaseModel):
+    """Nodes"""
+    nodes: List[Node] = Field(default_factory=list)
+class EdgeList(BaseModel):
+    """Edges"""
+    edges: List[Edge] = Field(default_factory=list)
 class KnowledgeGraph(BaseModel):
     """Knowledge graph."""
@@ -6,7 +6,22 @@ from pydantic import BaseModel
 from cognee.infrastructure.databases.graph import get_graph_engine
 from cognee.modules.ontology.rdf_xml.OntologyResolver import OntologyResolver
 from cognee.modules.chunking.models.DocumentChunk import DocumentChunk
-from cognee.modules.data.extraction.knowledge_graph import extract_content_graph
+from cognee.modules.data.extraction.knowledge_graph.extract_content_graph import (
+    extract_content_graph,
+)
+from cognee.modules.data.extraction.knowledge_graph.extract_content_node_edge_multi_parallel import (
+    extract_content_node_edge_multi_parallel,
+)
+from cognee.modules.data.extraction.knowledge_graph.extract_content_graph_sequential import (
+    extract_content_graph_sequential,
+)
+from cognee.modules.data.extraction.knowledge_graph.extract_content_node_edge_multi_sequential import (
+    extract_content_node_edge_multi_sequential,
+)
 from cognee.modules.graph.utils import (
     expand_with_nodes_and_edges,
     retrieve_existing_edges,
@@ -59,10 +74,17 @@ async def extract_graph_from_data(
     Extracts and integrates a knowledge graph from the text content of document chunks using a specified graph model.
     """
     chunk_graphs = await asyncio.gather(
-        *[extract_content_graph(chunk.text, graph_model) for chunk in data_chunks]
+        # *[extract_content_graph(chunk.text, graph_model) for chunk in data_chunks]
+        # *[extract_content_node_edge_multi_parallel(content=chunk.text, node_rounds=2) for chunk in data_chunks]
+        # *[extract_content_graph_sequential(content=chunk.text, response_model=graph_model, graph_extraction_rounds=2) for chunk in data_chunks]
+        *[
+            extract_content_node_edge_multi_sequential(
+                content=chunk.text, node_rounds=1, edge_rounds=1
+            )
+            for chunk in data_chunks
+        ]
     )
+    # Note: Filter edges with missing source or target nodes
     if graph_model == KnowledgeGraph:
         for graph in chunk_graphs:
             valid_node_ids = {node.id for node in graph.nodes}
@@ -71,7 +93,6 @@ async def extract_graph_from_data(
                 for edge in graph.edges
                 if edge.source_node_id in valid_node_ids and edge.target_node_id in valid_node_ids
             ]
-
     return await integrate_chunk_graphs(
         data_chunks, chunk_graphs, graph_model, ontology_adapter or OntologyResolver()
     )

@@ -180,14 +180,9 @@ async def main(enable_steps):
     # Step 3: Create knowledge graph
     if enable_steps.get("cognify"):
-        pipeline_run = await cognee.cognify()
+        await cognee.cognify()
         print("Knowledge graph created.")
-    # Step 4: Calculate descriptive metrics
-    if enable_steps.get("graph_metrics"):
-        await get_pipeline_run_metrics(pipeline_run, include_optional=True)
-        print("Descriptive graph metrics saved to database.")
     # Step 5: Query insights
     if enable_steps.get("retriever"):
         search_results = await cognee.search(

@@ -62,13 +62,9 @@ async def main():
         os.path.dirname(os.path.abspath(__file__)), "ontology_input_example/basic_ontology.owl"
     )
-    pipeline_run = await cognee.cognify(ontology_file_path=ontology_path)
+    await cognee.cognify(ontology_file_path=ontology_path)
     print("Knowledge with ontology created.")
-    # Step 4: Calculate descriptive metrics
-    await get_pipeline_run_metrics(pipeline_run, include_optional=True)
-    print("Descriptive graph metrics saved to database.")
     # Step 5: Query insights
     search_results = await cognee.search(
         query_type=SearchType.GRAPH_COMPLETION,