refactor: restructure examples and starter kit into new-examples (#1862)

<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [x] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
  * Deprecated legacy examples and added a migration guide mapping old paths to new locations
  * Added a comprehensive new-examples README detailing configurations, pipelines, demos, and migration notes

* **New Features**
  * Added many runnable examples and demos: database configs, embedding/LLM setups, permissions and access control, custom pipelines (organizational, product recommendation, code analysis, procurement), multimedia, visualization, temporal/ontology demos, and a local UI starter

* **Chores**
  * Updated CI/test entrypoints to use the new-examples layout

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Commit 5f8a3e24bd (parent 9b2b1a9c13), authored by Hande on 2025-12-20 02:07:28 +01:00 and committed by GitHub.
No known key found for this signature in database. GPG key ID: B5690EEEBB952194
56 changed files with 5961 additions and 0 deletions

@@ -1,3 +1,15 @@

# ⚠️ DEPRECATED - Go to `new-examples/` Instead

This starter kit is deprecated. Its examples have been integrated into the `/new-examples/` folder.

| Old Location | New Location |
|--------------|--------------|
| `src/pipelines/default.py` | none |
| `src/pipelines/low_level.py` | `new-examples/custom_pipelines/organizational_hierarchy/` |
| `src/pipelines/custom-model.py` | `new-examples/demos/custom_graph_model_entity_schema_definition.py` |
| `src/data/` | Included in `new-examples/custom_pipelines/organizational_hierarchy/data/` |

----------

# Cognee Starter Kit

Welcome to the <a href="https://github.com/topoteretes/cognee">cognee</a> Starter Repo! This repository is designed to help you get started quickly by providing a structured dataset and pre-built data pipelines using cognee to build powerful knowledge graphs.

examples/README.md (48 lines, new file)

@@ -0,0 +1,48 @@

# ⚠️ DEPRECATED - Go to `new-examples/` Instead

This folder is deprecated. All examples have been reorganized into `/new-examples/`.

## Migration Guide

| Old Location | New Location |
|--------------|--------------|
| `python/simple_example.py` | `new-examples/demos/simple_default_cognee_pipelines_example.py` |
| `python/cognee_simple_document_demo.py` | `new-examples/demos/simple_document_qa/` |
| `python/multimedia_example.py` | `new-examples/demos/multimedia_processing/` |
| `python/ontology_demo_example.py` | `new-examples/demos/ontology_reference_vocabulary/` |
| `python/ontology_demo_example_2.py` | `new-examples/demos/ontology_medical_comparison/` |
| `python/temporal_example.py` | `new-examples/demos/temporal_awareness_example.py` |
| `python/conversation_session_persistence_example.py` | `new-examples/demos/conversation_session_persistence_example.py` |
| `python/feedback_enrichment_minimal_example.py` | `new-examples/demos/feedback_enrichment_minimal_example.py` |
| `python/simple_node_set_example.py` | `new-examples/demos/nodeset_memory_grouping_with_tags_example.py` |
| `python/weighted_edges_example.py` | `new-examples/demos/weighted_edges_relationships_example.py` |
| `python/dynamic_multiple_edges_example.py` | `new-examples/demos/dynamic_multiple_weighted_edges_example.py` |
| `python/web_url_fetcher_example.py` | `new-examples/demos/web_url_content_ingestion_example.py` |
| `python/permissions_example.py` | `new-examples/configurations/permissions_example/` |
| `python/run_custom_pipeline_example.py` | `new-examples/custom_pipelines/custom_cognify_pipeline_example.py` |
| `python/dynamic_steps_example.py` | `new-examples/custom_pipelines/dynamic_steps_resume_analysis_hr_example.py` |
| `python/memify_coding_agent_example.py` | `new-examples/custom_pipelines/memify_coding_agent_rule_extraction_example.py` |
| `python/agentic_reasoning_procurement_example.py` | `new-examples/custom_pipelines/agentic_reasoning_procurement_example.py` |
| `python/code_graph_example.py` | `new-examples/custom_pipelines/code_graph_repository_analysis_example.py` |
| `python/relational_database_migration_example.py` | `new-examples/custom_pipelines/relational_database_to_knowledge_graph_migration_example.py` |
| `database_examples/chromadb_example.py` | `new-examples/configurations/database_examples/chromadb_vector_database_configuration.py` |
| `database_examples/kuzu_example.py` | `new-examples/configurations/database_examples/kuzu_graph_database_configuration.py` |
| `database_examples/neo4j_example.py` | `new-examples/configurations/database_examples/neo4j_graph_database_configuration.py` |
| `database_examples/neptune_analytics_example.py` | `new-examples/configurations/database_examples/neptune_analytics_aws_database_configuration.py` |
| `database_examples/pgvector_example.py` | `new-examples/configurations/database_examples/pgvector_postgres_vector_database_configuration.py` |
| `low_level/pipeline.py` | `new-examples/custom_pipelines/organizational_hierarchy/` |
| `low_level/product_recommendation.py` | `new-examples/custom_pipelines/product_recommendation/` |
| `start_ui_example.py` | `new-examples/demos/start_local_ui_frontend_example.py` |
| `relational_db_with_dlt/relational_db_and_dlt.py` | `new-examples/custom_pipelines/relational_database_to_knowledge_graph_migration_example.py` |

## Files NOT Migrated

| File | Reason |
|------|--------|
| `python/graphiti_example.py` | External Graphiti integration; not core Cognee |
| `python/weighted_graph_visualization.html` | Generated artifact, not source code |
| `python/artifacts/` | Output directory, not example code |
| `relational_db_with_dlt/fix_foreign_keys.sql` | SQL helper script, not standalone example |
| `python/ontology_input_example/` | Data files moved to ontology demo folders |
| `low_level/*.json` | Data files moved to respective pipeline folders |
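Scripts or docs that still reference the old example paths can be updated mechanically. A small lookup sketch (the path pairs are copied verbatim from the migration table above; the dictionary and helper function are ours, not part of this PR):

```python
# Map a few old example paths to their new-examples locations.
# Entries copied from the migration table; extend with the remaining rows as needed.
OLD_TO_NEW = {
    "python/simple_example.py": "new-examples/demos/simple_default_cognee_pipelines_example.py",
    "python/multimedia_example.py": "new-examples/demos/multimedia_processing/",
    "database_examples/neo4j_example.py": "new-examples/configurations/database_examples/neo4j_graph_database_configuration.py",
    "low_level/pipeline.py": "new-examples/custom_pipelines/organizational_hierarchy/",
}


def migrated_path(old_path: str) -> str:
    """Return the new location for an old example path; raises KeyError if unmapped."""
    return OLD_TO_NEW[old_path]


print(migrated_path("python/simple_example.py"))
# → new-examples/demos/simple_default_cognee_pipelines_example.py
```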

new-examples/README.md (66 lines, new file)

@@ -0,0 +1,66 @@

# Cognee Examples

## 📁 Structure

| Folder | Purpose |
|--------|---------|
| `configurations/` | Database, LLM, embedding, and permission setups |
| `custom_pipelines/` | Building custom memory pipelines |
| `demos/` | Feature demos and getting started examples |

## 🔧 Configurations

| Path | Description |
|------|-------------|
| `database_examples/chromadb_vector_database_configuration.py` | ChromaDB vector database |
| `database_examples/kuzu_graph_database_configuration.py` | KuzuDB graph database |
| `database_examples/neo4j_graph_database_configuration.py` | Neo4j graph database |
| `database_examples/neptune_analytics_aws_database_configuration.py` | AWS Neptune Analytics |
| `database_examples/pgvector_postgres_vector_database_configuration.py` | PostgreSQL with PGVector |
| `database_examples/s3_storage_configuration.py` | Amazon S3 storage |
| `llm_configurations/openai_setup.py` | OpenAI LLM setup |
| `llm_configurations/azure_openai_setup.py` | Azure OpenAI LLM setup |
| `embedding_configurations/openai_setup.py` | OpenAI embeddings |
| `embedding_configurations/azure_openai_setup.py` | Azure OpenAI embeddings |
| `structured_output_configurations.py/baml_setup.py` | BAML structured output |
| `structured_output_configurations.py/litellm_intructor_setup.py` | LiteLLM Instructor setup |
| `permissions_example/` | Multi-user access control (with sample PDF) |
| `distributed_execution_with_modal_example.py` | Scale with Modal.com |
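Each database configuration in this folder passes a plain dict to `cognee.config`. A minimal sanity-check sketch for such dicts (the key names are taken from the database examples in this PR; the validator itself is illustrative, not part of the cognee API):

```python
# Keys used with set_vector_db_config in the database examples.
REQUIRED_VECTOR_KEYS = {"vector_db_provider"}
OPTIONAL_VECTOR_KEYS = {"vector_db_url", "vector_db_key"}


def check_vector_db_config(config: dict) -> list[str]:
    """Return a list of problems found in a vector DB config dict (empty if OK)."""
    problems = []
    for key in REQUIRED_VECTOR_KEYS - config.keys():
        problems.append(f"missing required key: {key}")
    for key in config.keys() - REQUIRED_VECTOR_KEYS - OPTIONAL_VECTOR_KEYS:
        problems.append(f"unknown key: {key}")
    return problems


# The ChromaDB example's config from this PR passes the check:
chromadb_config = {
    "vector_db_url": "http://localhost:8000",
    "vector_db_key": "",
    "vector_db_provider": "chromadb",
}
print(check_vector_db_config(chromadb_config))  # → []
```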

## 🔄 Custom Pipelines

| Path | Description |
|------|-------------|
| `custom_cognify_pipeline_example.py` | Customize cognify pipelines |
| `memify_coding_agent_rule_extraction_example.py` | Extract rules from conversations |
| `relational_database_to_knowledge_graph_migration_example.py` | SQL to knowledge graph |
| `agentic_reasoning_procurement_example.py` | AI procurement assistant |
| `code_graph_repository_analysis_example.py` | Code repository analysis |
| `dynamic_steps_resume_analysis_hr_example.py` | CV/resume filtering |
| `organizational_hierarchy/` | Org structure graphs (with JSON data) |
| `organizational_hierarchy/organizational_hierarchy_pipeline_low_level_example.py` | Low-level pipeline variant |
| `product_recommendation/` | Recommendation system (with customer data) |

## 🎯 Demos

| Path | Description |
|------|-------------|
| `simple_default_cognee_pipelines_example.py` | Default pipeline usage ★ |
| `simple_document_qa/` | Document Q&A (with alice_in_wonderland.txt) |
| `core_features_getting_started_example.py` | Intro to Cognee |
| `multimedia_processing/` | Audio/image processing (with media files) |
| `ontology_reference_vocabulary/` | Ontology as vocabulary (with OWL file) |
| `ontology_medical_comparison/` | Medical ontology comparison (with papers + OWL) |
| `web_url_content_ingestion_example.py` | Extract from web pages and ingest directly to memory |
| `temporal_awareness_example.py` | Time-based queries |
| `retrievers_and_search_examples.py` | Retriever patterns and search types guide |
| `feedback_enrichment_minimal_example.py` | User feedback enrichment |
| `nodeset_memory_grouping_with_tags_example.py` | Memory grouping with tags |
| `weighted_edges_relationships_example.py` | Weighted edge relationships |
| `dynamic_multiple_weighted_edges_example.py` | Multiple weighted edges |
| `custom_graph_model_entity_schema_definition.py` | Custom entity schemas ★ |
| `graph_visualization_example.py` | Visualize knowledge graphs |
| `conversation_session_persistence_example.py` | Session persistence |
| `custom_prompt_guide.py` | Custom prompts for extraction |
| `direct_llm_call_for_structured_output_example.py` | Direct LLM structured output |
| `start_local_ui_frontend_example.py` | Launch Cognee UI |

@@ -0,0 +1,89 @@

import os
import pathlib
import asyncio

import cognee
from cognee.modules.search.types import SearchType


async def main():
    """
    Example script demonstrating how to use Cognee with ChromaDB

    This example:
    1. Configures Cognee to use ChromaDB as vector database
    2. Sets up data directories
    3. Adds sample data to Cognee
    4. Processes (cognifies) the data
    5. Performs different types of searches
    """
    # Configure ChromaDB as the vector database provider
    cognee.config.set_vector_db_config(
        {
            "vector_db_url": "http://localhost:8000",  # Default ChromaDB server URL
            "vector_db_key": "",  # ChromaDB doesn't require an API key by default
            "vector_db_provider": "chromadb",  # Specify ChromaDB as provider
        }
    )

    # Set up data directories for storing documents and system files
    # You should adjust these paths to your needs
    current_dir = pathlib.Path(__file__).parent
    data_directory_path = str(current_dir / "data_storage")
    cognee.config.data_root_directory(data_directory_path)
    cognee_directory_path = str(current_dir / "cognee_system")
    cognee.config.system_root_directory(cognee_directory_path)

    # Clean any existing data (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create a dataset
    dataset_name = "chromadb_example"

    # Add sample text to the dataset
    sample_text = """ChromaDB is an open-source embedding database.
It allows users to store and query embeddings and their associated metadata.
ChromaDB can be deployed in various ways: in-memory, on disk via sqlite, or as a persistent service.
It is designed to be fast, scalable, and easy to use, making it a popular choice for AI applications.
The database is built to handle vector search efficiently, which is essential for semantic search applications.
ChromaDB supports multiple distance metrics for vector similarity search and can be integrated with various ML frameworks."""

    # Add the sample text to the dataset
    await cognee.add([sample_text], dataset_name)

    # Process the added document to extract knowledge
    await cognee.cognify([dataset_name])

    # Now let's perform some searches
    # 1. Search for insights related to "ChromaDB"
    insights_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="ChromaDB"
    )
    print("\nInsights about ChromaDB:")
    for result in insights_results:
        print(f"- {result}")

    # 2. Search for text chunks related to "vector search"
    chunks_results = await cognee.search(
        query_type=SearchType.CHUNKS, query_text="vector search", datasets=[dataset_name]
    )
    print("\nChunks about vector search:")
    for result in chunks_results:
        print(f"- {result}")

    # 3. Get graph completion related to databases
    graph_completion_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="database"
    )
    print("\nGraph completion for databases:")
    for result in graph_completion_results:
        print(f"- {result}")

    # Clean up (optional)
    # await cognee.prune.prune_data()
    # await cognee.prune.prune_system(metadata=True)


if __name__ == "__main__":
    asyncio.run(main())

@@ -0,0 +1,87 @@

import os
import pathlib
import asyncio

import cognee
from cognee.modules.search.types import SearchType


async def main():
    """
    Example script demonstrating how to use Cognee with KuzuDB

    This example:
    1. Configures Cognee to use KuzuDB as graph database
    2. Sets up data directories
    3. Adds sample data to Cognee
    4. Processes (cognifies) the data
    5. Performs different types of searches
    """
    # Configure KuzuDB as the graph database provider
    cognee.config.set_graph_db_config(
        {
            "graph_database_provider": "kuzu",  # Specify KuzuDB as provider
        }
    )

    # Set up data directories for storing documents and system files
    # You should adjust these paths to your needs
    current_dir = pathlib.Path(__file__).parent
    data_directory_path = str(current_dir / "data_storage")
    cognee.config.data_root_directory(data_directory_path)
    cognee_directory_path = str(current_dir / "cognee_system")
    cognee.config.system_root_directory(cognee_directory_path)

    # Clean any existing data (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create a dataset
    dataset_name = "kuzu_example"

    # Add sample text to the dataset
    sample_text = """KuzuDB is a graph database system optimized for running complex graph analytics.
It is designed to be a high-performance graph database for data science workloads.
KuzuDB is built with modern hardware optimizations in mind.
It provides support for property graphs and offers a Cypher-like query language.
KuzuDB can handle both transactional and analytical graph workloads.
The database now includes vector search capabilities for AI applications and semantic search."""

    # Add the sample text to the dataset
    await cognee.add([sample_text], dataset_name)

    # Process the added document to extract knowledge
    await cognee.cognify([dataset_name])

    # Now let's perform some searches
    # 1. Search for insights related to "KuzuDB"
    insights_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="KuzuDB"
    )
    print("\nInsights about KuzuDB:")
    for result in insights_results:
        print(f"- {result}")

    # 2. Search for text chunks related to "graph database"
    chunks_results = await cognee.search(
        query_type=SearchType.CHUNKS, query_text="graph database", datasets=[dataset_name]
    )
    print("\nChunks about graph database:")
    for result in chunks_results:
        print(f"- {result}")

    # 3. Get graph completion related to databases
    graph_completion_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="database"
    )
    print("\nGraph completion for databases:")
    for result in graph_completion_results:
        print(f"- {result}")

    # Clean up (optional)
    # await cognee.prune.prune_data()
    # await cognee.prune.prune_system(metadata=True)


if __name__ == "__main__":
    asyncio.run(main())

@@ -0,0 +1,96 @@

import os
import pathlib
import asyncio

import cognee
from cognee.modules.search.types import SearchType


async def main():
    """
    Example script demonstrating how to use Cognee with Neo4j

    This example:
    1. Configures Cognee to use Neo4j as graph database
    2. Sets up data directories
    3. Adds sample data to Cognee
    4. Processes (cognifies) the data
    5. Performs different types of searches
    """
    # Set up Neo4j credentials in the .env file and read the values from environment variables
    neo4j_url = os.getenv("GRAPH_DATABASE_URL")
    neo4j_user = os.getenv("GRAPH_DATABASE_USERNAME")
    neo4j_pass = os.getenv("GRAPH_DATABASE_PASSWORD")

    # Configure Neo4j as the graph database provider
    cognee.config.set_graph_db_config(
        {
            "graph_database_url": neo4j_url,  # Neo4j Bolt URL
            "graph_database_provider": "neo4j",  # Specify Neo4j as provider
            "graph_database_username": neo4j_user,  # Neo4j username
            "graph_database_password": neo4j_pass,  # Neo4j password
        }
    )

    # Set up data directories for storing documents and system files
    # You should adjust these paths to your needs
    current_dir = pathlib.Path(__file__).parent
    data_directory_path = str(current_dir / "data_storage")
    cognee.config.data_root_directory(data_directory_path)
    cognee_directory_path = str(current_dir / "cognee_system")
    cognee.config.system_root_directory(cognee_directory_path)

    # Clean any existing data (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create a dataset
    dataset_name = "neo4j_example"

    # Add sample text to the dataset
    sample_text = """Neo4j is a graph database management system.
It stores data in nodes and relationships rather than tables as in traditional relational databases.
Neo4j provides a powerful query language called Cypher for graph traversal and analysis.
It now supports vector indexing for similarity search with the vector index plugin.
Neo4j allows embedding generation and vector search to be combined with graph operations.
Applications can use Neo4j to connect vector search with graph context for more meaningful results."""

    # Add the sample text to the dataset
    await cognee.add([sample_text], dataset_name)

    # Process the added document to extract knowledge
    await cognee.cognify([dataset_name])

    # Now let's perform some searches
    # 1. Search for insights related to "Neo4j"
    insights_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="Neo4j"
    )
    print("\nInsights about Neo4j:")
    for result in insights_results:
        print(f"- {result}")

    # 2. Search for text chunks related to "graph database"
    chunks_results = await cognee.search(
        query_type=SearchType.CHUNKS, query_text="graph database", datasets=[dataset_name]
    )
    print("\nChunks about graph database:")
    for result in chunks_results:
        print(f"- {result}")

    # 3. Get graph completion related to databases
    graph_completion_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="database"
    )
    print("\nGraph completion for databases:")
    for result in graph_completion_results:
        print(f"- {result}")

    # Clean up (optional)
    # await cognee.prune.prune_data()
    # await cognee.prune.prune_system(metadata=True)


if __name__ == "__main__":
    asyncio.run(main())

@@ -0,0 +1,110 @@

import base64
import json
import os
import pathlib
import asyncio

import cognee
from cognee.modules.search.types import SearchType
from dotenv import load_dotenv

load_dotenv()


async def main():
    """
    Example script demonstrating how to use Cognee with Amazon Neptune Analytics

    This example:
    1. Configures Cognee to use Neptune Analytics as graph database
    2. Sets up data directories
    3. Adds sample data to Cognee
    4. Processes/cognifies the data
    5. Performs different types of searches
    """
    # Set up Amazon credentials in the .env file and read the values from environment variables
    graph_endpoint_url = "neptune-graph://" + os.getenv("GRAPH_ID", "")

    # Configure Neptune Analytics as the graph & vector database provider
    cognee.config.set_graph_db_config(
        {
            "graph_database_provider": "neptune_analytics",  # Specify Neptune Analytics as provider
            "graph_database_url": graph_endpoint_url,  # Neptune Analytics endpoint with the format neptune-graph://<GRAPH_ID>
        }
    )
    cognee.config.set_vector_db_config(
        {
            "vector_db_provider": "neptune_analytics",  # Specify Neptune Analytics as provider
            "vector_db_url": graph_endpoint_url,  # Neptune Analytics endpoint with the format neptune-graph://<GRAPH_ID>
        }
    )

    # Set up data directories for storing documents and system files
    # You should adjust these paths to your needs
    current_dir = pathlib.Path(__file__).parent
    data_directory_path = str(current_dir / "data_storage")
    cognee.config.data_root_directory(data_directory_path)
    cognee_directory_path = str(current_dir / "cognee_system")
    cognee.config.system_root_directory(cognee_directory_path)

    # Clean any existing data (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create a dataset
    dataset_name = "neptune_example"

    # Add sample text to the dataset
    sample_text_1 = """Neptune Analytics is a memory-optimized graph database engine for analytics. With Neptune
Analytics, you can get insights and find trends by processing large amounts of graph data in seconds. To analyze
graph data quickly and easily, Neptune Analytics stores large graph datasets in memory. It supports a library of
optimized graph analytic algorithms, low-latency graph queries, and vector search capabilities within graph
traversals.
"""
    sample_text_2 = """Neptune Analytics is an ideal choice for investigatory, exploratory, or data-science workloads
that require fast iteration for data, analytical and algorithmic processing, or vector search on graph data. It
complements Amazon Neptune Database, a popular managed graph database. To perform intensive analysis, you can load
the data from a Neptune Database graph or snapshot into Neptune Analytics. You can also load graph data that's
stored in Amazon S3.
"""

    # Add the sample text to the dataset
    await cognee.add([sample_text_1, sample_text_2], dataset_name)

    # Process the added document to extract knowledge
    await cognee.cognify([dataset_name])

    # Now let's perform some searches
    # 1. Search for insights related to "Neptune Analytics"
    insights_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="Neptune Analytics"
    )
    print("\n========Insights about Neptune Analytics========:")
    for result in insights_results:
        print(f"- {result}")

    # 2. Search for text chunks related to "graph database"
    chunks_results = await cognee.search(
        query_type=SearchType.CHUNKS, query_text="graph database", datasets=[dataset_name]
    )
    print("\n========Chunks about graph database========:")
    for result in chunks_results:
        print(f"- {result}")

    # 3. Get graph completion related to databases
    graph_completion_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="database"
    )
    print("\n========Graph completion for databases========:")
    for result in graph_completion_results:
        print(f"- {result}")

    # Clean up (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)


if __name__ == "__main__":
    asyncio.run(main())

@@ -0,0 +1,101 @@

import os
import pathlib
import asyncio

import cognee
from cognee.modules.search.types import SearchType


async def main():
    """
    Example script demonstrating how to use Cognee with PGVector

    This example:
    1. Configures Cognee to use PostgreSQL with the PGVector extension as vector database
    2. Sets up data directories
    3. Adds sample data to Cognee
    4. Processes (cognifies) the data
    5. Performs different types of searches
    """
    # Configure PGVector as the vector database provider
    cognee.config.set_vector_db_config(
        {
            "vector_db_provider": "pgvector",  # Specify PGVector as provider
        }
    )

    # Configure PostgreSQL connection details
    # These settings are required for PGVector
    cognee.config.set_relational_db_config(
        {
            "db_path": "",
            "db_name": "cognee_db",
            "db_host": "127.0.0.1",
            "db_port": "5432",
            "db_username": "cognee",
            "db_password": "cognee",
            "db_provider": "postgres",
        }
    )

    # Set up data directories for storing documents and system files
    # You should adjust these paths to your needs
    current_dir = pathlib.Path(__file__).parent
    data_directory_path = str(current_dir / "data_storage")
    cognee.config.data_root_directory(data_directory_path)
    cognee_directory_path = str(current_dir / "cognee_system")
    cognee.config.system_root_directory(cognee_directory_path)

    # Clean any existing data (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create a dataset
    dataset_name = "pgvector_example"

    # Add sample text to the dataset
    sample_text = """PGVector is an extension for PostgreSQL that adds vector similarity search capabilities.
It supports multiple indexing methods, including IVFFlat, HNSW, and brute-force search.
PGVector allows you to store vector embeddings directly in your PostgreSQL database.
It provides distance functions like L2 distance, inner product, and cosine distance.
Using PGVector, you can perform both metadata filtering and vector similarity search in a single query.
The extension is often used for applications like semantic search, recommendations, and image similarity."""

    # Add the sample text to the dataset
    await cognee.add([sample_text], dataset_name)

    # Process the added document to extract knowledge
    await cognee.cognify([dataset_name])

    # Now let's perform some searches
    # 1. Search for insights related to "PGVector"
    insights_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="PGVector"
    )
    print("\nInsights about PGVector:")
    for result in insights_results:
        print(f"- {result}")

    # 2. Search for text chunks related to "vector similarity"
    chunks_results = await cognee.search(
        query_type=SearchType.CHUNKS, query_text="vector similarity", datasets=[dataset_name]
    )
    print("\nChunks about vector similarity:")
    for result in chunks_results:
        print(f"- {result}")

    # 3. Get graph completion related to databases
    graph_completion_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="database"
    )
    print("\nGraph completion for databases:")
    for result in graph_completion_results:
        print(f"- {result}")

    # Clean up (optional)
    # await cognee.prune.prune_data()
    # await cognee.prune.prune_system(metadata=True)


if __name__ == "__main__":
    asyncio.run(main())

@@ -0,0 +1,5 @@

"""
S3 Storage Configuration Example

Reference: https://docs.cognee.ai/guides/s3-storage
"""

@@ -0,0 +1,6 @@

"""
Distributed Execution with Modal Example

Reference: https://docs.cognee.ai/guides/distributed-execution
"""

@@ -0,0 +1,5 @@

"""
Azure OpenAI Embedding Setup Example

Reference: https://docs.cognee.ai/setup-configuration/embedding-providers
"""

@@ -0,0 +1,5 @@

"""
OpenAI Embedding Setup Example

Reference: https://docs.cognee.ai/setup-configuration/embedding-providers
"""

@@ -0,0 +1,5 @@

"""
Azure OpenAI Setup Example

Reference: https://docs.cognee.ai/setup-configuration/llm-providers
"""

@@ -0,0 +1,5 @@

"""
OpenAI Setup Example

Reference: https://docs.cognee.ai/setup-configuration/llm-providers
"""

View file

@ -0,0 +1,228 @@
import os
import cognee
import pathlib
from cognee.modules.users.exceptions import PermissionDeniedError
from cognee.modules.users.tenants.methods import select_tenant
from cognee.shared.logging_utils import get_logger
from cognee.modules.search.types import SearchType
from cognee.modules.users.methods import create_user
from cognee.modules.users.permissions.methods import authorized_give_permission_on_datasets
from cognee.modules.users.roles.methods import add_user_to_role
from cognee.modules.users.roles.methods import create_role
from cognee.modules.users.tenants.methods import create_tenant
from cognee.modules.users.tenants.methods import add_user_to_tenant
from cognee.modules.engine.operations.setup import setup
from cognee.shared.logging_utils import setup_logging, CRITICAL
logger = get_logger()
async def main():
# ENABLE PERMISSIONS FEATURE
# Note: When ENABLE_BACKEND_ACCESS_CONTROL is enabled vector provider is automatically set to use LanceDB
# and graph provider is set to use Kuzu.
os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = "True"
# Set the rest of your environment variables as needed. By default OpenAI is used as the LLM provider
# Reference the .env.tempalte file for available option and how to change LLM provider: https://github.com/topoteretes/cognee/blob/main/.env.template
# For example to set your OpenAI LLM API key use:
# os.environ["LLM_API_KEY"] = "your-api-key"
# Create a clean slate for cognee -- reset data and system state
print("Resetting cognee data...")
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
print("Data reset complete.\n")
# Set up the necessary databases and tables for user management.
await setup()
# NOTE: When a document is added in Cognee with permissions enabled only the owner of the document has permissions
# to work with the document initially.
# Add document for user_1, add it under dataset name AI
explanation_file_path = os.path.join(
pathlib.Path(__file__).parent, "data/artificial_intelligence.pdf"
)
print("Creating user_1: user_1@example.com")
user_1 = await create_user("user_1@example.com", "example")
await cognee.add([explanation_file_path], dataset_name="AI", user=user_1)
# Add document for user_2, add it under dataset name QUANTUM
text = """A quantum computer is a computer that takes advantage of quantum mechanical phenomena.
At small scales, physical matter exhibits properties of both particles and waves, and quantum computing leverages
this behavior, specifically quantum superposition and entanglement, using specialized hardware that supports the
preparation and manipulation of quantum states.
"""
print("\nCreating user_2: user_2@example.com")
user_2 = await create_user("user_2@example.com", "example")
await cognee.add([text], dataset_name="QUANTUM", user=user_2)
# Run cognify for both datasets as the appropriate user/owner
print("\nCreating different datasets for user_1 (AI dataset) and user_2 (QUANTUM dataset)")
ai_cognify_result = await cognee.cognify(["AI"], user=user_1)
quantum_cognify_result = await cognee.cognify(["QUANTUM"], user=user_2)
# Extract dataset_ids from cognify results
def extract_dataset_id_from_cognify(cognify_result):
"""Extract dataset_id from cognify output dictionary"""
for dataset_id, pipeline_result in cognify_result.items():
return dataset_id # Return the first dataset_id
return None
# Get dataset IDs from cognify results
# Note: When we want to work with datasets from other users (search, add, cognify and etc.) we must supply dataset
# information through dataset_id using dataset name only looks for datasets owned by current user
ai_dataset_id = extract_dataset_id_from_cognify(ai_cognify_result)
quantum_dataset_id = extract_dataset_id_from_cognify(quantum_cognify_result)
# We can see here that user_1 can read his own dataset (AI dataset)
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text="What is in the document?",
user=user_1,
datasets=[ai_dataset_id],
)
print("\nSearch results as user_1 on dataset owned by user_1:")
for result in search_results:
print(f"{result}\n")
# But user_1 can't read the dataset owned by user_2 (QUANTUM dataset)
print("\nSearch result as user_1 on the dataset owned by user_2:")
try:
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text="What is in the document?",
user=user_1,
datasets=[quantum_dataset_id],
)
except PermissionDeniedError:
print(f"User: {user_1} does not have permission to read from dataset: QUANTUM")
# user_1 currently also can't add a document to user_2's dataset (QUANTUM dataset)
print("\nAttempting to add new data as user_1 to dataset owned by user_2:")
try:
await cognee.add(
[explanation_file_path],
dataset_id=quantum_dataset_id,
user=user_1,
)
except PermissionDeniedError:
print(f"User: {user_1} does not have permission to write to dataset: QUANTUM")
# We've shown that user_1 can't interact with the dataset from user_2
# Now have user_2 give proper permission to user_1 to read QUANTUM dataset
# Note: supported permission types are "read", "write", "delete" and "share"
print(
"\nOperation started as user_2 to give read permission to user_1 for the dataset owned by user_2"
)
await authorized_give_permission_on_datasets(
user_1.id,
[quantum_dataset_id],
"read",
user_2.id,
)
# Now user_1 can read from quantum dataset after proper permissions have been assigned by the QUANTUM dataset owner.
print("\nSearch result as user_1 on the dataset owned by user_2:")
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text="What is in the document?",
user=user_1,
dataset_ids=[quantum_dataset_id],
)
for result in search_results:
print(f"{result}\n")
# If we'd like for user_1 to add new documents to the QUANTUM dataset owned by user_2, user_1 would have to get
# "write" access permission, which user_1 currently does not have
# Users can also be added to Roles and Tenants, and permissions can then be assigned at the Role/Tenant level as well
# To create a Role, a user must first own a Tenant
print("User 2 is creating CogneeLab tenant/organization")
tenant_id = await create_tenant("CogneeLab", user_2.id)
print("User 2 is selecting CogneeLab tenant/organization as active tenant")
await select_tenant(user_id=user_2.id, tenant_id=tenant_id)
print("\nUser 2 is creating Researcher role")
role_id = await create_role(role_name="Researcher", owner_id=user_2.id)
print("\nCreating user_3: user_3@example.com")
user_3 = await create_user("user_3@example.com", "example")
# To add a user to a role he must be part of the same tenant/organization
print("\nOperation started as user_2 to add user_3 to CogneeLab tenant/organization")
await add_user_to_tenant(user_id=user_3.id, tenant_id=tenant_id, owner_id=user_2.id)
print(
"\nOperation started by user_2, as tenant owner, to add user_3 to Researcher role inside the tenant/organization"
)
await add_user_to_role(user_id=user_3.id, role_id=role_id, owner_id=user_2.id)
print("\nOperation as user_3 to select CogneeLab tenant/organization as active tenant")
await select_tenant(user_id=user_3.id, tenant_id=tenant_id)
print(
"\nOperation started as user_2, with CogneeLab as its active tenant, to give read permission to Researcher role for the dataset QUANTUM owned by user_2"
)
# Even though the dataset owner is user_2, the dataset doesn't belong to the tenant/organization CogneeLab.
# So we can't assign permissions to it when we're acting in the CogneeLab tenant.
try:
await authorized_give_permission_on_datasets(
role_id,
[quantum_dataset_id],
"read",
user_2.id,
)
except PermissionDeniedError:
print(
"User 2 could not give permission to the role as the QUANTUM dataset is not part of the CogneeLab tenant"
)
print(
"We will now create a new QUANTUM dataset with the QUANTUM_COGNEE_LAB name in the CogneeLab tenant so that permissions can be assigned to the Researcher role inside the tenant/organization"
)
# We can re-create the QUANTUM dataset in the CogneeLab tenant. The old QUANTUM dataset is still owned by user_2 personally
# and can still be accessed by selecting the personal tenant for user 2.
from cognee.modules.users.methods import get_user
# Note: We need to re-fetch user_2 from the database to pick up its tenant context change
user_2 = await get_user(user_2.id)
await cognee.add([text], dataset_name="QUANTUM_COGNEE_LAB", user=user_2)
quantum_cognee_lab_cognify_result = await cognee.cognify(["QUANTUM_COGNEE_LAB"], user=user_2)
# The recreated Quantum dataset will now have a different dataset_id as it's a new dataset in a different organization
quantum_cognee_lab_dataset_id = extract_dataset_id_from_cognify(
quantum_cognee_lab_cognify_result
)
print(
"\nOperation started as user_2, with CogneeLab as its active tenant, to give read permission to Researcher role for the dataset QUANTUM owned by the CogneeLab tenant"
)
await authorized_give_permission_on_datasets(
role_id,
[quantum_cognee_lab_dataset_id],
"read",
user_2.id,
)
# Now user_3 can read from QUANTUM dataset as part of the Researcher role after proper permissions have been assigned by the QUANTUM dataset owner, user_2.
print("\nSearch result as user_3 on the QUANTUM dataset owned by the CogneeLab organization:")
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text="What is in the document?",
user=user_3,
dataset_ids=[quantum_cognee_lab_dataset_id],
)
for result in search_results:
print(f"{result}\n")
# Note: All of these function calls and the permission system are available through our backend endpoints as well
if __name__ == "__main__":
import asyncio
logger = setup_logging(log_level=CRITICAL)
asyncio.run(main())


@@ -0,0 +1,5 @@
"""
BAML Setup Example
Reference: https://docs.cognee.ai/setup-configuration/structured-output-backends
"""
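The stub above only carries a docs reference; a minimal sketch of what selecting the BAML backend typically looks like. The `STRUCTURED_OUTPUT_FRAMEWORK` variable name is an assumption to be verified against the linked docs page:

```python
import os

# Assumption: cognee picks its structured-output backend from this env var;
# verify the exact name and accepted values against the docs linked above.
os.environ["STRUCTURED_OUTPUT_FRAMEWORK"] = "BAML"
os.environ.setdefault("LLM_API_KEY", "your_key_here")  # placeholder, not a real key
```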


@@ -0,0 +1,5 @@
"""
Litellm Instructor Setup Example
Reference: https://docs.cognee.ai/setup-configuration/structured-output-backends
"""


@@ -0,0 +1,203 @@
import os
import logging
import cognee
import asyncio
from cognee.infrastructure.llm.LLMGateway import LLMGateway
from dotenv import load_dotenv
from cognee.api.v1.search import SearchType
from cognee.modules.engine.models import NodeSet
from cognee.shared.logging_utils import setup_logging
load_dotenv()
os.environ["LLM_API_KEY"] = ""  # Add your LLM API key here
# Note: cognee's NodeSet feature only works with the Kuzu and Neo4j graph databases
os.environ["GRAPH_DATABASE_PROVIDER"] = "kuzu"
class ProcurementMemorySystem:
"""Procurement system with persistent memory using Cognee"""
async def setup_memory_data(self):
"""Load and store procurement data in memory"""
# Procurement system dummy data
vendor_conversation_text_techsupply = """
Assistant: Hello! This is Sarah from TechSupply Solutions.
Thanks for reaching out for your IT procurement needs.
User: We're looking to procure 50 high-performance enterprise laptops.
Specs: Intel i7, 16GB RAM, 512GB SSD, dedicated graphics card.
Budget: $80,000. What models do you have?
Assistant: TechSupply Solutions can offer Dell Precision 5570 ($1,450) and Lenovo ThinkPad P1 ($1,550).
Both come with a 3-year warranty. Delivery: 2–3 weeks (Dell), 3–4 weeks (Lenovo).
User: Do you provide bulk discounts? We're planning another 200 units next quarter.
Assistant: Yes! Orders over $50,000 get 8% off.
So for your current order:
- Dell = $1,334 each ($66,700 total)
- Lenovo = $1,426 each ($71,300 total)
And for 200 units next quarter, we can offer 12% off with flexible delivery.
"""
vendor_conversation_text_office_solutions = """
Assistant: Hi, this is Martin from vendor Office Solutions. How can we assist you?
User: We need 50 laptops for our engineers.
Specs: i7 CPU, 16GB RAM, 512GB SSD, dedicated GPU.
We can spend up to $80,000. Can you meet this?
Assistant: Office Solutions can offer HP ZBook Power G9 for $1,600 each.
Comes with 2-year warranty, delivery time is 4–5 weeks.
User: That's a bit long — any options to speed it up?
Assistant: We can expedite for $75 per unit, bringing delivery to 3–4 weeks.
Also, for orders over $60,000 we give 6% off.
So:
- Base price = $1,600 → $1,504 with discount
- Expedited price = $1,579
User: Understood. Any room for better warranty terms?
Assistant: We're working on adding a 3-year warranty option next quarter for enterprise clients.
"""
previous_purchases_text = """
Previous Purchase Records:
1. Vendor: TechSupply Solutions
Item: Desktop computers - 25 units
Amount: $35,000
Date: 2024-01-15
Performance: Excellent delivery, good quality, delivered 2 days early
Rating: 5/5
Notes: Responsive support team, competitive pricing
2. Vendor: Office Solutions
Item: Office furniture
Amount: $12,000
Date: 2024-02-20
Performance: Delayed delivery by 1 week, average quality
Rating: 2/5
Notes: Poor communication, but acceptable product quality
"""
procurement_preferences_text = """
Procurement Policies and Preferences:
1. Preferred vendors must have 3+ year warranty coverage
2. Maximum delivery time: 30 days for non-critical items
3. Bulk discount requirements: minimum 5% for orders over $50,000
4. Prioritize vendors with sustainable/green practices
5. Vendor rating system: require minimum 4/5 rating for new contracts
"""
# Initializing and pruning databases
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
# Store data in different memory categories
await cognee.add(
data=[vendor_conversation_text_techsupply, vendor_conversation_text_office_solutions],
node_set=["vendor_conversations"],
)
await cognee.add(data=previous_purchases_text, node_set=["purchase_history"])
await cognee.add(data=procurement_preferences_text, node_set=["procurement_policies"])
# Process all data through Cognee's knowledge graph
await cognee.cognify()
async def search_memory(self, query, search_categories=None):
"""Search across different memory layers"""
if search_categories is None:
search_categories = []
results = {}
for category in search_categories:
category_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text=query,
node_type=NodeSet,
node_name=[category],
top_k=30,
)
results[category] = category_results
return results
async def run_procurement_example():
"""Main function demonstrating procurement memory system"""
print("Building AI Procurement System with Memory: Cognee Integration...\n")
# Initialize the procurement memory system
procurement_system = ProcurementMemorySystem()
# Setup memory with procurement data
print("Setting up procurement memory data...")
await procurement_system.setup_memory_data()
print("Memory successfully populated and processed.\n")
research_questions = {
"vendor_conversations": [
"What are the laptops that are discussed, together with their vendors?",
"What pricing was offered by each vendor before and after discounts?",
"What were the delivery time estimates for each product?",
],
"purchase_history": [
"Which vendors have we worked with in the past?",
"What were the satisfaction ratings for each vendor?",
"Were there any complaints or red flags associated with specific vendors?",
],
"procurement_policies": [
"What are our company's bulk discount requirements?",
"What is the maximum acceptable delivery time for non-critical items?",
"What is the minimum vendor rating for new contracts?",
],
}
research_notes = {}
print("Running contextual research questions...\n")
for category, questions in research_questions.items():
print(f"Category: {category}")
research_notes[category] = []
for q in questions:
print(f"Question: \n{q}")
results = await procurement_system.search_memory(q, search_categories=[category])
top_answer = results[category][0]
print(f"Answer: \n{top_answer.strip()}\n")
research_notes[category].append({"question": q, "answer": top_answer})
print("Contextual research complete.\n")
print("Compiling structured research information for decision-making...\n")
research_information = "\n\n".join(
f"Q: {note['question']}\nA: {note['answer'].strip()}"
for section in research_notes.values()
for note in section
)
print("Compiled Research Summary:\n")
print(research_information)
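The compilation step above flattens the per-category notes into "Q/A" blocks joined by blank lines. With stand-in notes (made-up questions and answers) the same expression can be checked in isolation:

```python
# Stand-in research notes: two categories, one note each
research_notes = {
    "vendor_conversations": [
        {"question": "Which vendors?", "answer": "TechSupply and Office Solutions.\n"}
    ],
    "purchase_history": [{"question": "Past ratings?", "answer": "5/5 and 2/5."}],
}

# Same flattening as the example: one "Q: ...\nA: ..." block per note
research_information = "\n\n".join(
    f"Q: {note['question']}\nA: {note['answer'].strip()}"
    for section in research_notes.values()
    for note in section
)
print(research_information)
```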
print("\nPassing research to LLM for final procurement recommendation...\n")
final_decision = await LLMGateway.acreate_structured_output(
text_input=research_information,
system_prompt="""You are a procurement decision assistant. Use the provided QA pairs that were collected through a research phase. Recommend the best vendor,
based on pricing, delivery, warranty, policy fit, and past performance. Be concise and justify your choice with evidence.
""",
response_model=str,
)
print("Final Decision:")
print(final_decision.strip())
# Run the example
if __name__ == "__main__":
setup_logging(logging.ERROR)
asyncio.run(run_procurement_example())


@@ -0,0 +1,58 @@
import argparse
import asyncio
import cognee
from cognee import SearchType
from cognee.shared.logging_utils import setup_logging, ERROR
from cognee.api.v1.cognify.code_graph_pipeline import run_code_graph_pipeline
async def main(repo_path, include_docs):
run_status = False
# The async for already rebinds run_status on each pipeline step; just drain it
async for run_status in run_code_graph_pipeline(repo_path, include_docs=include_docs):
pass
# Test CODE search
search_results = await cognee.search(query_type=SearchType.CODE, query_text="test")
assert len(search_results) != 0, "The search results list is empty."
print("\n\nSearch results are:\n")
for result in search_results:
print(f"{result}\n")
return run_status
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument("--repo_path", type=str, required=True, help="Path to the repository")
parser.add_argument(
"--include_docs",
type=lambda x: x.lower() in ("true", "1"),
default=False,
help="Whether or not to process non-code files",
)
parser.add_argument(
"--time",
type=lambda x: x.lower() in ("true", "1"),
default=True,
help="Whether or not to time the pipeline run",
)
return parser.parse_args()
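The boolean flags rely on `type=lambda x: x.lower() in ("true", "1")`, which maps `true`/`True`/`1` to `True` and anything else to `False`. The same pattern in isolation:

```python
import argparse

parser = argparse.ArgumentParser()
# Same converter as above: only "true"/"True"/"1" (case-insensitive) parse as True
parser.add_argument("--include_docs", type=lambda x: x.lower() in ("true", "1"), default=False)

assert parser.parse_args(["--include_docs", "True"]).include_docs is True
assert parser.parse_args(["--include_docs", "1"]).include_docs is True
assert parser.parse_args(["--include_docs", "no"]).include_docs is False
assert parser.parse_args([]).include_docs is False
```

Note that with this converter an unrecognized value such as `--include_docs yes` silently becomes `False` rather than raising an error, which is why the help text documents the accepted values.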
if __name__ == "__main__":
logger = setup_logging(log_level=ERROR)
args = parse_args()
if args.time:
import time
start_time = time.time()
asyncio.run(main(args.repo_path, args.include_docs))
end_time = time.time()
print("\n" + "=" * 50)
print(f"Pipeline Execution Time: {end_time - start_time:.2f} seconds")
print("=" * 50 + "\n")
else:
asyncio.run(main(args.repo_path, args.include_docs))


@@ -0,0 +1,84 @@
import asyncio
import cognee
from cognee.modules.engine.operations.setup import setup
from cognee.modules.users.methods import get_default_user
from cognee.shared.logging_utils import setup_logging, INFO
from cognee.modules.pipelines import Task
from cognee.api.v1.search import SearchType
# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
# LLM_API_KEY = "your_key_here"
async def main():
# Create a clean slate for cognee -- reset data and system state
print("Resetting cognee data...")
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
print("Data reset complete.\n")
# Create relational database and tables
await setup()
# cognee knowledge graph will be created based on this text
text = """
Natural language processing (NLP) is an interdisciplinary
subfield of computer science and information retrieval.
"""
print("Adding text to cognee:")
print(text.strip())
# Let's recreate the cognee add pipeline through the custom pipeline framework
from cognee.tasks.ingestion import ingest_data, resolve_data_directories
user = await get_default_user()
# Values for tasks need to be filled before calling the pipeline
add_tasks = [
Task(resolve_data_directories, include_subdirectories=True),
Task(
ingest_data,
"main_dataset",
user,
),
]
# Forward tasks to custom pipeline along with data and user information
await cognee.run_custom_pipeline(
tasks=add_tasks, data=text, user=user, dataset="main_dataset", pipeline_name="add_pipeline"
)
print("Text added successfully.\n")
# Use LLMs and cognee to create knowledge graph
from cognee.api.v1.cognify.cognify import get_default_tasks
cognify_tasks = await get_default_tasks(user=user)
print("Recreating existing cognify pipeline in custom pipeline to create knowledge graph...\n")
await cognee.run_custom_pipeline(
tasks=cognify_tasks, user=user, dataset="main_dataset", pipeline_name="cognify_pipeline"
)
print("Cognify process complete.\n")
query_text = "Tell me about NLP"
print(f"Searching cognee for insights with query: '{query_text}'")
# Query cognee for insights on the added text
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION, query_text=query_text
)
print("Search results:")
# Display results
for result_text in search_results:
print(result_text)
if __name__ == "__main__":
logger = setup_logging(log_level=INFO)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(main())
finally:
loop.run_until_complete(loop.shutdown_asyncgens())


@@ -0,0 +1,212 @@
import asyncio
import cognee
from cognee.api.v1.search import SearchType
from cognee.shared.logging_utils import setup_logging, ERROR
job_1 = """
CV 1: Relevant
Name: Dr. Emily Carter
Contact Information:
Email: emily.carter@example.com
Phone: (555) 123-4567
Summary:
Senior Data Scientist with over 8 years of experience in machine learning and predictive analytics. Expertise in developing advanced algorithms and deploying scalable models in production environments.
Education:
Ph.D. in Computer Science, Stanford University (2014)
B.S. in Mathematics, University of California, Berkeley (2010)
Experience:
Senior Data Scientist, InnovateAI Labs (2016–Present)
Led a team in developing machine learning models for natural language processing applications.
Implemented deep learning algorithms that improved prediction accuracy by 25%.
Collaborated with cross-functional teams to integrate models into cloud-based platforms.
Data Scientist, DataWave Analytics (2014–2016)
Developed predictive models for customer segmentation and churn analysis.
Analyzed large datasets using Hadoop and Spark frameworks.
Skills:
Programming Languages: Python, R, SQL
Machine Learning: TensorFlow, Keras, Scikit-Learn
Big Data Technologies: Hadoop, Spark
Data Visualization: Tableau, Matplotlib
"""
job_2 = """
CV 2: Relevant
Name: Michael Rodriguez
Contact Information:
Email: michael.rodriguez@example.com
Phone: (555) 234-5678
Summary:
Data Scientist with a strong background in machine learning and statistical modeling. Skilled in handling large datasets and translating data into actionable business insights.
Education:
M.S. in Data Science, Carnegie Mellon University (2013)
B.S. in Computer Science, University of Michigan (2011)
Experience:
Senior Data Scientist, Alpha Analytics (2017–Present)
Developed machine learning models to optimize marketing strategies.
Reduced customer acquisition cost by 15% through predictive modeling.
Data Scientist, TechInsights (2013–2017)
Analyzed user behavior data to improve product features.
Implemented A/B testing frameworks to evaluate product changes.
Skills:
Programming Languages: Python, Java, SQL
Machine Learning: Scikit-Learn, XGBoost
Data Visualization: Seaborn, Plotly
Databases: MySQL, MongoDB
"""
job_3 = """
CV 3: Relevant
Name: Sarah Nguyen
Contact Information:
Email: sarah.nguyen@example.com
Phone: (555) 345-6789
Summary:
Data Scientist specializing in machine learning with 6 years of experience. Passionate about leveraging data to drive business solutions and improve product performance.
Education:
M.S. in Statistics, University of Washington (2014)
B.S. in Applied Mathematics, University of Texas at Austin (2012)
Experience:
Data Scientist, QuantumTech (2016–Present)
Designed and implemented machine learning algorithms for financial forecasting.
Improved model efficiency by 20% through algorithm optimization.
Junior Data Scientist, DataCore Solutions (2014–2016)
Assisted in developing predictive models for supply chain optimization.
Conducted data cleaning and preprocessing on large datasets.
Skills:
Programming Languages: Python, R
Machine Learning Frameworks: PyTorch, Scikit-Learn
Statistical Analysis: SAS, SPSS
Cloud Platforms: AWS, Azure
"""
job_4 = """
CV 4: Not Relevant
Name: David Thompson
Contact Information:
Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:
Creative Graphic Designer with over 8 years of experience in visual design and branding. Proficient in Adobe Creative Suite and passionate about creating compelling visuals.
Education:
B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:
Senior Graphic Designer, CreativeWorks Agency (2015–Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012–2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand strategies.
Skills:
Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
"""
job_5 = """
CV 5: Not Relevant
Name: Jessica Miller
Contact Information:
Email: jessica.miller@example.com
Phone: (555) 567-8901
Summary:
Experienced Sales Manager with a strong track record in driving sales growth and building high-performing teams. Excellent communication and leadership skills.
Education:
B.A. in Business Administration, University of Southern California (2010)
Experience:
Sales Manager, Global Enterprises (2015–Present)
Managed a sales team of 15 members, achieving a 20% increase in annual revenue.
Developed sales strategies that expanded customer base by 25%.
Sales Representative, Market Leaders Inc. (2010–2015)
Consistently exceeded sales targets and received the 'Top Salesperson' award in 2013.
Skills:
Sales Strategy and Planning
Team Leadership and Development
CRM Software: Salesforce, Zoho
Negotiation and Relationship Building
"""
async def main(enable_steps):
# Step 1: Reset data and system state
if enable_steps.get("prune_data"):
await cognee.prune.prune_data()
print("Data pruned.")
if enable_steps.get("prune_system"):
await cognee.prune.prune_system(metadata=True)
print("System pruned.")
# Step 2: Add text
if enable_steps.get("add_text"):
text_list = [job_1, job_2, job_3, job_4, job_5]
for text in text_list:
await cognee.add(text)
print(f"Added text: {text[:35]}...")
# Step 3: Create knowledge graph
if enable_steps.get("cognify"):
await cognee.cognify()
print("Knowledge graph created.")
# Step 4: Query insights
if enable_steps.get("retriever"):
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION, query_text="Who has experience in design tools?"
)
print(search_results)
if __name__ == "__main__":
logger = setup_logging(log_level=ERROR)
rebuild_kg = True
retrieve = True
steps_to_enable = {
"prune_data": rebuild_kg,
"prune_system": rebuild_kg,
"add_text": rebuild_kg,
"cognify": rebuild_kg,
"graph_metrics": rebuild_kg,
"retriever": retrieve,
}
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(main(steps_to_enable))
finally:
loop.run_until_complete(loop.shutdown_asyncgens())


@@ -0,0 +1,110 @@
import asyncio
import pathlib
import os
import cognee
from cognee import memify
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.shared.logging_utils import setup_logging, ERROR
from cognee.modules.pipelines.tasks.task import Task
from cognee.tasks.memify.extract_subgraph_chunks import extract_subgraph_chunks
from cognee.tasks.codingagents.coding_rule_associations import add_rule_associations
# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
# LLM_API_KEY = "your_key_here"
async def main():
# Create a clean slate for cognee -- reset data and system state
print("Resetting cognee data...")
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
print("Data reset complete.\n")
print("Adding conversation about rules to cognee:\n")
coding_rules_chat_from_principal_engineer = """
We want code to be formatted by PEP8 standards.
Typing and Docstrings must be added.
Please also make sure to write NOTE: on all more complex code segments.
If there is any duplicate code, try to handle it in one function to avoid code duplication.
Susan should also always review new code changes before merging to main.
New releases should not happen on Friday so we don't have to fix them during the weekend.
"""
print(
f"Coding rules conversation with principal engineer: {coding_rules_chat_from_principal_engineer}"
)
coding_rules_chat_from_manager = """
Susan should always review new code changes before merging to main.
New releases should not happen on Friday so we don't have to fix them during the weekend.
"""
print(f"Coding rules conversation with manager: {coding_rules_chat_from_manager}")
# Add the text, and make it available for cognify
await cognee.add([coding_rules_chat_from_principal_engineer, coding_rules_chat_from_manager])
print("Text added successfully.\n")
# Use LLMs and cognee to create knowledge graph
await cognee.cognify()
print("Cognify process complete.\n")
# Visualize graph after cognification
file_path = os.path.join(
pathlib.Path(__file__).parent, ".artifacts", "graph_visualization_only_cognify.html"
)
await visualize_graph(file_path)
print(f"Open file to see graph visualization only after cognification: {file_path}\n")
# After the graph is created, create a second pipeline that will go through the graph and enhance it with specific
# coding rule nodes
# extract_subgraph_chunks is a function that returns all document chunks from specified subgraphs (if no subgraph is specified the whole graph will be sent through memify)
subgraph_extraction_tasks = [Task(extract_subgraph_chunks)]
# add_rule_associations is a function that handles processing coding rules from chunks and keeps track of
# existing rules so duplicate rules won't be created. As a result of this processing, new Rule nodes will be created
# in the graph that specify coding rules found in conversations.
coding_rules_association_tasks = [
Task(
add_rule_associations,
rules_nodeset_name="coding_agent_rules",
task_config={"batch_size": 1},
),
]
# Memify accepts these tasks and orchestrates forwarding of graph data through these tasks (if data is not specified).
# If data is explicitly specified in the arguments, that data will be forwarded through the tasks instead
await memify(
extraction_tasks=subgraph_extraction_tasks,
enrichment_tasks=coding_rules_association_tasks,
)
# Find the new specific coding rules added to the graph through memify (created based on the chat conversations between team members)
coding_rules = await cognee.search(
query_text="List me the coding rules",
query_type=cognee.SearchType.CODING_RULES,
node_name=["coding_agent_rules"],
)
print("Coding rules created by memify:")
for coding_rule in coding_rules:
print("- " + coding_rule)
# Visualize new graph with added memify context
file_path = os.path.join(
pathlib.Path(__file__).parent, ".artifacts", "graph_visualization_after_memify.html"
)
await visualize_graph(file_path)
print(f"\nOpen file to see graph visualization after memify enhancement: {file_path}")
if __name__ == "__main__":
logger = setup_logging(log_level=ERROR)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(main())
finally:
loop.run_until_complete(loop.shutdown_asyncgens())


@@ -0,0 +1,39 @@
[
{
"name": "TechNova Inc.",
"departments": [
"Engineering",
"Marketing"
]
},
{
"name": "GreenFuture Solutions",
"departments": [
"Research & Development",
"Sales",
"Customer Support"
]
},
{
"name": "Skyline Financials",
"departments": [
"Accounting"
]
},
{
"name": "MediCare Plus",
"departments": [
"Healthcare",
"Administration"
]
},
{
"name": "NextGen Robotics",
"departments": [
"AI Development",
"Manufacturing",
"HR"
]
}
]


@@ -0,0 +1,53 @@
[
{
"name": "John Doe",
"company": "TechNova Inc.",
"department": "Engineering"
},
{
"name": "Jane Smith",
"company": "TechNova Inc.",
"department": "Marketing"
},
{
"name": "Alice Johnson",
"company": "GreenFuture Solutions",
"department": "Sales"
},
{
"name": "Bob Williams",
"company": "GreenFuture Solutions",
"department": "Customer Support"
},
{
"name": "Michael Brown",
"company": "Skyline Financials",
"department": "Accounting"
},
{
"name": "Emily Davis",
"company": "MediCare Plus",
"department": "Healthcare"
},
{
"name": "David Wilson",
"company": "MediCare Plus",
"department": "Administration"
},
{
"name": "Emma Thompson",
"company": "NextGen Robotics",
"department": "AI Development"
},
{
"name": "Chris Martin",
"company": "NextGen Robotics",
"department": "Manufacturing"
},
{
"name": "Sophia White",
"company": "NextGen Robotics",
"department": "HR"
}
]
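The two data files relate through the `company` and `department` fields. A stand-alone sketch (with a trimmed, inlined copy of the data) of grouping employees per company and department, the way the low-level ingestion example links Person, Department, and Company data points:

```python
# Trimmed copies of companies.json / people.json, inlined for illustration
companies = [{"name": "TechNova Inc.", "departments": ["Engineering", "Marketing"]}]
people = [
    {"name": "John Doe", "company": "TechNova Inc.", "department": "Engineering"},
    {"name": "Jane Smith", "company": "TechNova Inc.", "department": "Marketing"},
]

# Group employee names by (company, department) pair
by_department = {}
for person in people:
    key = (person["company"], person["department"])
    by_department.setdefault(key, []).append(person["name"])

for company in companies:
    for department in company["departments"]:
        print(company["name"], department, by_department.get((company["name"], department), []))
```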


@@ -0,0 +1,119 @@
import os
import json
import asyncio
from typing import List, Any
from cognee import prune
from cognee import visualize_graph
from cognee.low_level import setup, DataPoint
from cognee.modules.data.methods import load_or_create_datasets
from cognee.modules.users.methods import get_default_user
from cognee.pipelines import run_tasks, Task
from cognee.tasks.storage import add_data_points
class Person(DataPoint):
name: str
# Metadata "index_fields" specifies which DataPoint fields should be embedded for vector search
metadata: dict = {"index_fields": ["name"]}
class Department(DataPoint):
name: str
employees: list[Person]
# Metadata "index_fields" specifies which DataPoint fields should be embedded for vector search
metadata: dict = {"index_fields": ["name"]}
class CompanyType(DataPoint):
name: str = "Company"
# Metadata "index_fields" specifies which DataPoint fields should be embedded for vector search
metadata: dict = {"index_fields": ["name"]}
class Company(DataPoint):
name: str
departments: list[Department]
is_type: CompanyType
# Metadata "index_fields" specifies which DataPoint fields should be embedded for vector search
metadata: dict = {"index_fields": ["name"]}
def ingest_files(data: List[Any]):
people_data_points = {}
departments_data_points = {}
companies_data_points = {}
for data_item in data:
people = data_item["people"]
companies = data_item["companies"]
for person in people:
new_person = Person(name=person["name"])
people_data_points[person["name"]] = new_person
if person["department"] not in departments_data_points:
departments_data_points[person["department"]] = Department(
name=person["department"], employees=[new_person]
)
else:
departments_data_points[person["department"]].employees.append(new_person)
# Create a single CompanyType node so we can connect all companies to it.
company_type = CompanyType()
for company in companies:
new_company = Company(name=company["name"], departments=[], is_type=company_type)
companies_data_points[company["name"]] = new_company
for department_name in company["departments"]:
if department_name not in departments_data_points:
departments_data_points[department_name] = Department(
name=department_name, employees=[]
)
new_company.departments.append(departments_data_points[department_name])
return list(companies_data_points.values())
async def main():
await prune.prune_data()
await prune.prune_system(metadata=True)
# Create relational database tables
await setup()
# If no user is provided use default user
user = await get_default_user()
# Create dataset object to keep track of pipeline status
datasets = await load_or_create_datasets(["test_dataset"], [], user)
# Prepare data for pipeline
companies_file_path = os.path.join(os.path.dirname(__file__), "data", "companies.json")
companies = json.loads(open(companies_file_path, "r").read())
people_file_path = os.path.join(os.path.dirname(__file__), "data", "people.json")
with open(people_file_path, "r") as people_file:
    people = json.load(people_file)
# run_tasks expects a list of data items, even if there is just one document
data = [{"companies": companies, "people": people}]
pipeline = run_tasks(
[Task(ingest_files), Task(add_data_points)],
dataset_id=datasets[0].id,
data=data,
incremental_loading=False,
)
async for status in pipeline:
print(status)
# Or use our simple graph preview
graph_file_path = str(
os.path.join(os.path.dirname(__file__), ".artifacts/graph_visualization.html")
)
await visualize_graph(graph_file_path)
if __name__ == "__main__":
asyncio.run(main())

@@ -0,0 +1,266 @@
"""Cognee demo with simplified structure."""
from __future__ import annotations
import asyncio
import json
import logging
from collections import defaultdict
from pathlib import Path
from typing import Any, Iterable, List, Mapping
from cognee import config, prune, search, SearchType, visualize_graph
from cognee.low_level import setup, DataPoint
from cognee.pipelines import run_tasks, Task
from cognee.tasks.storage import add_data_points
from cognee.tasks.storage.index_graph_edges import index_graph_edges
from cognee.modules.users.methods import get_default_user
from cognee.modules.data.methods import load_or_create_datasets
class Person(DataPoint):
"""Represent a person."""
name: str
metadata: dict = {"index_fields": ["name"]}
class Department(DataPoint):
"""Represent a department."""
name: str
employees: list[Person]
metadata: dict = {"index_fields": ["name"]}
class CompanyType(DataPoint):
"""Represent a company type."""
name: str = "Company"
class Company(DataPoint):
"""Represent a company."""
name: str
departments: list[Department]
is_type: CompanyType
metadata: dict = {"index_fields": ["name"]}
ROOT = Path(__file__).resolve().parent
DATA_DIR = ROOT / "data"
COGNEE_DIR = ROOT / ".cognee_system"
ARTIFACTS_DIR = ROOT / ".artifacts"
GRAPH_HTML = ARTIFACTS_DIR / "graph_visualization.html"
COMPANIES_JSON = DATA_DIR / "companies.json"
PEOPLE_JSON = DATA_DIR / "people.json"
def load_json_file(path: Path) -> Any:
"""Load a JSON file."""
if not path.exists():
raise FileNotFoundError(f"Missing required file: {path}")
return json.loads(path.read_text(encoding="utf-8"))
def remove_duplicates_preserve_order(seq: Iterable[Any]) -> list[Any]:
"""Return list with duplicates removed while preserving order."""
seen = set()
out = []
for x in seq:
if x in seen:
continue
seen.add(x)
out.append(x)
return out
def collect_people(payloads: Iterable[Mapping[str, Any]]) -> list[Mapping[str, Any]]:
"""Collect people from payloads."""
people = [person for payload in payloads for person in payload.get("people", [])]
return people
def collect_companies(payloads: Iterable[Mapping[str, Any]]) -> list[Mapping[str, Any]]:
"""Collect companies from payloads."""
companies = [company for payload in payloads for company in payload.get("companies", [])]
return companies
def build_people_nodes(people: Iterable[Mapping[str, Any]]) -> dict:
"""Build person nodes keyed by name."""
nodes = {p["name"]: Person(name=p["name"]) for p in people if p.get("name")}
return nodes
def group_people_by_department(people: Iterable[Mapping[str, Any]]) -> dict:
"""Group person names by department."""
groups = defaultdict(list)
for person in people:
name = person.get("name")
if not name:
continue
dept = person.get("department", "Unknown")
groups[dept].append(name)
return groups
def collect_declared_departments(
groups: Mapping[str, list[str]], companies: Iterable[Mapping[str, Any]]
) -> set:
"""Collect department names referenced anywhere."""
names = set(groups)
for company in companies:
for dept in company.get("departments", []):
names.add(dept)
return names
def build_department_nodes(dept_names: Iterable[str]) -> dict:
"""Build department nodes keyed by name."""
nodes = {name: Department(name=name, employees=[]) for name in dept_names}
return nodes
def build_company_nodes(companies: Iterable[Mapping[str, Any]], company_type: CompanyType) -> dict:
"""Build company nodes keyed by name."""
nodes = {
c["name"]: Company(name=c["name"], departments=[], is_type=company_type)
for c in companies
if c.get("name")
}
return nodes
def iterate_company_department_pairs(companies: Iterable[Mapping[str, Any]]):
"""Yield (company_name, department_name) pairs."""
for company in companies:
comp_name = company.get("name")
if not comp_name:
continue
for dept in company.get("departments", []):
yield comp_name, dept
def attach_departments_to_companies(
companies: Iterable[Mapping[str, Any]],
dept_nodes: Mapping[str, Department],
company_nodes: Mapping[str, Company],
) -> None:
"""Attach department nodes to companies."""
for comp_name in company_nodes:
company_nodes[comp_name].departments = []
for comp_name, dept_name in iterate_company_department_pairs(companies):
dept = dept_nodes.get(dept_name)
company = company_nodes.get(comp_name)
if not dept or not company:
continue
company.departments.append(dept)
def attach_employees_to_departments(
groups: Mapping[str, list[str]],
people_nodes: Mapping[str, Person],
dept_nodes: Mapping[str, Department],
) -> None:
"""Attach employees to departments."""
for dept in dept_nodes.values():
dept.employees = []
for dept_name, names in groups.items():
unique_names = remove_duplicates_preserve_order(names)
target = dept_nodes.get(dept_name)
if not target:
continue
employees = [people_nodes[n] for n in unique_names if n in people_nodes]
target.employees = employees
def build_companies(payloads: Iterable[Mapping[str, Any]]) -> list[Company]:
"""Build company nodes from payloads."""
people = collect_people(payloads)
companies = collect_companies(payloads)
people_nodes = build_people_nodes(people)
groups = group_people_by_department(people)
dept_names = collect_declared_departments(groups, companies)
dept_nodes = build_department_nodes(dept_names)
company_type = CompanyType()
company_nodes = build_company_nodes(companies, company_type)
attach_departments_to_companies(companies, dept_nodes, company_nodes)
attach_employees_to_departments(groups, people_nodes, dept_nodes)
result = list(company_nodes.values())
return result
def load_default_payload() -> list[Mapping[str, Any]]:
"""Load the default payload from data files."""
companies = load_json_file(COMPANIES_JSON)
people = load_json_file(PEOPLE_JSON)
payload = [{"companies": companies, "people": people}]
return payload
def ingest_payloads(data: List[Any] | None) -> list[Company]:
"""Ingest payloads and build company nodes."""
if not data or data == [None]:
data = load_default_payload()
companies = build_companies(data)
return companies
async def execute_pipeline() -> None:
"""Execute Cognee pipeline."""
# Configure system paths
logging.info("Configuring Cognee directories at %s", COGNEE_DIR)
config.system_root_directory(str(COGNEE_DIR))
ARTIFACTS_DIR.mkdir(parents=True, exist_ok=True)
# Reset state and initialize
await prune.prune_system(metadata=True)
await setup()
# Get user and dataset
user = await get_default_user()
datasets = await load_or_create_datasets(["demo_dataset"], [], user)
dataset_id = datasets[0].id
# Build and run pipeline
tasks = [Task(ingest_payloads), Task(add_data_points)]
pipeline = run_tasks(tasks, dataset_id, None, user, "demo_pipeline")
async for status in pipeline:
logging.info("Pipeline status: %s", status)
# Post-process: index graph edges and visualize
await index_graph_edges()
await visualize_graph(str(GRAPH_HTML))
# Run query against graph
completion = await search(
query_text="Who works for GreenFuture Solutions?",
query_type=SearchType.GRAPH_COMPLETION,
)
logging.info("Graph completion result: %s", completion)
def configure_logging() -> None:
"""Configure logging."""
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s | %(levelname)s | %(message)s",
)
async def main() -> None:
"""Run main function."""
configure_logging()
try:
await execute_pipeline()
except Exception:
logging.exception("Run failed")
raise
if __name__ == "__main__":
asyncio.run(main())
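The pure helpers in the demo above (`remove_duplicates_preserve_order`, `group_people_by_department`) are easy to exercise in isolation. A quick standalone check, reproducing the two helpers verbatim with a small hypothetical sample (the names below are not from the example data):

```python
from collections import defaultdict


def remove_duplicates_preserve_order(seq):
    """Return list with duplicates removed while preserving order."""
    seen = set()
    out = []
    for x in seq:
        if x in seen:
            continue
        seen.add(x)
        out.append(x)
    return out


def group_people_by_department(people):
    """Group person names by department."""
    groups = defaultdict(list)
    for person in people:
        name = person.get("name")
        if not name:
            continue
        groups[person.get("department", "Unknown")].append(name)
    return groups


people = [
    {"name": "Ana", "department": "R&D"},
    {"name": "Bo", "department": "Sales"},
    {"name": "Ana", "department": "R&D"},  # duplicate entry
    {"name": "Cy"},  # no department -> grouped under "Unknown"
]
groups = group_people_by_department(people)
print({dept: remove_duplicates_preserve_order(names) for dept, names in groups.items()})
# -> {'R&D': ['Ana'], 'Sales': ['Bo'], 'Unknown': ['Cy']}
```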

@@ -0,0 +1,109 @@
[{
"id": "customer_1",
"name": "John Doe",
"preferences": [{
"id": "preference_1",
"name": "ShoeSize",
"value": "40.5"
}, {
"id": "preference_2",
"name": "Color",
"value": "Navy Blue"
}, {
"id": "preference_3",
"name": "Color",
"value": "White"
}, {
"id": "preference_4",
"name": "ShoeType",
"value": "Regular Sneakers"
}],
"products": [{
"id": "product_1",
"name": "Sneakers",
"price": 79.99,
"colors": ["Blue", "Brown"],
"type": "Regular Sneakers",
"action": "purchased"
}, {
"id": "product_2",
"name": "Shirt",
"price": 19.99,
"colors": ["Black"],
"type": "T-Shirt",
"action": "liked"
}, {
"id": "product_3",
"name": "Jacket",
"price": 59.99,
"colors": ["Gray", "White"],
"type": "Jacket",
"action": "purchased"
}, {
"id": "product_4",
"name": "Shoes",
"price": 129.99,
"colors": ["Red", "Black"],
"type": "Formal Shoes",
"action": "liked"
}]
}, {
"id": "customer_2",
"name": "Jane Smith",
"preferences": [{
"id": "preference_5",
"name": "ShoeSize",
"value": "38.5"
}, {
"id": "preference_6",
"name": "Color",
"value": "Black"
}, {
"id": "preference_7",
"name": "ShoeType",
"value": "Slip-On"
}],
"products": [{
"id": "product_5",
"name": "Sneakers",
"price": 69.99,
"colors": ["Blue", "White"],
"type": "Slip-On",
"action": "purchased"
}, {
"id": "product_6",
"name": "Shirt",
"price": 14.99,
"colors": ["Red", "Blue"],
"type": "T-Shirt",
"action": "purchased"
}, {
"id": "product_7",
"name": "Jacket",
"price": 49.99,
"colors": ["Gray", "Black"],
"type": "Jacket",
"action": "liked"
}]
}, {
"id": "customer_3",
"name": "Michael Johnson",
"preferences": [{
"id": "preference_8",
"name": "Color",
"value": "Red"
}, {
"id": "preference_9",
"name": "ShoeType",
"value": "Boots"
}],
"products": [{
"id": "product_8",
"name": "Cowboy Boots",
"price": 299.99,
"colors": ["Red", "White"],
"type": "Cowboy Boots",
"action": "purchased"
}]
}]

@@ -0,0 +1,193 @@
import os
import json
import asyncio
from neo4j import exceptions
from cognee import prune
# from cognee import visualize_graph
from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.low_level import setup, DataPoint
from cognee.pipelines import run_tasks, Task
from cognee.tasks.storage import add_data_points
class Products(DataPoint):
name: str = "Products"
products_aggregator_node = Products()
class Product(DataPoint):
id: str
name: str
type: str
price: float
colors: list[str]
is_type: Products = products_aggregator_node
class Preferences(DataPoint):
name: str = "Preferences"
preferences_aggregator_node = Preferences()
class Preference(DataPoint):
id: str
name: str
value: str
is_type: Preferences = preferences_aggregator_node
class Customers(DataPoint):
name: str = "Customers"
customers_aggregator_node = Customers()
class Customer(DataPoint):
id: str
name: str
has_preference: list[Preference]
purchased: list[Product]
liked: list[Product]
is_type: Customers = customers_aggregator_node
def ingest_files():
customers_file_path = os.path.join(os.path.dirname(__file__), "data/customers.json")
with open(customers_file_path, "r") as customers_file:
    customers = json.load(customers_file)
customers_data_points = {}
products_data_points = {}
preferences_data_points = {}
for customer in customers:
new_customer = Customer(
id=customer["id"],
name=customer["name"],
liked=[],
purchased=[],
has_preference=[],
)
customers_data_points[customer["name"]] = new_customer
for product in customer["products"]:
if product["id"] not in products_data_points:
products_data_points[product["id"]] = Product(
id=product["id"],
type=product["type"],
name=product["name"],
price=product["price"],
colors=product["colors"],
)
new_product = products_data_points[product["id"]]
if product["action"] == "purchased":
new_customer.purchased.append(new_product)
elif product["action"] == "liked":
new_customer.liked.append(new_product)
for preference in customer["preferences"]:
if preference["id"] not in preferences_data_points:
preferences_data_points[preference["id"]] = Preference(
id=preference["id"],
name=preference["name"],
value=preference["value"],
)
new_preference = preferences_data_points[preference["id"]]
new_customer.has_preference.append(new_preference)
return list(customers_data_points.values())
async def main():
await prune.prune_data()
await prune.prune_system(metadata=True)
await setup()
pipeline = run_tasks([Task(ingest_files), Task(add_data_points)])
async for status in pipeline:
print(status)
graph_engine = await get_graph_engine()
products_results = await graph_engine.query(
"""
// Step 1: Use the new customer's preferences from input
UNWIND $preferences AS pref_input
// Step 2: Find other customers who have these preferences
MATCH (other_customer:Customer)-[:has_preference]->(preference:Preference)
WHERE preference.value = pref_input
WITH other_customer, count(preference) AS similarity_score
// Step 3: Limit to the top-N most similar customers
ORDER BY similarity_score DESC
LIMIT 5
// Step 4: Get products that these similar customers have purchased
MATCH (other_customer)-[:purchased]->(product:Product)
// Step 5: Rank products based on frequency
RETURN product, count(*) AS recommendation_score
ORDER BY recommendation_score DESC
LIMIT 10
""",
{
"preferences": ["White", "Navy Blue", "Regular Sneakers"],
},
)
print("Top 10 recommended products:")
for result in products_results:
print(f"{result['product']['id']}: {result['product']['name']}")
try:
await graph_engine.query(
"""
// Match the customer and their stored shoe size preference
MATCH (customer:Customer {id: $customer_id})
OPTIONAL MATCH (customer)-[:has_preference]->(preference:Preference {name: 'ShoeSize'})
// Assume the new shoe size is passed as a parameter $new_size
WITH customer, preference, $new_size AS new_size
// If a stored preference exists and it does not match the new value,
// raise an error using APOC's utility procedure.
CALL apoc.util.validate(
preference IS NOT NULL AND preference.value <> new_size,
"Conflicting shoe size preference: existing size is " + preference.value + " and new size is " + new_size,
[]
)
// If no conflict, continue with the update or further processing
// ...
RETURN customer
""",
{
"customer_id": "customer_1",
"new_size": "42",
},
)
except exceptions.ClientError as error:
print(f"Anomaly detected: {str(error.message)}")
# # Or use our simple graph preview
# graph_file_path = str(
# os.path.join(os.path.dirname(__file__), ".artifacts/graph_visualization.html")
# )
# await visualize_graph(graph_file_path)
if __name__ == "__main__":
asyncio.run(main())
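The Cypher query above is a small collaborative-filtering loop: score other customers by preference overlap, keep the top-N, then rank what they purchased. The same ranking logic in plain Python, for intuition only (the records below are a trimmed, hypothetical stand-in for `customers.json`, not the Neo4j-backed pipeline):

```python
from collections import Counter

customers = [
    {
        "name": "John Doe",
        "preferences": ["Navy Blue", "White", "Regular Sneakers"],
        "purchased": ["Sneakers", "Jacket"],
    },
    {
        "name": "Jane Smith",
        "preferences": ["Black", "Slip-On"],
        "purchased": ["Sneakers", "Shirt"],
    },
    {
        "name": "Michael Johnson",
        "preferences": ["Red", "Boots"],
        "purchased": ["Cowboy Boots"],
    },
]


def recommend(new_preferences, customers, top_customers=5, top_products=10):
    # Steps 1-3: score customers by overlap with the new preferences,
    # keep the top-N with a nonzero similarity score.
    wanted = set(new_preferences)
    similar = sorted(
        customers,
        key=lambda c: len(wanted & set(c["preferences"])),
        reverse=True,
    )[:top_customers]
    similar = [c for c in similar if wanted & set(c["preferences"])]
    # Steps 4-5: rank products purchased by those similar customers by frequency.
    scores = Counter(p for c in similar for p in c["purchased"])
    return [product for product, _ in scores.most_common(top_products)]


print(recommend(["White", "Navy Blue", "Regular Sneakers"], customers))
# -> ['Sneakers', 'Jacket']  (only John Doe overlaps with these preferences)
```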

@@ -0,0 +1,110 @@
from pathlib import Path
import asyncio
import os
import cognee
from cognee.infrastructure.databases.relational.config import get_migration_config
from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.infrastructure.databases.relational import (
get_migration_relational_engine,
)
from cognee.modules.search.types import SearchType
from cognee.infrastructure.databases.relational import (
create_db_and_tables as create_relational_db_and_tables,
)
from cognee.infrastructure.databases.vector.pgvector import (
create_db_and_tables as create_vector_db_and_tables,
)
# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
# LLM_API_KEY = "your_key_here"
# 3. Fill in all relevant MIGRATION_DB settings for the database you want to migrate into the Cognee graph
# NOTE: If you don't have a DB you want to migrate you can try it out with our
# test database at the following location:
# MIGRATION_DB_PATH="/{path_to_your_local_cognee}/cognee/tests/test_data"
# MIGRATION_DB_NAME="migration_database.sqlite"
# MIGRATION_DB_PROVIDER="sqlite"
async def main():
# Clean all data stored in Cognee
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
# Needed to create appropriate database tables only on the Cognee side
await create_relational_db_and_tables()
await create_vector_db_and_tables()
# In case environment variables are not set use the example database from the Cognee repo
migration_db_provider = os.environ.get("MIGRATION_DB_PROVIDER", "sqlite")
migration_db_path = os.environ.get(
"MIGRATION_DB_PATH",
os.path.join(Path(__file__).resolve().parent.parent.parent, "cognee/tests/test_data"),
)
migration_db_name = os.environ.get("MIGRATION_DB_NAME", "migration_database.sqlite")
migration_config = get_migration_config()
migration_config.migration_db_provider = migration_db_provider
migration_config.migration_db_path = migration_db_path
migration_config.migration_db_name = migration_db_name
engine = get_migration_relational_engine()
print("\nExtracting schema of database to migrate.")
schema = await engine.extract_schema()
print(f"Migrated database schema:\n{schema}")
graph = await get_graph_engine()
print("Migrating relational database to graph database based on schema.")
from cognee.tasks.ingestion import migrate_relational_database
await migrate_relational_database(graph, schema=schema)
print("Relational database migration complete.")
# Make sure to set top_k to a high value for a broader search; the default is only 10.
# top_k represents the number of graph triplets supplied to the LLM to answer your question.
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text="What kind of data do you contain?",
top_k=200,
)
print(f"Search results: {search_results}")
# Setting top_k too high might overwhelm the LLM context when specific questions need to be answered.
# For this kind of question we've set top_k to 50.
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text="What invoices are related to Leonie Köhler?",
top_k=50,
)
print(f"Search results: {search_results}")
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text="What invoices are related to Luís Gonçalves?",
top_k=50,
)
print(f"Search results: {search_results}")
# If you check the relational database for this example you can see that the search results successfully found all
# the invoices related to the two customers, without any hallucinations or additional information
# Define location where to store html visualization of graph of the migrated database
home_dir = os.path.expanduser("~")
destination_file_path = os.path.join(home_dir, "graph_visualization.html")
print("Adding html visualization of graph database after migration.")
await visualize_graph(destination_file_path)
print(f"Visualization can be found at: {destination_file_path}")
if __name__ == "__main__":
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(main())
finally:
loop.run_until_complete(loop.shutdown_asyncgens())

@@ -0,0 +1,98 @@
import asyncio
import cognee
from cognee import visualize_graph
from cognee.memify_pipelines.persist_sessions_in_knowledge_graph import (
persist_sessions_in_knowledge_graph_pipeline,
)
from cognee.modules.search.types import SearchType
from cognee.modules.users.methods import get_default_user
from cognee.shared.logging_utils import get_logger
logger = get_logger("conversation_session_persistence_example")
async def main():
# NOTE: CACHING has to be enabled for this example to work
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
text_1 = "Cognee is a solution that can build knowledge graph from text, creating an AI memory system"
text_2 = "Germany is a country located next to the Netherlands"
await cognee.add([text_1, text_2])
await cognee.cognify()
question = "What can I use to create a knowledge graph?"
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text=question,
)
print("\nSession ID: default_session")
print(f"Question: {question}")
print(f"Answer: {search_results}\n")
question = "You sure about that?"
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION, query_text=question
)
print("\nSession ID: default_session")
print(f"Question: {question}")
print(f"Answer: {search_results}\n")
question = "This is awesome!"
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION, query_text=question
)
print("\nSession ID: default_session")
print(f"Question: {question}")
print(f"Answer: {search_results}\n")
question = "Where is Germany?"
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text=question,
session_id="different_session",
)
print("\nSession ID: different_session")
print(f"Question: {question}")
print(f"Answer: {search_results}\n")
question = "Next to which country again?"
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text=question,
session_id="different_session",
)
print("\nSession ID: different_session")
print(f"Question: {question}")
print(f"Answer: {search_results}\n")
question = "So you remember everything I asked from you?"
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text=question,
session_id="different_session",
)
print("\nSession ID: different_session")
print(f"Question: {question}")
print(f"Answer: {search_results}\n")
session_ids_to_persist = ["default_session", "different_session"]
default_user = await get_default_user()
await persist_sessions_in_knowledge_graph_pipeline(
user=default_user,
session_ids=session_ids_to_persist,
)
await visualize_graph()
if __name__ == "__main__":
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(main())
finally:
loop.run_until_complete(loop.shutdown_asyncgens())

@@ -0,0 +1,6 @@
"""
Core Features Getting Started Example
Reference: https://colab.research.google.com/drive/12Vi9zID-M3fpKpKiaqDBvkk98ElkRPWy?usp=sharing
"""

@@ -0,0 +1,87 @@
import os
import asyncio
import pathlib
from cognee import config, add, cognify, search, SearchType, prune, visualize_graph
from cognee.low_level import DataPoint
async def main():
data_directory_path = str(
pathlib.Path(os.path.join(pathlib.Path(__file__).parent, ".data_storage")).resolve()
)
# Set up the data directory. Cognee will store files here.
config.data_root_directory(data_directory_path)
cognee_directory_path = str(
pathlib.Path(os.path.join(pathlib.Path(__file__).parent, ".cognee_system")).resolve()
)
# Set up the Cognee system directory. Cognee will store system files and databases here.
config.system_root_directory(cognee_directory_path)
# Prune data and system metadata before running, only if we want "fresh" state.
await prune.prune_data()
await prune.prune_system(metadata=True)
text = "The Python programming language is widely used in data analysis, web development, and machine learning."
# Add the text data to Cognee.
await add(text)
# Define a custom graph model for programming languages.
class FieldType(DataPoint):
name: str = "Field"
class Field(DataPoint):
name: str
is_type: FieldType
metadata: dict = {"index_fields": ["name"]}
class ProgrammingLanguageType(DataPoint):
name: str = "Programming Language"
class ProgrammingLanguage(DataPoint):
name: str
used_in: list[Field] = []
is_type: ProgrammingLanguageType
metadata: dict = {"index_fields": ["name"]}
# Cognify the text data.
await cognify(graph_model=ProgrammingLanguage)
# Or use our simple graph preview
graph_file_path = str(
pathlib.Path(
os.path.join(pathlib.Path(__file__).parent, ".artifacts/graph_visualization.html")
).resolve()
)
await visualize_graph(graph_file_path)
# Completion query that uses graph data to form context.
graph_completion = await search(
query_text="What is python?", query_type=SearchType.GRAPH_COMPLETION
)
print("Graph completion result is:")
print(graph_completion)
# Completion query that uses document chunks to form context.
rag_completion = await search(
query_text="What is Python?", query_type=SearchType.RAG_COMPLETION
)
print("RAG completion result is:")
print(rag_completion)
# Query all summaries related to query.
summaries = await search(query_text="Python", query_type=SearchType.SUMMARIES)
print("Summary results are:")
for summary in summaries:
print(summary)
chunks = await search(query_text="Python", query_type=SearchType.CHUNKS)
print("Chunk results are:")
for chunk in chunks:
print(chunk)
if __name__ == "__main__":
asyncio.run(main())

@@ -0,0 +1,7 @@
"""
Custom Prompt Example
Reference: https://docs.cognee.ai/guides/custom-prompts
"""

@@ -0,0 +1,4 @@
"""
Direct LLM Call for Structured Output Example
Reference: https://docs.cognee.ai/guides/low-level-llm
"""

@@ -0,0 +1,115 @@
import asyncio
from os import path
from typing import Any
from pydantic import SkipValidation
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.infrastructure.engine import DataPoint
from cognee.infrastructure.engine.models.Edge import Edge
from cognee.tasks.storage import add_data_points
import cognee
class Employee(DataPoint):
name: str
role: str
class Company(DataPoint):
name: str
industry: str
employs: SkipValidation[Any] # Mixed list: employees with/without weights
async def main():
# Clear the database for a clean state
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
# Create employees
michael = Employee(name="Michael", role="Regional Manager")
dwight = Employee(name="Dwight", role="Assistant to the Regional Manager")
jim = Employee(name="Jim", role="Sales Representative")
pam = Employee(name="Pam", role="Receptionist")
kevin = Employee(name="Kevin", role="Accountant")
angela = Employee(name="Angela", role="Senior Accountant")
oscar = Employee(name="Oscar", role="Accountant")
stanley = Employee(name="Stanley", role="Sales Representative")
phyllis = Employee(name="Phyllis", role="Sales Representative")
# Create Dunder Mifflin with mixed employee relationships
dunder_mifflin = Company(
name="Dunder Mifflin Paper Company",
industry="Paper Sales",
employs=[
# Manager with high authority weight
(Edge(weight=0.9, relationship_type="manager"), michael),
# Sales team with performance weights
(
Edge(weights={"sales_performance": 0.8, "loyalty": 0.9}, relationship_type="sales"),
dwight,
),
(
Edge(
weights={"sales_performance": 0.7, "creativity": 0.8}, relationship_type="sales"
),
jim,
),
(
Edge(
weights={"sales_performance": 0.6, "customer_service": 0.9},
relationship_type="sales",
),
phyllis,
),
(
Edge(
weights={"sales_performance": 0.5, "experience": 0.8}, relationship_type="sales"
),
stanley,
),
# Accounting department as a group
(
Edge(
weights={"department_efficiency": 0.8, "team_cohesion": 0.9},
relationship_type="accounting",
),
[oscar, kevin, angela],
),
# Admin staff without weights (simple relationships)
pam,
],
)
all_data_points = [
michael,
dwight,
jim,
pam,
kevin,
angela,
oscar,
stanley,
phyllis,
dunder_mifflin,
]
# Add data points to the graph
await add_data_points(all_data_points)
# Visualize the graph
graph_visualization_path = path.join(path.dirname(__file__), "dunder_mifflin_graph.html")
await visualize_graph(graph_visualization_path)
print("Dynamic multiple edges graph has been created and visualized!")
print(f"Visualization saved to: {graph_visualization_path}")
print("\nTechnical features demonstrated:")
print("- Mixed list support: weighted and unweighted relationships in single field")
print("- Single weight edges with relationship types")
print("- Multiple weight edges with custom metrics")
print("- Group relationships: single edge connecting multiple nodes")
print("- Simple relationships without edge metadata")
print("- Flexible edge extraction from heterogeneous data structures")
if __name__ == "__main__":
asyncio.run(main())
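The mixed `employs` list above combines bare nodes, `(Edge, node)` tuples, and a single edge pointing at a group of nodes. A sketch of how such a heterogeneous list can be normalized into flat `(edge, target)` pairs; note that `Edge` below is a hypothetical stand-in for cognee's Edge model, not the real class:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Edge:
    """Hypothetical stand-in for cognee's Edge model."""
    relationship_type: str = "employs"
    weight: Optional[float] = None
    weights: dict = field(default_factory=dict)


def flatten_edges(mixed):
    """Yield (edge, target) pairs from a mixed list of bare targets and (Edge, target) tuples."""
    default = Edge()
    for item in mixed:
        if isinstance(item, tuple):
            edge, target = item
        else:
            # Bare node: simple relationship with default edge metadata.
            edge, target = default, item
        # A single edge may point at a group of nodes.
        targets = target if isinstance(target, list) else [target]
        for node in targets:
            yield edge, node


mixed = [
    (Edge(weight=0.9, relationship_type="manager"), "michael"),
    (Edge(weights={"team_cohesion": 0.9}, relationship_type="accounting"), ["oscar", "kevin"]),
    "pam",  # simple relationship, no edge metadata
]
pairs = list(flatten_edges(mixed))
print([(edge.relationship_type, node) for edge, node in pairs])
# -> [('manager', 'michael'), ('accounting', 'oscar'), ('accounting', 'kevin'), ('employs', 'pam')]
```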

@@ -0,0 +1,81 @@
import asyncio
import cognee
from cognee.api.v1.search import SearchType
from cognee.modules.pipelines.tasks.task import Task
from cognee.tasks.graph import extract_graph_from_data
from cognee.tasks.storage import add_data_points
from cognee.shared.data_models import KnowledgeGraph
from cognee.tasks.feedback.extract_feedback_interactions import extract_feedback_interactions
from cognee.tasks.feedback.generate_improved_answers import generate_improved_answers
from cognee.tasks.feedback.create_enrichments import create_enrichments
from cognee.tasks.feedback.link_enrichments_to_feedback import link_enrichments_to_feedback
CONVERSATION = [
"Alice: Hey, Bob. Did you talk to Mallory?",
"Bob: Yeah, I just saw her before coming here.",
"Alice: Then she told you to bring my documents, right?",
"Bob: Uh… not exactly. She said you wanted me to bring you donuts. Which sounded kind of odd…",
"Alice: Ugh, she's so annoying. Thanks for the donuts anyway!",
]
async def initialize_conversation_and_graph(conversation):
"""Prune data/system, add conversation, cognify."""
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
await cognee.add(conversation)
await cognee.cognify()
async def run_question_and_submit_feedback(question_text: str) -> bool:
"""Ask question, submit feedback based on correctness, and return correctness flag."""
result = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION,
query_text=question_text,
save_interaction=True,
)
answer_text = str(result).lower()
mentions_mallory = "mallory" in answer_text
feedback_text = (
"Great answers, very helpful!"
if mentions_mallory
else "The answer about Bob and donuts was wrong."
)
await cognee.search(
query_type=SearchType.FEEDBACK,
query_text=feedback_text,
last_k=1,
)
return mentions_mallory
async def run_feedback_enrichment_memify(last_n: int = 5):
"""Execute memify with extraction, answer improvement, enrichment creation, and graph processing tasks."""
# Instantiate tasks with their own kwargs
extraction_tasks = [Task(extract_feedback_interactions, last_n=last_n)]
enrichment_tasks = [
Task(generate_improved_answers, top_k=20),
Task(create_enrichments),
Task(extract_graph_from_data, graph_model=KnowledgeGraph, task_config={"batch_size": 10}),
Task(add_data_points, task_config={"batch_size": 10}),
Task(link_enrichments_to_feedback),
]
await cognee.memify(
extraction_tasks=extraction_tasks,
enrichment_tasks=enrichment_tasks,
data=[{}], # A placeholder to prevent fetching the entire graph
)
async def main():
await initialize_conversation_and_graph(CONVERSATION)
is_correct = await run_question_and_submit_feedback("Who told Bob to bring the donuts?")
if not is_correct:
await run_feedback_enrichment_memify(last_n=5)
if __name__ == "__main__":
asyncio.run(main())

@@ -0,0 +1,5 @@
"""
Graph Visualization Example
Reference: https://docs.cognee.ai/guides/graph-visualization
"""

Binary file not shown (image, 10 KiB)

@@ -0,0 +1,55 @@
import os
import asyncio
import pathlib
from cognee.shared.logging_utils import setup_logging, ERROR
import cognee
from cognee.api.v1.search import SearchType
# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
# LLM_API_KEY = "your_key_here"
async def main():
# Create a clean slate for cognee -- reset data and system state
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
# The cognee knowledge graph will be created from the text
# and descriptions of these files
mp3_file_path = os.path.join(
pathlib.Path(__file__).parent,
"data/text_to_speech.mp3",
)
png_file_path = os.path.join(
pathlib.Path(__file__).parent,
"data/example.png",
)
# Add the files and make them available for cognify
await cognee.add([mp3_file_path, png_file_path])
# Use LLMs and cognee to create knowledge graph
await cognee.cognify()
# Query cognee for summaries of the data in the multimedia files
search_results = await cognee.search(
query_type=SearchType.SUMMARIES,
query_text="What is in the multimedia files?",
)
# Display search results
for result_text in search_results:
print(result_text)
if __name__ == "__main__":
logger = setup_logging(log_level=ERROR)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(main())
finally:
loop.run_until_complete(loop.shutdown_asyncgens())


@@ -0,0 +1,44 @@
import os
import asyncio

import cognee
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.shared.logging_utils import setup_logging, ERROR

text_a = """
AI is revolutionizing financial services through intelligent fraud detection
and automated customer service platforms.
"""

text_b = """
Advances in AI are enabling smarter systems that learn and adapt over time.
"""

text_c = """
MedTech startups have seen significant growth in recent years, driven by innovation
in digital health and medical devices.
"""

node_set_a = ["AI", "FinTech"]
node_set_b = ["AI"]
node_set_c = ["MedTech"]


async def main():
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    await cognee.add(text_a, node_set=node_set_a)
    await cognee.add(text_b, node_set=node_set_b)
    await cognee.add(text_c, node_set=node_set_c)

    await cognee.cognify()

    visualization_path = os.path.join(
        os.path.dirname(__file__), "./.artifacts/graph_visualization.html"
    )
    await visualize_graph(visualization_path)


if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)
    asyncio.run(main())


@@ -0,0 +1,313 @@
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
xmlns:ns1="http://example.org/ontology#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>
<rdf:Description rdf:about="http://example.org/ontology#Type2Diabetes">
<rdf:type rdf:resource="http://example.org/ontology#Disease"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>A chronic condition that affects how the body processes glucose.</rdfs:comment>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Obesity"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#PoorDiet"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#SedentaryLifestyle"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Genetics"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#WeightLoss"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#Exercise"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#ModerateCoffeeConsumption"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#IncreasedThirst"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#FrequentUrination"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Fatigue"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#InsulinResistance">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#HighBloodSugar">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#hasPreventiveFactor">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
<rdfs:domain rdf:resource="http://example.org/ontology#Disease"/>
<rdfs:range rdf:resource="http://example.org/ontology#PreventiveFactor"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Hypertension">
<rdf:type rdf:resource="http://example.org/ontology#Disease"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdfs:comment>A condition where the force of blood against artery walls is too high.</rdfs:comment>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#HighSaltIntake"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Obesity"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Stress"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Genetics"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#LowSodiumDiet"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#Exercise"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#ModerateCoffeeConsumption"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Headache"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Dizziness"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#BlurredVision"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Cancer">
<rdf:type rdf:resource="http://example.org/ontology#Disease"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>A disease of abnormal cell growth with potential to invade or spread.</rdfs:comment>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Smoking"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Radiation"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Infections"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#GeneticMutations"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#Screening"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#HealthyDiet"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#ModerateCoffeeConsumption"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#UnexplainedWeightLoss"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Fatigue"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#LumpFormation"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#hasRiskFactor">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
<rdfs:domain rdf:resource="http://example.org/ontology#Disease"/>
<rdfs:range rdf:resource="http://example.org/ontology#RiskFactor"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#LowSodiumDiet">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#MetabolicSyndrome">
<rdf:type rdf:resource="http://example.org/ontology#Disease"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>A cluster of conditions increasing the risk of heart disease and diabetes.</rdfs:comment>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Obesity"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#InsulinResistance"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Hypertension"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#HealthyDiet"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#PhysicalActivity"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#GreenCoffeeBlend"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#IncreasedWaistCircumference"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#HighBloodSugar"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Fatigue"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#ShortnessofBreath">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Disease">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:comment>Disease is a concept used to classify relevant medical terms.</rdfs:comment>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#HeartDisease">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Screening">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#AtrialFibrillation">
<rdf:type rdf:resource="http://example.org/ontology#Disease"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>An irregular and often rapid heart rhythm that may cause blood clots.</rdfs:comment>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#HeartDisease"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#HighBloodPressure"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#AlcoholUse"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#HealthyDiet"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#Exercise"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#ModerateCoffeeConsumption"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Palpitations"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Weakness"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#ShortnessofBreath"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Genetics">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#ChestPain">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#PhysicalActivity">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#CardiovascularDisease">
<rdf:type rdf:resource="http://example.org/ontology#Disease"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>A class of diseases that involve the heart or blood vessels.</rdfs:comment>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Smoking"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#HighBloodPressure"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#HighCholesterol"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Diabetes"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Obesity"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#PhysicalActivity"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#MediterraneanDiet"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#ModerateCoffeeConsumption"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#ChestPain"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#ShortnessofBreath"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Fatigue"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#HealthyDiet">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#HighSaltIntake">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Diabetes">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Palpitations">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Headache">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#BloodPressureControl">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#PreventiveFactor">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:comment>PreventiveFactor is a concept used to classify relevant medical terms.</rdfs:comment>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Symptom">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:comment>Symptom is a concept used to classify relevant medical terms.</rdfs:comment>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#MediterraneanDiet">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#HighBloodPressure">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#IncreasedThirst">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Swelling">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#BlurredVision">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#HeartFailure">
<rdf:type rdf:resource="http://example.org/ontology#Disease"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>A condition in which the heart is unable to pump sufficiently.</rdfs:comment>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#CoronaryArteryDisease"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Hypertension"/>
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Diabetes"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#BloodPressureControl"/>
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#HealthyLifestyle"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#ShortnessofBreath"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Swelling"/>
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Fatigue"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#UnexplainedWeightLoss">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Fatigue">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Infections">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#IncreasedWaistCircumference">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#HealthyLifestyle">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#SedentaryLifestyle">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#GreenCoffeeBlend">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#ModerateCoffeeConsumption">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Obesity">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#HighCholesterol">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#GeneticMutations">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#AlcoholUse">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Dizziness">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Radiation">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#LumpFormation">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#PoorDiet">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Smoking">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Stress">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#hasSymptom">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
<rdfs:domain rdf:resource="http://example.org/ontology#Disease"/>
<rdfs:range rdf:resource="http://example.org/ontology#Symptom"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#CoronaryArteryDisease">
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Exercise">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Weakness">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#WeightLoss">
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#FrequentUrination">
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#RiskFactor">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:comment>RiskFactor is a concept used to classify relevant medical terms.</rdfs:comment>
</rdf:Description>
</rdf:RDF>
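Before handing an ontology like the one above to the resolver, a quick stdlib sanity check can confirm the RDF/XML parses and count its `owl:NamedIndividual` nodes. A minimal sketch using `xml.etree.ElementTree`, with a two-node excerpt inlined for illustration:

```python
import xml.etree.ElementTree as ET

# Clark-notation namespace prefixes used by the ontology file.
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
OWL_INDIVIDUAL = "http://www.w3.org/2002/07/owl#NamedIndividual"

# A two-node excerpt of the ontology above, inlined for illustration.
sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/ontology#Obesity">
    <rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Disease">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  </rdf:Description>
</rdf:RDF>"""


def count_individuals(xml_text: str) -> int:
    """Count rdf:Description nodes typed as owl:NamedIndividual."""
    root = ET.fromstring(xml_text)
    count = 0
    for desc in root.findall(f"{RDF}Description"):
        types = {t.get(f"{RDF}resource") for t in desc.findall(f"{RDF}type")}
        if OWL_INDIVIDUAL in types:
            count += 1
    return count


print(count_individuals(sample))  # → 1
```

For full RDF semantics (e.g. resolving `rdfs:subClassOf` or property domains), use `rdflib`, which is what the `RDFLibOntologyResolver` in the example script builds on.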


@@ -0,0 +1,110 @@
import os
import asyncio
import textwrap

import cognee
from cognee.shared.logging_utils import setup_logging
from cognee.api.v1.search import SearchType
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.modules.ontology.rdf_xml.RDFLibOntologyResolver import RDFLibOntologyResolver
from cognee.modules.ontology.ontology_config import Config


async def run_pipeline(ontology_path=None):
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    scientific_papers_dir = os.path.join(
        os.path.dirname(os.path.abspath(__file__)), "data/scientific_papers/"
    )
    await cognee.add(scientific_papers_dir)

    config: Config = {
        "ontology_config": {
            "ontology_resolver": RDFLibOntologyResolver(ontology_file=ontology_path)
        }
    }
    pipeline_run = await cognee.cognify(config=config)
    return pipeline_run


async def query_pipeline(questions):
    answers = []
    for question in questions:
        search_results = await cognee.search(
            query_type=SearchType.GRAPH_COMPLETION,
            query_text=question,
        )
        answers.append(search_results)
    return answers


def print_comparison_table(questions, answers_with, answers_without, col_width=45):
    separator = "-" * (col_width * 3 + 6)
    header = (
        f"{'Question'.ljust(col_width)} | "
        f"{'WITH Ontology (owl grounded facts)'.ljust(col_width)} | "
        f"{'WITHOUT Ontology'.ljust(col_width)}"
    )
    logger.info(separator)
    logger.info(header)
    logger.info(separator)

    for q, with_o, without_o in zip(questions, answers_with, answers_without):
        q_lines = textwrap.fill(q, width=col_width).split("\n")
        with_lines = textwrap.fill(str(with_o), width=col_width).split("\n")
        without_lines = textwrap.fill(str(without_o), width=col_width).split("\n")

        max_lines = max(len(q_lines), len(with_lines), len(without_lines))
        for i in range(max_lines):
            q_line = q_lines[i] if i < len(q_lines) else ""
            with_line = with_lines[i] if i < len(with_lines) else ""
            without_line = without_lines[i] if i < len(without_lines) else ""
            logger.info(
                f"{q_line.ljust(col_width)} | {with_line.ljust(col_width)} | {without_line.ljust(col_width)}"
            )
        logger.info(separator)


async def main():
    questions = [
        "What are common risk factors for Type 2 Diabetes?",
        "What preventive measures reduce the risk of Hypertension?",
        "What symptoms indicate possible Cardiovascular Disease?",
        "I have blurred vision and a headache. What disease do I have?",
        "What diseases are associated with Obesity?",
    ]

    ontology_path = os.path.join(
        os.path.dirname(os.path.abspath(__file__)),
        "data/enriched_medical_ontology_with_classes.owl",
    )

    logger.info("\n--- Generating answers WITH ontology ---\n")
    await run_pipeline(ontology_path=ontology_path)
    answers_with_ontology = await query_pipeline(questions)

    logger.info("\n--- Generating answers WITHOUT ontology ---\n")
    await run_pipeline()
    answers_without_ontology = await query_pipeline(questions)

    print_comparison_table(questions, answers_with_ontology, answers_without_ontology)

    await visualize_graph()


if __name__ == "__main__":
    logger = setup_logging()

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())


@@ -0,0 +1,290 @@
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
xmlns:ns1="http://example.org/ontology#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>
<rdf:Description rdf:about="http://example.org/ontology#Volkswagen">
<rdfs:comment>Created for making cars accessible to everyone.</rdfs:comment>
<ns1:produces rdf:resource="http://example.org/ontology#VW_Golf"/>
<ns1:produces rdf:resource="http://example.org/ontology#VW_ID4"/>
<ns1:produces rdf:resource="http://example.org/ontology#VW_Touareg"/>
<rdf:type rdf:resource="http://example.org/ontology#CarManufacturer"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Azure">
<rdf:type rdf:resource="http://example.org/ontology#CloudServiceProvider"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Porsche">
<ns1:produces rdf:resource="http://example.org/ontology#Porsche_Cayenne"/>
<ns1:produces rdf:resource="http://example.org/ontology#Porsche_Taycan"/>
<ns1:produces rdf:resource="http://example.org/ontology#Porsche_911"/>
<rdf:type rdf:resource="http://example.org/ontology#CarManufacturer"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>Famous for high-performance sports cars.</rdfs:comment>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Meta">
<rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<ns1:develops rdf:resource="http://example.org/ontology#Instagram"/>
<ns1:develops rdf:resource="http://example.org/ontology#Facebook"/>
<ns1:develops rdf:resource="http://example.org/ontology#Oculus"/>
<ns1:develops rdf:resource="http://example.org/ontology#WhatsApp"/>
<rdfs:comment>Pioneering social media and virtual reality technology.</rdfs:comment>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#TechnologyCompany">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Apple">
<rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>Known for its innovative consumer electronics and software.</rdfs:comment>
<ns1:develops rdf:resource="http://example.org/ontology#iPad"/>
<ns1:develops rdf:resource="http://example.org/ontology#iPhone"/>
<ns1:develops rdf:resource="http://example.org/ontology#AppleWatch"/>
<ns1:develops rdf:resource="http://example.org/ontology#MacBook"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Audi">
<ns1:produces rdf:resource="http://example.org/ontology#Audi_eTron"/>
<ns1:produces rdf:resource="http://example.org/ontology#Audi_R8"/>
<ns1:produces rdf:resource="http://example.org/ontology#Audi_A8"/>
<rdf:type rdf:resource="http://example.org/ontology#CarManufacturer"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>Known for its modern designs and technology.</rdfs:comment>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#AmazonEcho">
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Porsche_Taycan">
<rdf:type rdf:resource="http://example.org/ontology#ElectricCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#BMW">
<ns1:produces rdf:resource="http://example.org/ontology#BMW_7Series"/>
<ns1:produces rdf:resource="http://example.org/ontology#BMW_M4"/>
<ns1:produces rdf:resource="http://example.org/ontology#BMW_iX"/>
<rdf:type rdf:resource="http://example.org/ontology#CarManufacturer"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>Focused on performance and driving pleasure.</rdfs:comment>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#VW_Touareg">
<rdf:type rdf:resource="http://example.org/ontology#SUV"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#SportsCar">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#ElectricCar">
<rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Google">
<ns1:develops rdf:resource="http://example.org/ontology#GooglePixel"/>
<ns1:develops rdf:resource="http://example.org/ontology#GoogleCloud"/>
<ns1:develops rdf:resource="http://example.org/ontology#Android"/>
<ns1:develops rdf:resource="http://example.org/ontology#GoogleSearch"/>
<rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>Started as a search engine and expanded into cloud computing and AI.</rdfs:comment>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#AmazonPrime">
<rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Car">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#WindowsOS">
<rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Android">
<rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Oculus">
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#GoogleCloud">
<rdf:type rdf:resource="http://example.org/ontology#CloudServiceProvider"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Microsoft">
<ns1:develops rdf:resource="http://example.org/ontology#Surface"/>
<ns1:develops rdf:resource="http://example.org/ontology#WindowsOS"/>
<ns1:develops rdf:resource="http://example.org/ontology#Azure"/>
<ns1:develops rdf:resource="http://example.org/ontology#Xbox"/>
<rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdfs:comment>Dominant in software, cloud computing, and gaming.</rdfs:comment>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#GoogleSearch">
<rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Mercedes_SClass">
<rdf:type rdf:resource="http://example.org/ontology#LuxuryCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Audi_A8">
<rdf:type rdf:resource="http://example.org/ontology#LuxuryCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Sedan">
<rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#VW_Golf">
<rdf:type rdf:resource="http://example.org/ontology#Sedan"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Facebook">
<rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#WhatsApp">
<rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#produces">
<rdfs:domain rdf:resource="http://example.org/ontology#CarManufacturer"/>
<rdfs:range rdf:resource="http://example.org/ontology#Car"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#BMW_7Series">
<rdf:type rdf:resource="http://example.org/ontology#LuxuryCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#BMW_M4">
<rdf:type rdf:resource="http://example.org/ontology#SportsCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Audi_eTron">
<rdf:type rdf:resource="http://example.org/ontology#ElectricCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Kindle">
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#BMW_iX">
<rdf:type rdf:resource="http://example.org/ontology#ElectricCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#SoftwareCompany">
<rdfs:subClassOf rdf:resource="http://example.org/ontology#TechnologyCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Audi_R8">
<rdf:type rdf:resource="http://example.org/ontology#SportsCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Xbox">
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Technology">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Mercedes_EQS">
<rdf:type rdf:resource="http://example.org/ontology#ElectricCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Porsche_911">
<rdf:type rdf:resource="http://example.org/ontology#SportsCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#HardwareCompany">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:subClassOf rdf:resource="http://example.org/ontology#TechnologyCompany"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#MercedesBenz">
<ns1:produces rdf:resource="http://example.org/ontology#Mercedes_SClass"/>
<ns1:produces rdf:resource="http://example.org/ontology#Mercedes_EQS"/>
<ns1:produces rdf:resource="http://example.org/ontology#Mercedes_AMG_GT"/>
<rdfs:comment>Synonymous with luxury and quality.</rdfs:comment>
<rdf:type rdf:resource="http://example.org/ontology#CarManufacturer"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Amazon">
<rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<ns1:develops rdf:resource="http://example.org/ontology#Kindle"/>
<ns1:develops rdf:resource="http://example.org/ontology#AmazonEcho"/>
<ns1:develops rdf:resource="http://example.org/ontology#AWS"/>
<ns1:develops rdf:resource="http://example.org/ontology#AmazonPrime"/>
<rdfs:comment>From e-commerce to cloud computing giant with AWS.</rdfs:comment>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Instagram">
<rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#AWS">
<rdf:type rdf:resource="http://example.org/ontology#CloudServiceProvider"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#SUV">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#VW_ID4">
<rdf:type rdf:resource="http://example.org/ontology#ElectricCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#CloudServiceProvider">
<rdfs:subClassOf rdf:resource="http://example.org/ontology#TechnologyCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Surface">
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#iPad">
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#iPhone">
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Mercedes_AMG_GT">
<rdf:type rdf:resource="http://example.org/ontology#SportsCar"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#MacBook">
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#develops">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
<rdfs:range rdf:resource="http://example.org/ontology#Technology"/>
<rdfs:domain rdf:resource="http://example.org/ontology#TechnologyCompany"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#LuxuryCar">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#AppleWatch">
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Porsche_Cayenne">
<rdf:type rdf:resource="http://example.org/ontology#SUV"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#GooglePixel">
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#Company">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/ontology#CarManufacturer">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:subClassOf rdf:resource="http://example.org/ontology#Company"/>
</rdf:Description>
</rdf:RDF>
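The ontology above is plain RDF/XML, so its class hierarchy can be inspected with the standard library alone. A minimal sketch, assuming nothing beyond `xml.etree.ElementTree`; the embedded `SNIPPET` and the `subclass_map` helper are illustrative and not part of this PR:

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RDFS = "{http://www.w3.org/2000/01/rdf-schema#}"

# A small excerpt in the same shape as the ontology above
SNIPPET = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://example.org/ontology#Sedan">
    <rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#LuxuryCar">
    <rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
  </rdf:Description>
</rdf:RDF>"""


def local_name(iri: str) -> str:
    # Keep only the fragment after '#', e.g. '...#Sedan' -> 'Sedan'
    return iri.rsplit("#", 1)[-1]


def subclass_map(xml_text: str) -> dict:
    # Map each class to its rdfs:subClassOf parent; ElementTree qualifies
    # namespaced attributes like rdf:about as '{namespace}about'
    root = ET.fromstring(xml_text)
    result = {}
    for desc in root.findall(RDF + "Description"):
        subject = local_name(desc.get(RDF + "about", ""))
        for parent in desc.findall(RDFS + "subClassOf"):
            result[subject] = local_name(parent.get(RDF + "resource", ""))
    return result


print(subclass_map(SNIPPET))  # {'Sedan': 'Car', 'LuxuryCar': 'Car'}
```

The same walk applied to the full file would recover the `Car` and `TechnologyCompany` hierarchies that the cognify step resolves against.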

@@ -0,0 +1,93 @@
import asyncio
import os
import cognee
from cognee.api.v1.search import SearchType
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.shared.logging_utils import setup_logging
from cognee.modules.ontology.rdf_xml.RDFLibOntologyResolver import RDFLibOntologyResolver
from cognee.modules.ontology.ontology_config import Config
text_1 = """
1. Audi
Audi is known for its modern designs and advanced technology. Founded in the early 1900s, the brand has earned a reputation for precision engineering and innovation. With features like the Quattro all-wheel-drive system, Audi offers a range of vehicles from stylish sedans to high-performance sports cars.
2. BMW
BMW, short for Bayerische Motoren Werke, is celebrated for its focus on performance and driving pleasure. The company's vehicles are designed to provide a dynamic and engaging driving experience, and their slogan, "The Ultimate Driving Machine," reflects that commitment. BMW produces a variety of cars that combine luxury with sporty performance.
3. Mercedes-Benz
Mercedes-Benz is synonymous with luxury and quality. With a history dating back to the early 20th century, the brand is known for its elegant designs, innovative safety features, and high-quality engineering. Mercedes-Benz manufactures not only luxury sedans but also SUVs, sports cars, and commercial vehicles, catering to a wide range of needs.
4. Porsche
Porsche is a name that stands for high-performance sports cars. Founded in 1931, the brand has become famous for models like the iconic Porsche 911. Porsche cars are celebrated for their speed, precision, and distinctive design, appealing to car enthusiasts who value both performance and style.
5. Volkswagen
Volkswagen, which means "people's car" in German, was established with the idea of making affordable and reliable vehicles accessible to everyone. Over the years, Volkswagen has produced several iconic models, such as the Beetle and the Golf. Today, it remains one of the largest car manufacturers in the world, offering a wide range of vehicles that balance practicality with quality.
Each of these car manufacturers contributes to Germany's reputation as a leader in the global automotive industry, showcasing a blend of innovation, performance, and design excellence.
"""
text_2 = """
1. Apple
Apple is renowned for its innovative consumer electronics and software. Its product lineup includes the iPhone, iPad, Mac computers, and wearables like the Apple Watch. Known for its emphasis on sleek design and user-friendly interfaces, Apple has built a loyal customer base and created a seamless ecosystem that integrates hardware, software, and services.
2. Google
Founded in 1998, Google started as a search engine and quickly became the go-to resource for finding information online. Over the years, the company has diversified its offerings to include digital advertising, cloud computing, mobile operating systems (Android), and various web services like Gmail and Google Maps. Google's innovations have played a major role in shaping the internet landscape.
3. Microsoft
Microsoft Corporation has been a dominant force in software for decades. Its Windows operating system and Microsoft Office suite are staples in both business and personal computing. In recent years, Microsoft has expanded into cloud computing with Azure, gaming with the Xbox platform, and even hardware through products like the Surface line. This evolution has helped the company maintain its relevance in a rapidly changing tech world.
4. Amazon
What began as an online bookstore has grown into one of the largest e-commerce platforms globally. Amazon is known for its vast online marketplace, but its influence extends far beyond retail. With Amazon Web Services (AWS), the company has become a leader in cloud computing, offering robust solutions that power websites, applications, and businesses around the world. Amazon's constant drive for innovation continues to reshape both retail and technology sectors.
5. Meta
Meta, originally known as Facebook, revolutionized social media by connecting billions of people worldwide. Beyond its core social networking service, Meta is investing in the next generation of digital experiences through virtual and augmented reality technologies, with projects like Oculus. The company's efforts signal a commitment to evolving digital interaction and building the metaverse—a shared virtual space where users can connect and collaborate.
Each of these companies has significantly impacted the technology landscape, driving innovation and transforming everyday life through their groundbreaking products and services.
"""
async def main():
    # Step 1: Reset data and system state
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Step 2: Add text
    text_list = [text_1, text_2]
    await cognee.add(text_list)

    # Step 3: Create knowledge graph
    ontology_path = os.path.join(
        os.path.dirname(os.path.abspath(__file__)), "data/basic_ontology.owl"
    )

    # Create full config structure manually
    config: Config = {
        "ontology_config": {
            "ontology_resolver": RDFLibOntologyResolver(ontology_file=ontology_path)
        }
    }

    await cognee.cognify(config=config)
    print("Knowledge with ontology created.")

    # Step 4: Query insights
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text="What are the exact cars and their types produced by Audi?",
    )
    print(search_results)

    await visualize_graph()


if __name__ == "__main__":
    logger = setup_logging()

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())

@@ -0,0 +1,5 @@
"""
Retrievers and Search Examples
Reference: https://docs.cognee.ai/guides/search-basics
"""

@@ -0,0 +1,70 @@
import asyncio
import cognee
from cognee.shared.logging_utils import setup_logging, ERROR
from cognee.api.v1.search import SearchType
# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
# LLM_API_KEY = "your_key_here"
async def main():
    # Create a clean slate for cognee -- reset data and system state
    print("Resetting cognee data...")
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)
    print("Data reset complete.\n")

    # cognee knowledge graph will be created based on this text
    text = """
    Natural language processing (NLP) is an interdisciplinary
    subfield of computer science and information retrieval.
    """

    print("Adding text to cognee:")
    print(text.strip())

    # Add the text, and make it available for cognify
    await cognee.add(text)
    print("Text added successfully.\n")

    print("Running cognify to create knowledge graph...\n")
    print("Cognify process steps:")
    print("1. Classifying the document: Determining the type and category of the input text.")
    print(
        "2. Checking permissions: Ensuring the user has the necessary rights to process the text."
    )
    print(
        "3. Extracting text chunks: Breaking down the text into sentences or phrases for analysis."
    )
    print("4. Adding data points: Storing the extracted chunks for processing.")
    print(
        "5. Generating knowledge graph: Extracting entities and relationships to form a knowledge graph."
    )
    print("6. Summarizing text: Creating concise summaries of the content for quick insights.\n")

    # Use LLMs and cognee to create knowledge graph
    await cognee.cognify()
    print("Cognify process complete.\n")

    query_text = "Tell me about NLP"
    print(f"Searching cognee for insights with query: '{query_text}'")

    # Query cognee for insights on the added text
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text=query_text
    )

    print("Search results:")
    # Display results
    for result_text in search_results:
        print(result_text)


if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
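The six numbered steps printed by the example above describe the cognify pipeline in prose. As a rough illustration of steps 3 and 5 only (chunking and entity/relationship extraction), here is a self-contained toy sketch; the regex heuristics and helper names are invented for illustration and are not cognee's actual implementation:

```python
import re


def chunk(text):
    # Step 3 (toy): split the input into sentence-level chunks
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s.strip()]


def extract_entities(chunk_text):
    # Step 5 (toy): treat capitalized words and all-caps acronyms as entities
    return re.findall(r"\b[A-Z][A-Za-z]+\b", chunk_text)


def build_graph(text):
    # Steps 4-5 (toy): store chunks and link entities that co-occur in one chunk
    graph = {"chunks": [], "edges": []}
    for c in chunk(text):
        entities = extract_entities(c)
        graph["chunks"].append(c)
        graph["edges"] += [
            (a, "mentioned_with", b)
            for i, a in enumerate(entities)
            for b in entities[i + 1:]
        ]
    return graph


text = (
    "Natural language processing (NLP) is an interdisciplinary "
    "subfield of computer science and information retrieval."
)
g = build_graph(text)
print(g["edges"])  # [('Natural', 'mentioned_with', 'NLP')]
```

The real pipeline replaces the regexes with LLM-based classification and extraction, but the chunk-then-link shape is the same.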

File diff suppressed because it is too large.

@@ -0,0 +1,37 @@
import asyncio
import cognee
import os
# By default cognee uses OpenAI's gpt-5-mini LLM model
# Provide your OpenAI LLM API KEY
os.environ["LLM_API_KEY"] = ""
async def cognee_demo():
    # Get file path to document to process
    from pathlib import Path

    current_directory = Path(__file__).resolve().parent
    file_path = os.path.join(current_directory, "data", "alice_in_wonderland.txt")

    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Call Cognee to process document
    await cognee.add(file_path)
    await cognee.cognify()

    # Query Cognee for information from provided document
    answer = await cognee.search("List me all the important characters in Alice in Wonderland.")
    print(answer)

    answer = await cognee.search("How did Alice end up in Wonderland?")
    print(answer)

    answer = await cognee.search("Tell me about Alice's personality.")
    print(answer)


# Cognee is an async library, so it has to be called in an async context
asyncio.run(cognee_demo())

@@ -0,0 +1,60 @@
#!/usr/bin/env python3
"""
Example showing how to use cognee.start_ui() to launch the frontend.
This demonstrates the new UI functionality that works similar to DuckDB's start_ui().
"""
import asyncio
import cognee
import time
async def main():
    # First, let's add some data to cognee for the UI to display
    print("Adding sample data to cognee...")
    await cognee.add(
        "Natural language processing (NLP) is an interdisciplinary subfield of computer science and information retrieval."
    )
    await cognee.add(
        "Machine learning (ML) is a subset of artificial intelligence that focuses on algorithms and statistical models."
    )

    # Generate the knowledge graph
    print("Generating knowledge graph...")
    await cognee.cognify()

    print("\n" + "=" * 60)
    print("Starting cognee UI...")
    print("=" * 60)

    # Start the UI server
    def dummy_callback(pid):
        pass

    server = cognee.start_ui(
        pid_callback=dummy_callback,
        port=3000,
        open_browser=True,  # This will automatically open your browser
    )

    if server:
        print("UI server started successfully!")
        print("The interface will be available at: http://localhost:3000")
        print("\nPress Ctrl+C to stop the server when you're done...")
        try:
            # Keep the server running
            while server.poll() is None:  # While process is still running
                time.sleep(1)
        except KeyboardInterrupt:
            print("\nStopping UI server...")
            server.terminate()
            server.wait()  # Wait for process to finish
            print("UI server stopped.")
    else:
        print("Failed to start UI server. Check the logs above for details.")


if __name__ == "__main__":
    asyncio.run(main())

@@ -0,0 +1,101 @@
import asyncio
import cognee
from cognee.shared.logging_utils import setup_logging, INFO
from cognee.api.v1.search import SearchType
biography_1 = """
Attaphol Buspakom Attaphol Buspakom ( ; ) , nicknamed Tak ( ; ) ; 1 October 1962 16 April 2015 ) was a Thai national and football coach . He was given the role at Muangthong United and Buriram United after TTM Samut Sakhon folded after the 2009 season . He played for the Thailand national football team , appearing in several FIFA World Cup qualifying matches .
Club career .
Attaphol began his career as a player at Thai Port FC Authority of Thailand in 1985 . In his first year , he won his first championship with the club . He played for the club until 1989 and in 1987 also won the Queens Cup . He then moved to Malaysia for two seasons for Pahang FA , then return to Thailand to his former club . His time from 1991 to 1994 was marked by less success than in his first stay at Port Authority . From 1994 to 1996 he played for Pahang again and this time he was able to win with the club , the Malaysia Super League and also reached the final of the Malaysia Cup and the Malaysia FA Cup . Both cup finals but lost . Back in Thailand , he let end his playing career at FC Stock Exchange of Thailand , with which he once again runner-up in 1996-97 . In 1998 , he finished his career .
International career .
For the Thailand national football team Attaphol played between 1985 and 1998 a total of 85 games and scored 13 results . In 1992 , he participated with the team in the finals of the Asian Cup . He also stood in various cadres to qualifications to FIFA World Cup .
Coaching career .
Bec Tero Sasana .
In BEC Tero Sasana F.C . began his coaching career in 2001 for him , first as assistant coach . He took over the reigning champions of the Thai League T1 , after his predecessor Pichai Pituwong resigned from his post . It was his first coach station and he had the difficult task of leading the club through the new AFC Champions League . He could accomplish this task with flying colors and even led the club to the finals . The finale , then still played in home and away matches , was lost with 1:2 at the end against Al Ain FC . Attaphol is and was next to Charnwit Polcheewin the only coach who managed a club from Thailand to lead to the final of the AFC Champions League . 2002-03 and 2003-04 he won with the club also two runner-up . In his team , which reached the final of the Champions League , were a number of exceptional players like Therdsak Chaiman , Worrawoot Srimaka , Dusit Chalermsan and Anurak Srikerd .
Geylang United / Krung Thai Bank .
In 2006 , he went to Singapore in the S-League to Geylang United He was released after a few months due to lack of success . In 2008 , he took over as coach at Krung Thai Bank F.C. , where he had almost a similar task , as a few years earlier by BEC-Tero . As vice-champion of the club was also qualified for the AFC Champions League . However , he failed to lead the team through the group stage of the season 2008 and beyond . With the Kashima Antlers of Japan and Beijing Guoan F.C . athletic competition was too great . One of the highlights was put under his leadership , yet the club . In the group match against the Vietnam club Nam Dinh F.C . his team won with 9-1 , but also lost four weeks later with 1-8 against Kashima Antlers . At the end of the National Football League season , he reached the Krung Thai 6th Table space . The Erstligalizenz the club was sold at the end of the season at the Bangkok Glass F.C. . Attaphol finished his coaching career with the club and accepted an offer of TTM Samutsakorn . After only a short time in office
Muangthong United .
In 2009 , he received an offer from Muangthong United F.C. , which he accepted and changed . He can champion Muang Thong United for 2009 Thai Premier League and Attaphol won Coach of The year for Thai Premier League and he was able to lead Muang Thong United to play AFC Champions League qualifying play-off for the first in the clubs history .
Buriram United .
In 2010 Buspakom moved from Muangthong United to Buriram United F.C. . He received Coach of the Month in Thai Premier League 2 time in June and October . In 2011 , he led Buriram United win 2011 Thai Premier League second time for club and set a record with the most points in the Thai League T1 for 85 point and He led Buriram win 2011 Thai FA Cup by beat Muangthong United F.C . 1-0 and he led Buriram win 2011 Thai League Cup by beat Thai Port F.C . 2-0 . In 2012 , he led Buriram United to the 2012 AFC Champions League group stage . Buriram along with Guangzhou Evergrande F.C . from China , Kashiwa Reysol from Japan and Jeonbuk Hyundai Motors which are all champions from their country . In the first match of Buriram they beat Kashiwa 3-2 and Second Match they beat Guangzhou 1-2 at the Tianhe Stadium . Before losing to Jeonbuk 0-2 and 3-2 with lose Kashiwa and Guangzhou 1-0 and 1-2 respectively and Thai Premier League Attaphol lead Buriram end 4th for table with win 2012 Thai FA Cup and 2012 Thai League Cup .
Bangkok Glass .
In 2013 , he moved from Buriram United to Bangkok Glass F.C. .
Individual
- Thai Premier League Coach of the Year ( 3 ) : 2001-02 , 2009 , 2013
"""
biography_2 = """
Arnulf Øverland Ole Peter Arnulf Øverland ( 27 April 1889 25 March 1968 ) was a Norwegian poet and artist . He is principally known for his poetry which served to inspire the Norwegian resistance movement during the German occupation of Norway during World War II .
Biography .
Øverland was born in Kristiansund and raised in Bergen . His parents were Peter Anton Øverland ( 18521906 ) and Hanna Hage ( 18541939 ) . The early death of his father , left the family economically stressed . He was able to attend Bergen Cathedral School and in 1904 Kristiania Cathedral School . He graduated in 1907 and for a time studied philology at University of Kristiania . Øverland published his first collection of poems ( 1911 ) .
Øverland became a communist sympathizer from the early 1920s and became a member of Mot Dag . He also served as chairman of the Norwegian Students Society 192328 . He changed his stand in 1937 , partly as an expression of dissent against the ongoing Moscow Trials . He was an avid opponent of Nazism and in 1936 he wrote the poem Du ikke sove which was printed in the journal Samtiden . It ends with . ( I thought: : Something is imminent . Our era is over Europes on fire! ) . Probably the most famous line of the poem is ( You mustnt endure so well the injustice that doesnt affect you yourself! )
During the German occupation of Norway from 1940 in World War II , he wrote to inspire the Norwegian resistance movement . He wrote a series of poems which were clandestinely distributed , leading to the arrest of both him and his future wife Margrete Aamot Øverland in 1941 . Arnulf Øverland was held first in the prison camp of Grini before being transferred to Sachsenhausen concentration camp in Germany . He spent a four-year imprisonment until the liberation of Norway in 1945 . His poems were later collected in Vi overlever alt and published in 1945 .
Øverland played an important role in the Norwegian language struggle in the post-war era . He became a noted supporter for the conservative written form of Norwegian called Riksmål , he was president of Riksmålsforbundet ( an organization in support of Riksmål ) from 1947 to 1956 . In addition , Øverland adhered to the traditionalist style of writing , criticising modernist poetry on several occasions . His speech Tungetale fra parnasset , published in Arbeiderbladet in 1954 , initiated the so-called Glossolalia debate .
Personal life .
In 1918 he had married the singer Hildur Arntzen ( 18881957 ) . Their marriage was dissolved in 1939 . In 1940 , he married Bartholine Eufemia Leganger ( 19031995 ) . They separated shortly after , and were officially divorced in 1945 . Øverland was married to journalist Margrete Aamot Øverland ( 19131978 ) during June 1945 . In 1946 , the Norwegian Parliament arranged for Arnulf and Margrete Aamot Øverland to reside at the Grotten . He lived there until his death in 1968 and she lived there for another ten years until her death in 1978 . Arnulf Øverland was buried at Vår Frelsers Gravlund in Oslo . Joseph Grimeland designed the bust of Arnulf Øverland ( bronze , 1970 ) at his grave site .
Selected Works .
- Den ensomme fest ( 1911 )
- Berget det blå ( 1927 )
- En Hustavle ( 1929 )
- Den røde front ( 1937 )
- Vi overlever alt ( 1945 )
- Sverdet bak døren ( 1956 )
- Livets minutter ( 1965 )
Awards .
- Gyldendals Endowment ( 1935 )
- Dobloug Prize ( 1951 )
- Mads Wiel Nygaards legat ( 1961 )
"""
async def main():
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    await cognee.add([biography_1, biography_2])
    await cognee.cognify(temporal_cognify=True)

    queries = [
        "What happened before 1980?",
        "What happened after 2010?",
        "What happened between 2000 and 2006?",
        "What happened between 1903 and 1995, I am interested in the Selected Works of Arnulf Øverland Ole Peter Arnulf Øverland?",
        "Who is Attaphol Buspakom Attaphol Buspakom?",
        "Who was Arnulf Øverland?",
    ]

    for query_text in queries:
        search_results = await cognee.search(
            query_type=SearchType.TEMPORAL,
            query_text=query_text,
            top_k=15,
        )
        print(f"Query: {query_text}")
        print(f"Results: {search_results}\n")


if __name__ == "__main__":
    logger = setup_logging(log_level=INFO)

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
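The `SearchType.TEMPORAL` queries above ask about explicit time windows ("before 1980", "between 2000 and 2006"). To illustrate how such a window might be recognized in a query string, here is a standalone toy sketch; the `parse_time_window` helper is hypothetical and not cognee's implementation:

```python
import re


def parse_time_window(query):
    # Toy heuristic: pull explicit 4-digit years, then interpret the keyword.
    # "before Y" -> (None, Y); "after Y" -> (Y, None); "between A and B" -> (A, B)
    years = [int(y) for y in re.findall(r"\b(1[89]\d{2}|20\d{2})\b", query)]
    q = query.lower()
    if "between" in q and len(years) >= 2:
        return (years[0], years[1])
    if "before" in q and years:
        return (None, years[0])
    if "after" in q and years:
        return (years[0], None)
    return (None, None)  # no recognizable window


print(parse_time_window("What happened between 2000 and 2006?"))  # (2000, 2006)
```

A real temporal retriever also has to handle events whose dates come from the documents rather than the query, which is what `temporal_cognify=True` extracts during ingestion.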

@@ -0,0 +1,37 @@
import asyncio
import cognee
async def main():
    await cognee.prune.prune_data()
    print("Data pruned.")
    await cognee.prune.prune_system(metadata=True)

    extraction_rules = {
        "title": {"selector": "title"},
        "headings": {"selector": "h1, h2, h3", "all": True},
        "links": {
            "selector": "a",
            "attr": "href",
            "all": True,
        },
        "paragraphs": {"selector": "p", "all": True},
    }

    await cognee.add(
        "https://en.wikipedia.org/wiki/Large_language_model",
        incremental_loading=False,
        preferred_loaders={"beautiful_soup_loader": {"extraction_rules": extraction_rules}},
    )

    await cognee.cognify()
    print("Knowledge graph created.")

    await cognee.visualize_graph()
    print("Data visualized")


if __name__ == "__main__":
    asyncio.run(main())
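The `extraction_rules` mapping in this example (a CSS `selector`, an optional `attr` to pull, and an optional `all` flag for first-match vs. list) is interpreted by cognee's `beautiful_soup_loader`. To make the schema concrete without third-party dependencies, here is a standalone sketch using only `html.parser`; the `RuleExtractor` class is invented for illustration and supports bare tag-name selectors only, not full CSS:

```python
from html.parser import HTMLParser


class RuleExtractor(HTMLParser):
    """Applies extraction rules of the form {'selector': ..., 'attr': ..., 'all': ...}."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.hits = {name: [] for name in rules}
        self._open = []  # (rule_name, tag) pairs currently capturing text

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        for name, rule in self.rules.items():
            tags = [s.strip() for s in rule["selector"].split(",")]
            if tag in tags:
                if "attr" in rule:
                    # Attribute rules capture the attribute value directly
                    if rule["attr"] in attr_map:
                        self.hits[name].append(attr_map[rule["attr"]])
                else:
                    # Text rules capture data until the tag closes
                    self._open.append((name, tag))

    def handle_endtag(self, tag):
        self._open = [(n, t) for (n, t) in self._open if t != tag]

    def handle_data(self, data):
        if data.strip():
            for name, _tag in self._open:
                self.hits[name].append(data.strip())


def extract(html_doc, rules):
    parser = RuleExtractor(rules)
    parser.feed(html_doc)
    # 'all': True keeps the full list; otherwise return the first match or None
    return {
        name: parser.hits[name]
        if rule.get("all")
        else (parser.hits[name][0] if parser.hits[name] else None)
        for name, rule in rules.items()
    }


doc = (
    "<html><head><title>LLM</title></head>"
    "<body><h1>Intro</h1><a href='/x'>x</a><p>One</p><p>Two</p></body></html>"
)
rules = {
    "title": {"selector": "title"},
    "headings": {"selector": "h1, h2, h3", "all": True},
    "links": {"selector": "a", "attr": "href", "all": True},
    "paragraphs": {"selector": "p", "all": True},
}
print(extract(doc, rules))
```

The real loader delegates selector matching to BeautifulSoup, which handles classes, ids, and nesting; the rule schema, however, is the same shape as shown here.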

@@ -0,0 +1,137 @@
import asyncio
from os import path
from typing import Any
from pydantic import SkipValidation
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.infrastructure.engine import DataPoint
from cognee.infrastructure.engine.models.Edge import Edge
from cognee.tasks.storage import add_data_points
import cognee
class Clothes(DataPoint):
    name: str
    description: str


class Object(DataPoint):
    name: str
    description: str
    has_clothes: list[Clothes]


class Person(DataPoint):
    name: str
    description: str
    has_items: SkipValidation[Any]  # (Edge, list[Clothes])
    has_objects: SkipValidation[Any]  # (Edge, list[Object])
    knows: SkipValidation[Any]  # (Edge, list["Person"])


async def main():
    # Clear the database for a clean state
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create clothes items
    item1 = Clothes(name="Shirt", description="A blue shirt")
    item2 = Clothes(name="Pants", description="Black pants")
    item3 = Clothes(name="Jacket", description="Leather jacket")

    # Create object with simple relationship to clothes
    object1 = Object(
        name="Closet", description="A wooden closet", has_clothes=[item1, item2, item3]
    )

    # Create people with various weighted relationships
    person1 = Person(
        name="John",
        description="A software engineer",
        # Single weight (backward compatible)
        has_items=(Edge(weight=0.8, relationship_type="owns"), [item1, item2]),
        # Simple relationship without weights
        has_objects=(Edge(relationship_type="stores_in"), [object1]),
        knows=[],
    )

    person2 = Person(
        name="Alice",
        description="A designer",
        # Multiple weights on edge
        has_items=(
            Edge(
                weights={
                    "ownership": 0.9,
                    "frequency_of_use": 0.7,
                    "emotional_attachment": 0.8,
                    "monetary_value": 0.6,
                },
                relationship_type="owns",
            ),
            [item3],
        ),
        has_objects=(Edge(relationship_type="uses"), [object1]),
        knows=[],
    )

    person3 = Person(
        name="Bob",
        description="A friend",
        # Mixed: single weight + multiple weights
        has_items=(
            Edge(
                weight=0.5,  # Default weight
                weights={"trust_level": 0.9, "communication_frequency": 0.6},
                relationship_type="borrows",
            ),
            [item1],
        ),
        has_objects=[],
        knows=[],
    )

    # Create relationships between people with multiple weights
    person1.knows = (
        Edge(
            weights={
                "friendship_strength": 0.9,
                "trust_level": 0.8,
                "years_known": 0.7,
                "shared_interests": 0.6,
            },
            relationship_type="friend",
        ),
        [person2, person3],
    )

    person2.knows = (
        Edge(
            weights={"professional_collaboration": 0.8, "personal_friendship": 0.6},
            relationship_type="colleague",
        ),
        [person1],
    )

    all_data_points = [item1, item2, item3, object1, person1, person2, person3]

    # Add data points to the graph
    await add_data_points(all_data_points)

    # Visualize the graph
    graph_visualization_path = path.join(
        path.dirname(__file__), "weighted_graph_visualization.html"
    )
    await visualize_graph(graph_visualization_path)

    print("Graph with multiple weighted edges has been created and visualized!")
    print(f"Visualization saved to: {graph_visualization_path}")
    print("\nFeatures demonstrated:")
    print("- Single weight edges (backward compatible)")
    print("- Multiple weights on single edges")
    print("- Mixed single + multiple weights")
    print("- Hover over edges to see all weight information")
    print("- Different visual styling for single vs. multiple weighted edges")


if __name__ == "__main__":
    asyncio.run(main())