refactor: restructure examples and starter kit into new-examples (#1862)
<!-- .github/pull_request_template.md -->
## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->
## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [x] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):
## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->
## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Documentation**
* Deprecated legacy examples and added a migration guide mapping old
paths to new locations
* Added a comprehensive new-examples README detailing configurations,
pipelines, demos, and migration notes
* **New Features**
* Added many runnable examples and demos: database configs,
embedding/LLM setups, permissions and access-control, custom pipelines
(organizational, product recommendation, code analysis, procurement),
multimedia, visualization, temporal/ontology demos, and a local UI
starter
* **Chores**
* Updated CI/test entrypoints to use the new-examples layout
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
This commit is contained in:
parent 9b2b1a9c13
commit 5f8a3e24bd
56 changed files with 5961 additions and 0 deletions
@@ -1,3 +1,15 @@
# ⚠️ DEPRECATED - Go to `new-examples/` Instead

This starter kit is deprecated. Its examples have been integrated into the `/new-examples/` folder.

| Old Location | New Location |
|--------------|--------------|
| `src/pipelines/default.py` | none |
| `src/pipelines/low_level.py` | `new-examples/custom_pipelines/organizational_hierarchy/` |
| `src/pipelines/custom-model.py` | `new-examples/demos/custom_graph_model_entity_schema_definition.py` |
| `src/data/` | Included in `new-examples/custom_pipelines/organizational_hierarchy/data/` |

----------

# Cognee Starter Kit

Welcome to the <a href="https://github.com/topoteretes/cognee">cognee</a> Starter Repo! This repository is designed to help you get started quickly by providing a structured dataset and pre-built data pipelines using cognee to build powerful knowledge graphs.
48 examples/README.md Normal file
@@ -0,0 +1,48 @@
# ⚠️ DEPRECATED - Go to `new-examples/` Instead

This folder is deprecated. All examples have been reorganized into `/new-examples/`.

## Migration Guide

| Old Location | New Location |
|--------------|--------------|
| `python/simple_example.py` | `new-examples/demos/simple_default_cognee_pipelines_example.py` |
| `python/cognee_simple_document_demo.py` | `new-examples/demos/simple_document_qa/` |
| `python/multimedia_example.py` | `new-examples/demos/multimedia_processing/` |
| `python/ontology_demo_example.py` | `new-examples/demos/ontology_reference_vocabulary/` |
| `python/ontology_demo_example_2.py` | `new-examples/demos/ontology_medical_comparison/` |
| `python/temporal_example.py` | `new-examples/demos/temporal_awareness_example.py` |
| `python/conversation_session_persistence_example.py` | `new-examples/demos/conversation_session_persistence_example.py` |
| `python/feedback_enrichment_minimal_example.py` | `new-examples/demos/feedback_enrichment_minimal_example.py` |
| `python/simple_node_set_example.py` | `new-examples/demos/nodeset_memory_grouping_with_tags_example.py` |
| `python/weighted_edges_example.py` | `new-examples/demos/weighted_edges_relationships_example.py` |
| `python/dynamic_multiple_edges_example.py` | `new-examples/demos/dynamic_multiple_weighted_edges_example.py` |
| `python/web_url_fetcher_example.py` | `new-examples/demos/web_url_content_ingestion_example.py` |
| `python/permissions_example.py` | `new-examples/configurations/permissions_example/` |
| `python/run_custom_pipeline_example.py` | `new-examples/custom_pipelines/custom_cognify_pipeline_example.py` |
| `python/dynamic_steps_example.py` | `new-examples/custom_pipelines/dynamic_steps_resume_analysis_hr_example.py` |
| `python/memify_coding_agent_example.py` | `new-examples/custom_pipelines/memify_coding_agent_rule_extraction_example.py` |
| `python/agentic_reasoning_procurement_example.py` | `new-examples/custom_pipelines/agentic_reasoning_procurement_example.py` |
| `python/code_graph_example.py` | `new-examples/custom_pipelines/code_graph_repository_analysis_example.py` |
| `python/relational_database_migration_example.py` | `new-examples/custom_pipelines/relational_database_to_knowledge_graph_migration_example.py` |
| `database_examples/chromadb_example.py` | `new-examples/configurations/database_examples/chromadb_vector_database_configuration.py` |
| `database_examples/kuzu_example.py` | `new-examples/configurations/database_examples/kuzu_graph_database_configuration.py` |
| `database_examples/neo4j_example.py` | `new-examples/configurations/database_examples/neo4j_graph_database_configuration.py` |
| `database_examples/neptune_analytics_example.py` | `new-examples/configurations/database_examples/neptune_analytics_aws_database_configuration.py` |
| `database_examples/pgvector_example.py` | `new-examples/configurations/database_examples/pgvector_postgres_vector_database_configuration.py` |
| `low_level/pipeline.py` | `new-examples/custom_pipelines/organizational_hierarchy/` |
| `low_level/product_recommendation.py` | `new-examples/custom_pipelines/product_recommendation/` |
| `start_ui_example.py` | `new-examples/demos/start_local_ui_frontend_example.py` |
| `relational_db_with_dlt/relational_db_and_dlt.py` | `new-examples/custom_pipelines/relational_database_to_knowledge_graph_migration_example.py` |

## Files NOT Migrated

| File | Reason |
|------|--------|
| `python/graphiti_example.py` | External Graphiti integration; not core Cognee |
| `python/weighted_graph_visualization.html` | Generated artifact, not source code |
| `python/artifacts/` | Output directory, not example code |
| `relational_db_with_dlt/fix_foreign_keys.sql` | SQL helper script, not a standalone example |
| `python/ontology_input_example/` | Data files moved to ontology demo folders |
| `low_level/*.json` | Data files moved to respective pipeline folders |
66 new-examples/README.md Normal file
@@ -0,0 +1,66 @@
# Cognee Examples

## 📁 Structure

| Folder | Purpose |
|--------|---------|
| `configurations/` | Database, LLM, embedding, and permission setups |
| `custom_pipelines/` | Building custom memory pipelines |
| `demos/` | Feature demos and getting started examples |

## 🔧 Configurations

| Path | Description |
|------|-------------|
| `database_examples/chromadb_vector_database_configuration.py` | ChromaDB vector database |
| `database_examples/kuzu_graph_database_configuration.py` | KuzuDB graph database |
| `database_examples/neo4j_graph_database_configuration.py` | Neo4j graph database |
| `database_examples/neptune_analytics_aws_database_configuration.py` | AWS Neptune Analytics |
| `database_examples/pgvector_postgres_vector_database_configuration.py` | PostgreSQL with PGVector |
| `database_examples/s3_storage_configuration.py` | Amazon S3 storage |
| `llm_configurations/openai_setup.py` | OpenAI LLM setup |
| `llm_configurations/azure_openai_setup.py` | Azure OpenAI LLM setup |
| `embedding_configurations/openai_setup.py` | OpenAI embeddings |
| `embedding_configurations/azure_openai_setup.py` | Azure OpenAI embeddings |
| `structured_output_configurations.py/baml_setup.py` | BAML structured output |
| `structured_output_configurations.py/litellm_intructor_setup.py` | LiteLLM Instructor setup |
| `permissions_example/` | Multi-user access control (with sample PDF) |
| `distributed_execution_with_modal_example.py` | Scale with Modal.com |

## 🔄 Custom Pipelines

| Path | Description |
|------|-------------|
| `custom_cognify_pipeline_example.py` | Customize cognify pipelines |
| `memify_coding_agent_rule_extraction_example.py` | Extract rules from conversations |
| `relational_database_to_knowledge_graph_migration_example.py` | SQL to knowledge graph |
| `agentic_reasoning_procurement_example.py` | AI procurement assistant |
| `code_graph_repository_analysis_example.py` | Code repository analysis |
| `dynamic_steps_resume_analysis_hr_example.py` | CV/resume filtering |
| `organizational_hierarchy/` | Org structure graphs (with JSON data) |
| `organizational_hierarchy/organizational_hierarchy_pipeline_low_level_example.py` | Low-level pipeline variant |
| `product_recommendation/` | Recommendation system (with customer data) |

## 🎯 Demos

| Path | Description |
|------|-------------|
| `simple_default_cognee_pipelines_example.py` | Default pipeline usage ★ |
| `simple_document_qa/` | Document Q&A (with alice_in_wonderland.txt) |
| `core_features_getting_started_example.py` | Intro to Cognee |
| `multimedia_processing/` | Audio/image processing (with media files) |
| `ontology_reference_vocabulary/` | Ontology as vocabulary (with OWL file) |
| `ontology_medical_comparison/` | Medical ontology comparison (with papers + OWL) |
| `web_url_content_ingestion_example.py` | Extract from web pages and ingest directly to memory |
| `temporal_awareness_example.py` | Time-based queries |
| `retrievers_and_search_examples.py` | Retriever patterns and search types guide |
| `feedback_enrichment_minimal_example.py` | User feedback enrichment |
| `nodeset_memory_grouping_with_tags_example.py` | Memory grouping with tags |
| `weighted_edges_relationships_example.py` | Weighted edge relationships |
| `dynamic_multiple_weighted_edges_example.py` | Multiple weighted edges |
| `custom_graph_model_entity_schema_definition.py` | Custom entity schemas ★ |
| `graph_visualization_example.py` | Visualize knowledge graphs |
| `conversation_session_persistence_example.py` | Session persistence |
| `custom_prompt_guide.py` | Custom prompts for extraction |
| `direct_llm_call_for_structured_output_example.py` | Direct LLM structured output |
| `start_local_ui_frontend_example.py` | Launch Cognee UI |
@@ -0,0 +1,89 @@
import os
import pathlib
import asyncio

import cognee
from cognee.modules.search.types import SearchType


async def main():
    """
    Example script demonstrating how to use Cognee with ChromaDB

    This example:
    1. Configures Cognee to use ChromaDB as vector database
    2. Sets up data directories
    3. Adds sample data to Cognee
    4. Processes (cognifies) the data
    5. Performs different types of searches
    """
    # Configure ChromaDB as the vector database provider
    cognee.config.set_vector_db_config(
        {
            "vector_db_url": "http://localhost:8000",  # Default ChromaDB server URL
            "vector_db_key": "",  # ChromaDB doesn't require an API key by default
            "vector_db_provider": "chromadb",  # Specify ChromaDB as provider
        }
    )

    # Set up data directories for storing documents and system files
    # You should adjust these paths to your needs
    current_dir = pathlib.Path(__file__).parent
    data_directory_path = str(current_dir / "data_storage")
    cognee.config.data_root_directory(data_directory_path)

    cognee_directory_path = str(current_dir / "cognee_system")
    cognee.config.system_root_directory(cognee_directory_path)

    # Clean any existing data (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create a dataset
    dataset_name = "chromadb_example"

    # Add sample text to the dataset
    sample_text = """ChromaDB is an open-source embedding database.
It allows users to store and query embeddings and their associated metadata.
ChromaDB can be deployed in various ways: in-memory, on disk via sqlite, or as a persistent service.
It is designed to be fast, scalable, and easy to use, making it a popular choice for AI applications.
The database is built to handle vector search efficiently, which is essential for semantic search applications.
ChromaDB supports multiple distance metrics for vector similarity search and can be integrated with various ML frameworks."""

    # Add the sample text to the dataset
    await cognee.add([sample_text], dataset_name)

    # Process the added document to extract knowledge
    await cognee.cognify([dataset_name])

    # Now let's perform some searches
    # 1. Search for insights related to "ChromaDB"
    insights_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="ChromaDB"
    )
    print("\nInsights about ChromaDB:")
    for result in insights_results:
        print(f"- {result}")

    # 2. Search for text chunks related to "vector search"
    chunks_results = await cognee.search(
        query_type=SearchType.CHUNKS, query_text="vector search", datasets=[dataset_name]
    )
    print("\nChunks about vector search:")
    for result in chunks_results:
        print(f"- {result}")

    # 3. Get graph completion related to databases
    graph_completion_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="database"
    )
    print("\nGraph completion for databases:")
    for result in graph_completion_results:
        print(f"- {result}")

    # Clean up (optional)
    # await cognee.prune.prune_data()
    # await cognee.prune.prune_system(metadata=True)


if __name__ == "__main__":
    asyncio.run(main())
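The ChromaDB example above assumes a server is already listening at `http://localhost:8000`. One quick way to provide that is the official `chromadb/chroma` Docker image (the image name and flags are an assumption about your environment; adjust to your own deployment):

```shell
# Start a local ChromaDB server on the default port 8000
docker run -d --name chroma -p 8000:8000 chromadb/chroma
```

With the container running, the script's `vector_db_url` needs no changes.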
@@ -0,0 +1,87 @@
import os
import pathlib
import asyncio

import cognee
from cognee.modules.search.types import SearchType


async def main():
    """
    Example script demonstrating how to use Cognee with KuzuDB

    This example:
    1. Configures Cognee to use KuzuDB as graph database
    2. Sets up data directories
    3. Adds sample data to Cognee
    4. Processes (cognifies) the data
    5. Performs different types of searches
    """
    # Configure KuzuDB as the graph database provider
    cognee.config.set_graph_db_config(
        {
            "graph_database_provider": "kuzu",  # Specify KuzuDB as provider
        }
    )

    # Set up data directories for storing documents and system files
    # You should adjust these paths to your needs
    current_dir = pathlib.Path(__file__).parent
    data_directory_path = str(current_dir / "data_storage")
    cognee.config.data_root_directory(data_directory_path)

    cognee_directory_path = str(current_dir / "cognee_system")
    cognee.config.system_root_directory(cognee_directory_path)

    # Clean any existing data (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create a dataset
    dataset_name = "kuzu_example"

    # Add sample text to the dataset
    sample_text = """KuzuDB is a graph database system optimized for running complex graph analytics.
It is designed to be a high-performance graph database for data science workloads.
KuzuDB is built with modern hardware optimizations in mind.
It provides support for property graphs and offers a Cypher-like query language.
KuzuDB can handle both transactional and analytical graph workloads.
The database now includes vector search capabilities for AI applications and semantic search."""

    # Add the sample text to the dataset
    await cognee.add([sample_text], dataset_name)

    # Process the added document to extract knowledge
    await cognee.cognify([dataset_name])

    # Now let's perform some searches
    # 1. Search for insights related to "KuzuDB"
    insights_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="KuzuDB"
    )
    print("\nInsights about KuzuDB:")
    for result in insights_results:
        print(f"- {result}")

    # 2. Search for text chunks related to "graph database"
    chunks_results = await cognee.search(
        query_type=SearchType.CHUNKS, query_text="graph database", datasets=[dataset_name]
    )
    print("\nChunks about graph database:")
    for result in chunks_results:
        print(f"- {result}")

    # 3. Get graph completion related to databases
    graph_completion_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="database"
    )
    print("\nGraph completion for databases:")
    for result in graph_completion_results:
        print(f"- {result}")

    # Clean up (optional)
    # await cognee.prune.prune_data()
    # await cognee.prune.prune_system(metadata=True)


if __name__ == "__main__":
    asyncio.run(main())
@@ -0,0 +1,96 @@
import os
import pathlib
import asyncio

import cognee
from cognee.modules.search.types import SearchType


async def main():
    """
    Example script demonstrating how to use Cognee with Neo4j

    This example:
    1. Configures Cognee to use Neo4j as graph database
    2. Sets up data directories
    3. Adds sample data to Cognee
    4. Processes (cognifies) the data
    5. Performs different types of searches
    """

    # Set up Neo4j credentials in .env file and get the values from environment variables
    neo4j_url = os.getenv("GRAPH_DATABASE_URL")
    neo4j_user = os.getenv("GRAPH_DATABASE_USERNAME")
    neo4j_pass = os.getenv("GRAPH_DATABASE_PASSWORD")

    # Configure Neo4j as the graph database provider
    cognee.config.set_graph_db_config(
        {
            "graph_database_url": neo4j_url,  # Neo4j Bolt URL
            "graph_database_provider": "neo4j",  # Specify Neo4j as provider
            "graph_database_username": neo4j_user,  # Neo4j username
            "graph_database_password": neo4j_pass,  # Neo4j password
        }
    )

    # Set up data directories for storing documents and system files
    # You should adjust these paths to your needs
    current_dir = pathlib.Path(__file__).parent
    data_directory_path = str(current_dir / "data_storage")
    cognee.config.data_root_directory(data_directory_path)

    cognee_directory_path = str(current_dir / "cognee_system")
    cognee.config.system_root_directory(cognee_directory_path)

    # Clean any existing data (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create a dataset
    dataset_name = "neo4j_example"

    # Add sample text to the dataset
    sample_text = """Neo4j is a graph database management system.
It stores data in nodes and relationships rather than tables as in traditional relational databases.
Neo4j provides a powerful query language called Cypher for graph traversal and analysis.
It now supports vector indexing for similarity search with the vector index plugin.
Neo4j allows embedding generation and vector search to be combined with graph operations.
Applications can use Neo4j to connect vector search with graph context for more meaningful results."""

    # Add the sample text to the dataset
    await cognee.add([sample_text], dataset_name)

    # Process the added document to extract knowledge
    await cognee.cognify([dataset_name])

    # Now let's perform some searches
    # 1. Search for insights related to "Neo4j"
    insights_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="Neo4j"
    )
    print("\nInsights about Neo4j:")
    for result in insights_results:
        print(f"- {result}")

    # 2. Search for text chunks related to "graph database"
    chunks_results = await cognee.search(
        query_type=SearchType.CHUNKS, query_text="graph database", datasets=[dataset_name]
    )
    print("\nChunks about graph database:")
    for result in chunks_results:
        print(f"- {result}")

    # 3. Get graph completion related to databases
    graph_completion_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="database"
    )
    print("\nGraph completion for databases:")
    for result in graph_completion_results:
        print(f"- {result}")

    # Clean up (optional)
    # await cognee.prune.prune_data()
    # await cognee.prune.prune_system(metadata=True)


if __name__ == "__main__":
    asyncio.run(main())
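The Neo4j example reads its connection settings from environment variables. A matching `.env` file might look like the following sketch; the values are placeholders (the Bolt URL, username, and password must match your own Neo4j instance), only the variable names come from the script itself:

```shell
# .env — placeholder values for the Neo4j example
GRAPH_DATABASE_URL=bolt://localhost:7687
GRAPH_DATABASE_USERNAME=neo4j
GRAPH_DATABASE_PASSWORD=your_password
```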
@@ -0,0 +1,110 @@
import base64
import json
import os
import pathlib
import asyncio

import cognee
from cognee.modules.search.types import SearchType
from dotenv import load_dotenv

load_dotenv()


async def main():
    """
    Example script demonstrating how to use Cognee with Amazon Neptune Analytics

    This example:
    1. Configures Cognee to use Neptune Analytics as graph database
    2. Sets up data directories
    3. Adds sample data to Cognee
    4. Processes/cognifies the data
    5. Performs different types of searches
    """

    # Set up Amazon credentials in .env file and get the values from environment variables
    graph_endpoint_url = "neptune-graph://" + os.getenv("GRAPH_ID", "")

    # Configure Neptune Analytics as the graph & vector database provider
    cognee.config.set_graph_db_config(
        {
            "graph_database_provider": "neptune_analytics",  # Specify Neptune Analytics as provider
            "graph_database_url": graph_endpoint_url,  # Neptune Analytics endpoint with the format neptune-graph://<GRAPH_ID>
        }
    )
    cognee.config.set_vector_db_config(
        {
            "vector_db_provider": "neptune_analytics",  # Specify Neptune Analytics as provider
            "vector_db_url": graph_endpoint_url,  # Neptune Analytics endpoint with the format neptune-graph://<GRAPH_ID>
        }
    )

    # Set up data directories for storing documents and system files
    # You should adjust these paths to your needs
    current_dir = pathlib.Path(__file__).parent
    data_directory_path = str(current_dir / "data_storage")
    cognee.config.data_root_directory(data_directory_path)

    cognee_directory_path = str(current_dir / "cognee_system")
    cognee.config.system_root_directory(cognee_directory_path)

    # Clean any existing data (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create a dataset
    dataset_name = "neptune_example"

    # Add sample text to the dataset
    sample_text_1 = """Neptune Analytics is a memory-optimized graph database engine for analytics. With Neptune
Analytics, you can get insights and find trends by processing large amounts of graph data in seconds. To analyze
graph data quickly and easily, Neptune Analytics stores large graph datasets in memory. It supports a library of
optimized graph analytic algorithms, low-latency graph queries, and vector search capabilities within graph
traversals.
"""

    sample_text_2 = """Neptune Analytics is an ideal choice for investigatory, exploratory, or data-science workloads
that require fast iteration for data, analytical and algorithmic processing, or vector search on graph data. It
complements Amazon Neptune Database, a popular managed graph database. To perform intensive analysis, you can load
the data from a Neptune Database graph or snapshot into Neptune Analytics. You can also load graph data that's
stored in Amazon S3.
"""

    # Add the sample text to the dataset
    await cognee.add([sample_text_1, sample_text_2], dataset_name)

    # Process the added document to extract knowledge
    await cognee.cognify([dataset_name])

    # Now let's perform some searches
    # 1. Search for insights related to "Neptune Analytics"
    insights_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="Neptune Analytics"
    )
    print("\n========Insights about Neptune Analytics========:")
    for result in insights_results:
        print(f"- {result}")

    # 2. Search for text chunks related to "graph database"
    chunks_results = await cognee.search(
        query_type=SearchType.CHUNKS, query_text="graph database", datasets=[dataset_name]
    )
    print("\n========Chunks about graph database========:")
    for result in chunks_results:
        print(f"- {result}")

    # 3. Get graph completion related to databases
    graph_completion_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="database"
    )
    print("\n========Graph completion for databases========:")
    for result in graph_completion_results:
        print(f"- {result}")

    # Clean up (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)


if __name__ == "__main__":
    asyncio.run(main())
@@ -0,0 +1,101 @@
import os
import pathlib
import asyncio

import cognee
from cognee.modules.search.types import SearchType


async def main():
    """
    Example script demonstrating how to use Cognee with PGVector

    This example:
    1. Configures Cognee to use PostgreSQL with PGVector extension as vector database
    2. Sets up data directories
    3. Adds sample data to Cognee
    4. Processes (cognifies) the data
    5. Performs different types of searches
    """
    # Configure PGVector as the vector database provider
    cognee.config.set_vector_db_config(
        {
            "vector_db_provider": "pgvector",  # Specify PGVector as provider
        }
    )

    # Configure PostgreSQL connection details
    # These settings are required for PGVector
    cognee.config.set_relational_db_config(
        {
            "db_path": "",
            "db_name": "cognee_db",
            "db_host": "127.0.0.1",
            "db_port": "5432",
            "db_username": "cognee",
            "db_password": "cognee",
            "db_provider": "postgres",
        }
    )

    # Set up data directories for storing documents and system files
    # You should adjust these paths to your needs
    current_dir = pathlib.Path(__file__).parent
    data_directory_path = str(current_dir / "data_storage")
    cognee.config.data_root_directory(data_directory_path)

    cognee_directory_path = str(current_dir / "cognee_system")
    cognee.config.system_root_directory(cognee_directory_path)

    # Clean any existing data (optional)
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create a dataset
    dataset_name = "pgvector_example"

    # Add sample text to the dataset
    sample_text = """PGVector is an extension for PostgreSQL that adds vector similarity search capabilities.
It supports multiple indexing methods, including IVFFlat, HNSW, and brute-force search.
PGVector allows you to store vector embeddings directly in your PostgreSQL database.
It provides distance functions like L2 distance, inner product, and cosine distance.
Using PGVector, you can perform both metadata filtering and vector similarity search in a single query.
The extension is often used for applications like semantic search, recommendations, and image similarity."""

    # Add the sample text to the dataset
    await cognee.add([sample_text], dataset_name)

    # Process the added document to extract knowledge
    await cognee.cognify([dataset_name])

    # Now let's perform some searches
    # 1. Search for insights related to "PGVector"
    insights_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="PGVector"
    )
    print("\nInsights about PGVector:")
    for result in insights_results:
        print(f"- {result}")

    # 2. Search for text chunks related to "vector similarity"
    chunks_results = await cognee.search(
        query_type=SearchType.CHUNKS, query_text="vector similarity", datasets=[dataset_name]
    )
    print("\nChunks about vector similarity:")
    for result in chunks_results:
        print(f"- {result}")

    # 3. Get graph completion related to databases
    graph_completion_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text="database"
    )
    print("\nGraph completion for databases:")
    for result in graph_completion_results:
        print(f"- {result}")

    # Clean up (optional)
    # await cognee.prune.prune_data()
    # await cognee.prune.prune_system(metadata=True)


if __name__ == "__main__":
    asyncio.run(main())
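The PGVector example expects a local PostgreSQL server with the `pgvector` extension and the credentials shown in its relational-db config. One way to provide that, sketched with the community `pgvector/pgvector` Docker image (the image name and tag are assumptions; any Postgres with the extension installed works):

```shell
# Start Postgres with the pgvector extension, matching the example's credentials
docker run -d --name cognee-postgres \
  -e POSTGRES_USER=cognee \
  -e POSTGRES_PASSWORD=cognee \
  -e POSTGRES_DB=cognee_db \
  -p 5432:5432 \
  pgvector/pgvector:pg17
```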
@@ -0,0 +1,5 @@
"""
S3 Storage Configuration Example

Reference: https://docs.cognee.ai/guides/s3-storage
"""
@@ -0,0 +1,6 @@
"""
Distributed Execution with Modal Example

Reference: https://docs.cognee.ai/guides/distributed-execution

"""
@ -0,0 +1,5 @@
"""
Azure OpenAI Embedding Setup Example

Reference: https://docs.cognee.ai/setup-configuration/embedding-providers
"""

@ -0,0 +1,5 @@
"""
OpenAI Embedding Setup Example

Reference: https://docs.cognee.ai/setup-configuration/embedding-providers
"""

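A minimal sketch of what the OpenAI embedding setup might look like, with environment variables modeled on cognee's `.env.template`; treat the exact variable names and model string as assumptions and confirm them against the linked embedding-providers page:

```python
import os

# Variable names modeled on cognee's .env.template -- confirm against the docs
os.environ["EMBEDDING_PROVIDER"] = "openai"
os.environ["EMBEDDING_MODEL"] = "openai/text-embedding-3-large"  # example model string
os.environ["EMBEDDING_API_KEY"] = "your-api-key"

print("Embedding provider:", os.environ["EMBEDDING_PROVIDER"])
```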
@ -0,0 +1,5 @@
"""
Azure OpenAI Setup Example

Reference: https://docs.cognee.ai/setup-configuration/llm-providers
"""

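A minimal sketch of what an Azure OpenAI LLM setup might look like, again using environment variables modeled on cognee's `.env.template`; the deployment name, endpoint, and API version below are placeholders, and the variable names themselves should be verified against the linked llm-providers page:

```python
import os

# Assumed variable names, modeled on cognee's .env.template -- verify before use
os.environ["LLM_PROVIDER"] = "openai"  # Azure OpenAI uses the OpenAI-compatible path
os.environ["LLM_MODEL"] = "azure/your-deployment-name"  # hypothetical deployment name
os.environ["LLM_ENDPOINT"] = "https://your-resource.openai.azure.com/"
os.environ["LLM_API_KEY"] = "your-azure-api-key"
os.environ["LLM_API_VERSION"] = "2024-02-01"  # example API version

print("LLM model:", os.environ["LLM_MODEL"])
```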
@ -0,0 +1,5 @@
"""
OpenAI Setup Example

Reference: https://docs.cognee.ai/setup-configuration/llm-providers
"""
Binary file not shown.

@ -0,0 +1,228 @@
import os
import cognee
import pathlib

from cognee.modules.users.exceptions import PermissionDeniedError
from cognee.modules.users.tenants.methods import select_tenant
from cognee.shared.logging_utils import get_logger
from cognee.modules.search.types import SearchType
from cognee.modules.users.methods import create_user
from cognee.modules.users.permissions.methods import authorized_give_permission_on_datasets
from cognee.modules.users.roles.methods import add_user_to_role
from cognee.modules.users.roles.methods import create_role
from cognee.modules.users.tenants.methods import create_tenant
from cognee.modules.users.tenants.methods import add_user_to_tenant
from cognee.modules.engine.operations.setup import setup
from cognee.shared.logging_utils import setup_logging, CRITICAL

logger = get_logger()


async def main():
    # ENABLE PERMISSIONS FEATURE
    # Note: When ENABLE_BACKEND_ACCESS_CONTROL is enabled, the vector provider is automatically set to LanceDB
    # and the graph provider is set to Kuzu.
    os.environ["ENABLE_BACKEND_ACCESS_CONTROL"] = "True"

    # Set the rest of your environment variables as needed. By default OpenAI is used as the LLM provider.
    # Reference the .env.template file for available options and how to change the LLM provider: https://github.com/topoteretes/cognee/blob/main/.env.template
    # For example, to set your OpenAI LLM API key use:
    # os.environ["LLM_API_KEY"] = "your-api-key"

    # Create a clean slate for cognee -- reset data and system state
    print("Resetting cognee data...")
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)
    print("Data reset complete.\n")

    # Set up the necessary databases and tables for user management.
    await setup()

    # NOTE: When a document is added in cognee with permissions enabled, only the owner of the document
    # initially has permission to work with it.
    # Add a document for user_1 under the dataset name AI
    explanation_file_path = os.path.join(
        pathlib.Path(__file__).parent, "data/artificial_intelligence.pdf"
    )

    print("Creating user_1: user_1@example.com")
    user_1 = await create_user("user_1@example.com", "example")
    await cognee.add([explanation_file_path], dataset_name="AI", user=user_1)

    # Add a document for user_2 under the dataset name QUANTUM
    text = """A quantum computer is a computer that takes advantage of quantum mechanical phenomena.
    At small scales, physical matter exhibits properties of both particles and waves, and quantum computing leverages
    this behavior, specifically quantum superposition and entanglement, using specialized hardware that supports the
    preparation and manipulation of quantum states.
    """
    print("\nCreating user_2: user_2@example.com")
    user_2 = await create_user("user_2@example.com", "example")
    await cognee.add([text], dataset_name="QUANTUM", user=user_2)

    # Run cognify for both datasets as the appropriate user/owner
    print("\nCreating different datasets for user_1 (AI dataset) and user_2 (QUANTUM dataset)")
    ai_cognify_result = await cognee.cognify(["AI"], user=user_1)
    quantum_cognify_result = await cognee.cognify(["QUANTUM"], user=user_2)

    # Extract dataset_ids from cognify results
    def extract_dataset_id_from_cognify(cognify_result):
        """Extract dataset_id from the cognify output dictionary"""
        for dataset_id, pipeline_result in cognify_result.items():
            return dataset_id  # Return the first dataset_id
        return None

    # Get dataset IDs from cognify results
    # Note: When we want to work with datasets from other users (search, add, cognify, etc.) we must supply dataset
    # information through dataset_id; using a dataset name only looks for datasets owned by the current user.
    ai_dataset_id = extract_dataset_id_from_cognify(ai_cognify_result)
    quantum_dataset_id = extract_dataset_id_from_cognify(quantum_cognify_result)

    # We can see here that user_1 can read his own dataset (AI dataset)
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text="What is in the document?",
        user=user_1,
        datasets=[ai_dataset_id],
    )
    print("\nSearch results as user_1 on dataset owned by user_1:")
    for result in search_results:
        print(f"{result}\n")

    # But user_1 can't read the dataset owned by user_2 (QUANTUM dataset)
    print("\nSearch result as user_1 on the dataset owned by user_2:")
    try:
        search_results = await cognee.search(
            query_type=SearchType.GRAPH_COMPLETION,
            query_text="What is in the document?",
            user=user_1,
            datasets=[quantum_dataset_id],
        )
    except PermissionDeniedError:
        print(f"User: {user_1} does not have permission to read from dataset: QUANTUM")

    # user_1 currently also can't add a document to user_2's dataset (QUANTUM dataset)
    print("\nAttempting to add new data as user_1 to dataset owned by user_2:")
    try:
        await cognee.add(
            [explanation_file_path],
            dataset_id=quantum_dataset_id,
            user=user_1,
        )
    except PermissionDeniedError:
        print(f"User: {user_1} does not have permission to write to dataset: QUANTUM")

    # We've shown that user_1 can't interact with the dataset from user_2.
    # Now have user_2 give user_1 the proper permission to read the QUANTUM dataset.
    # Note: supported permission types are "read", "write", "delete" and "share"
    print(
        "\nOperation started as user_2 to give read permission to user_1 for the dataset owned by user_2"
    )
    await authorized_give_permission_on_datasets(
        user_1.id,
        [quantum_dataset_id],
        "read",
        user_2.id,
    )

    # Now user_1 can read from the QUANTUM dataset after proper permissions have been assigned by its owner.
    print("\nSearch result as user_1 on the dataset owned by user_2:")
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text="What is in the document?",
        user=user_1,
        dataset_ids=[quantum_dataset_id],
    )
    for result in search_results:
        print(f"{result}\n")

    # If we'd like user_1 to add new documents to the QUANTUM dataset owned by user_2, user_1 would have to get
    # "write" access permission, which user_1 currently does not have.

    # Users can also be added to Roles and Tenants, and permissions can then be assigned on a Role/Tenant level as well.
    # To create a Role, a user first must be an owner of a Tenant.
    print("User 2 is creating CogneeLab tenant/organization")
    tenant_id = await create_tenant("CogneeLab", user_2.id)

    print("User 2 is selecting CogneeLab tenant/organization as active tenant")
    await select_tenant(user_id=user_2.id, tenant_id=tenant_id)

    print("\nUser 2 is creating Researcher role")
    role_id = await create_role(role_name="Researcher", owner_id=user_2.id)

    print("\nCreating user_3: user_3@example.com")
    user_3 = await create_user("user_3@example.com", "example")

    # To be added to a role, a user must be part of the same tenant/organization.
    print("\nOperation started as user_2 to add user_3 to CogneeLab tenant/organization")
    await add_user_to_tenant(user_id=user_3.id, tenant_id=tenant_id, owner_id=user_2.id)

    print(
        "\nOperation started by user_2, as tenant owner, to add user_3 to Researcher role inside the tenant/organization"
    )
    await add_user_to_role(user_id=user_3.id, role_id=role_id, owner_id=user_2.id)

    print("\nOperation as user_3 to select CogneeLab tenant/organization as active tenant")
    await select_tenant(user_id=user_3.id, tenant_id=tenant_id)

    print(
        "\nOperation started as user_2, with CogneeLab as its active tenant, to give read permission to Researcher role for the dataset QUANTUM owned by user_2"
    )
    # Even though the dataset owner is user_2, the dataset doesn't belong to the tenant/organization CogneeLab,
    # so we can't assign permissions to it while acting in the CogneeLab tenant.
    try:
        await authorized_give_permission_on_datasets(
            role_id,
            [quantum_dataset_id],
            "read",
            user_2.id,
        )
    except PermissionDeniedError:
        print(
            "User 2 could not give permission to the role as the QUANTUM dataset is not part of the CogneeLab tenant"
        )

    print(
        "We will now create a new QUANTUM dataset with the QUANTUM_COGNEE_LAB name in the CogneeLab tenant so that permissions can be assigned to the Researcher role inside the tenant/organization"
    )
    # We can re-create the QUANTUM dataset in the CogneeLab tenant. The old QUANTUM dataset is still owned by user_2 personally
    # and can still be accessed by selecting the personal tenant for user 2.
    from cognee.modules.users.methods import get_user

    # Note: We need to re-fetch user_2 from the database to refresh its tenant context
    user_2 = await get_user(user_2.id)
    await cognee.add([text], dataset_name="QUANTUM_COGNEE_LAB", user=user_2)
    quantum_cognee_lab_cognify_result = await cognee.cognify(["QUANTUM_COGNEE_LAB"], user=user_2)

    # The recreated QUANTUM dataset will now have a different dataset_id, as it's a new dataset in a different organization
    quantum_cognee_lab_dataset_id = extract_dataset_id_from_cognify(
        quantum_cognee_lab_cognify_result
    )
    print(
        "\nOperation started as user_2, with CogneeLab as its active tenant, to give read permission to Researcher role for the dataset QUANTUM owned by the CogneeLab tenant"
    )
    await authorized_give_permission_on_datasets(
        role_id,
        [quantum_cognee_lab_dataset_id],
        "read",
        user_2.id,
    )

    # Now user_3 can read from the QUANTUM dataset as part of the Researcher role, after proper permissions
    # have been assigned by the dataset owner, user_2.
    print("\nSearch result as user_3 on the QUANTUM dataset owned by the CogneeLab organization:")
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text="What is in the document?",
        user=user_3,
        dataset_ids=[quantum_cognee_lab_dataset_id],
    )
    for result in search_results:
        print(f"{result}\n")

    # Note: All of these function calls and the permission system are available through our backend endpoints as well


if __name__ == "__main__":
    import asyncio

    logger = setup_logging(log_level=CRITICAL)
    asyncio.run(main())

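The `extract_dataset_id_from_cognify` helper above simply takes the first key of the dictionary that `cognify` returns (dataset UUIDs mapped to pipeline run summaries). The same pattern in isolation, with a dummy result dict standing in for real pipeline output:

```python
from uuid import UUID


def extract_dataset_id(cognify_result: dict):
    """Return the first dataset_id key, or None if the result is empty."""
    return next(iter(cognify_result), None)


# Dummy stand-in for a cognify result: {dataset_id: pipeline_run_summary}
dummy_result = {UUID("12345678-1234-5678-1234-567812345678"): {"status": "completed"}}
print(extract_dataset_id(dummy_result))  # the UUID key
print(extract_dataset_id({}))  # None
```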
@ -0,0 +1,5 @@
"""
BAML Setup Example

Reference: https://docs.cognee.ai/setup-configuration/structured-output-backends
"""

@ -0,0 +1,5 @@
"""
Litellm Instructor Setup Example

Reference: https://docs.cognee.ai/setup-configuration/structured-output-backends
"""

@ -0,0 +1,203 @@
import os
import logging
import cognee
import asyncio

from cognee.infrastructure.llm.LLMGateway import LLMGateway
from dotenv import load_dotenv
from cognee.api.v1.search import SearchType
from cognee.modules.engine.models import NodeSet
from cognee.shared.logging_utils import setup_logging


load_dotenv()

os.environ["LLM_API_KEY"] = ""
# Note: cognee's NodeSet feature only works with the Kuzu and Neo4j graph databases
os.environ["GRAPH_DATABASE_PROVIDER"] = "kuzu"


class ProcurementMemorySystem:
    """Procurement system with persistent memory using Cognee"""

    async def setup_memory_data(self):
        """Load and store procurement data in memory"""

        # Procurement system dummy data
        vendor_conversation_text_techsupply = """
        Assistant: Hello! This is Sarah from TechSupply Solutions.
        Thanks for reaching out for your IT procurement needs.

        User: We're looking to procure 50 high-performance enterprise laptops.
        Specs: Intel i7, 16GB RAM, 512GB SSD, dedicated graphics card.
        Budget: $80,000. What models do you have?

        Assistant: TechSupply Solutions can offer Dell Precision 5570 ($1,450) and Lenovo ThinkPad P1 ($1,550).
        Both come with a 3-year warranty. Delivery: 2–3 weeks (Dell), 3–4 weeks (Lenovo).

        User: Do you provide bulk discounts? We're planning another 200 units next quarter.

        Assistant: Yes! Orders over $50,000 get 8% off.
        So for your current order:
        - Dell = $1,334 each ($66,700 total)
        - Lenovo = $1,426 each ($71,300 total)

        And for 200 units next quarter, we can offer 12% off with flexible delivery.
        """

        vendor_conversation_text_office_solutions = """
        Assistant: Hi, this is Martin from vendor Office Solutions. How can we assist you?

        User: We need 50 laptops for our engineers.
        Specs: i7 CPU, 16GB RAM, 512GB SSD, dedicated GPU.
        We can spend up to $80,000. Can you meet this?

        Assistant: Office Solutions can offer HP ZBook Power G9 for $1,600 each.
        Comes with 2-year warranty, delivery time is 4–5 weeks.

        User: That's a bit long — any options to speed it up?

        Assistant: We can expedite for $75 per unit, bringing delivery to 3–4 weeks.
        Also, for orders over $60,000 we give 6% off.

        So:
        - Base price = $1,600 → $1,504 with discount
        - Expedited price = $1,579

        User: Understood. Any room for better warranty terms?

        Assistant: We're working on adding a 3-year warranty option next quarter for enterprise clients.
        """

        previous_purchases_text = """
        Previous Purchase Records:
        1. Vendor: TechSupply Solutions
           Item: Desktop computers - 25 units
           Amount: $35,000
           Date: 2024-01-15
           Performance: Excellent delivery, good quality, delivered 2 days early
           Rating: 5/5
           Notes: Responsive support team, competitive pricing

        2. Vendor: Office Solutions
           Item: Office furniture
           Amount: $12,000
           Date: 2024-02-20
           Performance: Delayed delivery by 1 week, average quality
           Rating: 2/5
           Notes: Poor communication, but acceptable product quality
        """

        procurement_preferences_text = """
        Procurement Policies and Preferences:
        1. Preferred vendors must have 3+ year warranty coverage
        2. Maximum delivery time: 30 days for non-critical items
        3. Bulk discount requirements: minimum 5% for orders over $50,000
        4. Prioritize vendors with sustainable/green practices
        5. Vendor rating system: require minimum 4/5 rating for new contracts
        """

        # Initializing and pruning databases
        await cognee.prune.prune_data()
        await cognee.prune.prune_system(metadata=True)

        # Store data in different memory categories
        await cognee.add(
            data=[vendor_conversation_text_techsupply, vendor_conversation_text_office_solutions],
            node_set=["vendor_conversations"],
        )

        await cognee.add(data=previous_purchases_text, node_set=["purchase_history"])

        await cognee.add(data=procurement_preferences_text, node_set=["procurement_policies"])

        # Process all data through Cognee's knowledge graph
        await cognee.cognify()

    async def search_memory(self, query, search_categories=None):
        """Search across different memory layers"""
        results = {}
        for category in search_categories:
            category_results = await cognee.search(
                query_type=SearchType.GRAPH_COMPLETION,
                query_text=query,
                node_type=NodeSet,
                node_name=[category],
                top_k=30,
            )
            results[category] = category_results

        return results


async def run_procurement_example():
    """Main function demonstrating the procurement memory system"""
    print("Building AI Procurement System with Memory: Cognee Integration...\n")

    # Initialize the procurement memory system
    procurement_system = ProcurementMemorySystem()

    # Set up memory with procurement data
    print("Setting up procurement memory data...")
    await procurement_system.setup_memory_data()
    print("Memory successfully populated and processed.\n")

    research_questions = {
        "vendor_conversations": [
            "What are the laptops that are discussed, together with their vendors?",
            "What pricing was offered by each vendor before and after discounts?",
            "What were the delivery time estimates for each product?",
        ],
        "purchase_history": [
            "Which vendors have we worked with in the past?",
            "What were the satisfaction ratings for each vendor?",
            "Were there any complaints or red flags associated with specific vendors?",
        ],
        "procurement_policies": [
            "What are our company's bulk discount requirements?",
            "What is the maximum acceptable delivery time for non-critical items?",
            "What is the minimum vendor rating for new contracts?",
        ],
    }

    research_notes = {}
    print("Running contextual research questions...\n")
    for category, questions in research_questions.items():
        print(f"Category: {category}")
        research_notes[category] = []
        for q in questions:
            print(f"Question: \n{q}")
            results = await procurement_system.search_memory(q, search_categories=[category])
            top_answer = results[category][0]
            print(f"Answer: \n{top_answer.strip()}\n")
            research_notes[category].append({"question": q, "answer": top_answer})

    print("Contextual research complete.\n")

    print("Compiling structured research information for decision-making...\n")
    research_information = "\n\n".join(
        f"Q: {note['question']}\nA: {note['answer'].strip()}"
        for section in research_notes.values()
        for note in section
    )

    print("Compiled Research Summary:\n")
    print(research_information)
    print("\nPassing research to LLM for final procurement recommendation...\n")

    final_decision = await LLMGateway.acreate_structured_output(
        text_input=research_information,
        system_prompt="""You are a procurement decision assistant. Use the provided QA pairs that were collected through a research phase. Recommend the best vendor,
        based on pricing, delivery, warranty, policy fit, and past performance. Be concise and justify your choice with evidence.
        """,
        response_model=str,
    )

    print("Final Decision:")
    print(final_decision.strip())


# Run the example
if __name__ == "__main__":
    setup_logging(logging.ERROR)
    asyncio.run(run_procurement_example())

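The `research_information` join above flattens the per-category research notes into one prompt string for the final LLM call. The same pattern in isolation, with toy notes:

```python
# Toy notes in the same shape the example builds: {category: [{"question", "answer"}]}
research_notes = {
    "purchase_history": [
        {"question": "Which vendors have we used?", "answer": "TechSupply Solutions. "},
    ],
    "procurement_policies": [
        {"question": "Minimum vendor rating?", "answer": "4/5."},
    ],
}

# Flatten every (question, answer) pair into one Q/A-formatted prompt string
research_information = "\n\n".join(
    f"Q: {note['question']}\nA: {note['answer'].strip()}"
    for section in research_notes.values()
    for note in section
)
print(research_information)
```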
@ -0,0 +1,58 @@
import argparse
import asyncio
import cognee
from cognee import SearchType
from cognee.shared.logging_utils import setup_logging, ERROR

from cognee.api.v1.cognify.code_graph_pipeline import run_code_graph_pipeline


async def main(repo_path, include_docs):
    run_status = False
    async for run_status in run_code_graph_pipeline(repo_path, include_docs=include_docs):
        run_status = run_status

    # Test CODE search
    search_results = await cognee.search(query_type=SearchType.CODE, query_text="test")
    assert len(search_results) != 0, "The search results list is empty."
    print("\n\nSearch results are:\n")
    for result in search_results:
        print(f"{result}\n")

    return run_status


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--repo_path", type=str, required=True, help="Path to the repository")
    parser.add_argument(
        "--include_docs",
        type=lambda x: x.lower() in ("true", "1"),
        default=False,
        help="Whether or not to process non-code files",
    )
    parser.add_argument(
        "--time",
        type=lambda x: x.lower() in ("true", "1"),
        default=True,
        help="Whether or not to time the pipeline run",
    )
    return parser.parse_args()


if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)

    args = parse_args()

    if args.time:
        import time

        start_time = time.time()
        asyncio.run(main(args.repo_path, args.include_docs))
        end_time = time.time()
        print("\n" + "=" * 50)
        print(f"Pipeline Execution Time: {end_time - start_time:.2f} seconds")
        print("=" * 50 + "\n")
    else:
        asyncio.run(main(args.repo_path, args.include_docs))

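The `type=lambda x: x.lower() in ("true", "1")` trick above is needed because `type=bool` would treat any non-empty string (including `"false"`) as truthy. A minimal demonstration of how the lambda parses flag values:

```python
import argparse

parser = argparse.ArgumentParser()
# Same truthy-string parsing as the pipeline script: "true"/"1" (any case) -> True
parser.add_argument("--include_docs", type=lambda x: x.lower() in ("true", "1"), default=False)

print(parser.parse_args(["--include_docs", "True"]).include_docs)  # True
print(parser.parse_args(["--include_docs", "no"]).include_docs)  # False
print(parser.parse_args([]).include_docs)  # False (default)
```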
@ -0,0 +1,84 @@
import asyncio
import cognee
from cognee.modules.engine.operations.setup import setup
from cognee.modules.users.methods import get_default_user
from cognee.shared.logging_utils import setup_logging, INFO
from cognee.modules.pipelines import Task
from cognee.api.v1.search import SearchType

# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
#    LLM_API_KEY = "your_key_here"


async def main():
    # Create a clean slate for cognee -- reset data and system state
    print("Resetting cognee data...")
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)
    print("Data reset complete.\n")

    # Create relational database and tables
    await setup()

    # The cognee knowledge graph will be created based on this text
    text = """
    Natural language processing (NLP) is an interdisciplinary
    subfield of computer science and information retrieval.
    """

    print("Adding text to cognee:")
    print(text.strip())

    # Let's recreate the cognee add pipeline through the custom pipeline framework
    from cognee.tasks.ingestion import ingest_data, resolve_data_directories

    user = await get_default_user()

    # Values for tasks need to be filled in before calling the pipeline
    add_tasks = [
        Task(resolve_data_directories, include_subdirectories=True),
        Task(
            ingest_data,
            "main_dataset",
            user,
        ),
    ]
    # Forward tasks to the custom pipeline along with data and user information
    await cognee.run_custom_pipeline(
        tasks=add_tasks, data=text, user=user, dataset="main_dataset", pipeline_name="add_pipeline"
    )
    print("Text added successfully.\n")

    # Use LLMs and cognee to create the knowledge graph
    from cognee.api.v1.cognify.cognify import get_default_tasks

    cognify_tasks = await get_default_tasks(user=user)
    print("Recreating existing cognify pipeline in custom pipeline to create knowledge graph...\n")
    await cognee.run_custom_pipeline(
        tasks=cognify_tasks, user=user, dataset="main_dataset", pipeline_name="cognify_pipeline"
    )
    print("Cognify process complete.\n")

    query_text = "Tell me about NLP"
    print(f"Searching cognee for insights with query: '{query_text}'")
    # Query cognee for insights on the added text
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text=query_text
    )

    print("Search results:")
    # Display results
    for result_text in search_results:
        print(result_text)


if __name__ == "__main__":
    logger = setup_logging(log_level=INFO)
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
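The `__main__` block above manages the event loop manually so that pending async generators are shut down explicitly before the process exits (`asyncio.run` does this cleanup for you; the manual form makes it visible). The same pattern in miniature, with a tiny async generator for the cleanup step to act on:

```python
import asyncio


async def ticker():
    # A tiny async generator, to give shutdown_asyncgens() something to clean up
    for i in range(3):
        yield i


async def main():
    return [i async for i in ticker()]


loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
    result = loop.run_until_complete(main())
    print(result)  # [0, 1, 2]
finally:
    # Finalize any async generators that are still alive before closing the loop
    loop.run_until_complete(loop.shutdown_asyncgens())
    loop.close()
```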
@ -0,0 +1,212 @@
|
|||
import asyncio
|
||||
|
||||
import cognee
|
||||
from cognee.api.v1.search import SearchType
|
||||
from cognee.shared.logging_utils import setup_logging, ERROR
|
||||
|
||||
job_1 = """
|
||||
CV 1: Relevant
|
||||
Name: Dr. Emily Carter
|
||||
Contact Information:
|
||||
|
||||
Email: emily.carter@example.com
|
||||
Phone: (555) 123-4567
|
||||
Summary:
|
||||
|
||||
Senior Data Scientist with over 8 years of experience in machine learning and predictive analytics. Expertise in developing advanced algorithms and deploying scalable models in production environments.
|
||||
|
||||
Education:
|
||||
|
||||
Ph.D. in Computer Science, Stanford University (2014)
|
||||
B.S. in Mathematics, University of California, Berkeley (2010)
|
||||
Experience:
|
||||
|
||||
Senior Data Scientist, InnovateAI Labs (2016 – Present)
|
||||
Led a team in developing machine learning models for natural language processing applications.
|
||||
Implemented deep learning algorithms that improved prediction accuracy by 25%.
|
||||
Collaborated with cross-functional teams to integrate models into cloud-based platforms.
|
||||
Data Scientist, DataWave Analytics (2014 – 2016)
|
||||
Developed predictive models for customer segmentation and churn analysis.
|
||||
Analyzed large datasets using Hadoop and Spark frameworks.
|
||||
Skills:
|
||||
|
||||
Programming Languages: Python, R, SQL
|
||||
Machine Learning: TensorFlow, Keras, Scikit-Learn
|
||||
Big Data Technologies: Hadoop, Spark
|
||||
Data Visualization: Tableau, Matplotlib
|
||||
"""
|
||||
|
||||
job_2 = """
|
||||
CV 2: Relevant
|
||||
Name: Michael Rodriguez
|
||||
Contact Information:
|
||||
|
||||
Email: michael.rodriguez@example.com
|
||||
Phone: (555) 234-5678
|
||||
Summary:
|
||||
|
||||
Data Scientist with a strong background in machine learning and statistical modeling. Skilled in handling large datasets and translating data into actionable business insights.
|
||||
|
||||
Education:
|
||||
|
||||
M.S. in Data Science, Carnegie Mellon University (2013)
|
||||
B.S. in Computer Science, University of Michigan (2011)
|
||||
Experience:
|
||||
|
||||
Senior Data Scientist, Alpha Analytics (2017 – Present)
|
||||
Developed machine learning models to optimize marketing strategies.
|
||||
Reduced customer acquisition cost by 15% through predictive modeling.
|
||||
Data Scientist, TechInsights (2013 – 2017)
|
||||
Analyzed user behavior data to improve product features.
|
||||
Implemented A/B testing frameworks to evaluate product changes.
|
||||
Skills:
|
||||
|
||||
Programming Languages: Python, Java, SQL
|
||||
Machine Learning: Scikit-Learn, XGBoost
|
||||
Data Visualization: Seaborn, Plotly
|
||||
Databases: MySQL, MongoDB
|
||||
"""
|
||||
|
||||
|
||||
job_3 = """
|
||||
CV 3: Relevant
|
||||
Name: Sarah Nguyen
|
||||
Contact Information:
|
||||
|
||||
Email: sarah.nguyen@example.com
|
||||
Phone: (555) 345-6789
|
||||
Summary:
|
||||
|
||||
Data Scientist specializing in machine learning with 6 years of experience. Passionate about leveraging data to drive business solutions and improve product performance.
|
||||
|
||||
Education:
|
||||
|
||||
M.S. in Statistics, University of Washington (2014)
|
||||
B.S. in Applied Mathematics, University of Texas at Austin (2012)
|
||||
Experience:
|
||||
|
||||
Data Scientist, QuantumTech (2016 – Present)
|
||||
Designed and implemented machine learning algorithms for financial forecasting.
|
||||
Improved model efficiency by 20% through algorithm optimization.
|
||||
Junior Data Scientist, DataCore Solutions (2014 – 2016)
|
||||
Assisted in developing predictive models for supply chain optimization.
|
||||
Conducted data cleaning and preprocessing on large datasets.
|
||||
Skills:
|
||||
|
||||
Programming Languages: Python, R
|
||||
Machine Learning Frameworks: PyTorch, Scikit-Learn
|
||||
Statistical Analysis: SAS, SPSS
|
||||
Cloud Platforms: AWS, Azure
|
||||
"""
|
||||
|
||||
|
||||
job_4 = """
|
||||
CV 4: Not Relevant
|
||||
Name: David Thompson
|
||||
Contact Information:
|
||||
|
||||
Email: david.thompson@example.com
|
||||
Phone: (555) 456-7890
|
||||
Summary:
|
||||
|
||||
Creative Graphic Designer with over 8 years of experience in visual design and branding. Proficient in Adobe Creative Suite and passionate about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
"""


job_5 = """
CV 5: Not Relevant
Name: Jessica Miller
Contact Information:

Email: jessica.miller@example.com
Phone: (555) 567-8901
Summary:

Experienced Sales Manager with a strong track record in driving sales growth and building high-performing teams. Excellent communication and leadership skills.

Education:

B.A. in Business Administration, University of Southern California (2010)
Experience:

Sales Manager, Global Enterprises (2015 – Present)
Managed a sales team of 15 members, achieving a 20% increase in annual revenue.
Developed sales strategies that expanded customer base by 25%.
Sales Representative, Market Leaders Inc. (2010 – 2015)
Consistently exceeded sales targets and received the 'Top Salesperson' award in 2013.
Skills:

Sales Strategy and Planning
Team Leadership and Development
CRM Software: Salesforce, Zoho
Negotiation and Relationship Building
"""


async def main(enable_steps):
    # Step 1: Reset data and system state
    if enable_steps.get("prune_data"):
        await cognee.prune.prune_data()
        print("Data pruned.")

    if enable_steps.get("prune_system"):
        await cognee.prune.prune_system(metadata=True)
        print("System pruned.")

    # Step 2: Add text
    if enable_steps.get("add_text"):
        text_list = [job_1, job_2, job_3, job_4, job_5]
        for text in text_list:
            await cognee.add(text)
            print(f"Added text: {text[:35]}...")

    # Step 3: Create knowledge graph
    if enable_steps.get("cognify"):
        await cognee.cognify()
        print("Knowledge graph created.")

    # Step 4: Query insights
    if enable_steps.get("retriever"):
        search_results = await cognee.search(
            query_type=SearchType.GRAPH_COMPLETION, query_text="Who has experience in design tools?"
        )
        print(search_results)


if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)

    rebuild_kg = True
    retrieve = True
    steps_to_enable = {
        "prune_data": rebuild_kg,
        "prune_system": rebuild_kg,
        "add_text": rebuild_kg,
        "cognify": rebuild_kg,
        "graph_metrics": rebuild_kg,
        "retriever": retrieve,
    }

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main(steps_to_enable))
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
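The `steps_to_enable` mapping above gates each pipeline stage on a boolean flag, so a rebuilt graph can be reused across runs. The same gating pattern in isolation (a toy sketch with hypothetical names, independent of cognee):

```python
def run_enabled_steps(steps, enable_steps):
    """Run the named steps in order, skipping any whose flag is falsy.

    steps: list of (name, zero-argument callable) pairs.
    enable_steps: dict mapping step name -> bool, as in the example above.
    Returns the list of step names that actually ran.
    """
    executed = []
    for name, func in steps:
        if enable_steps.get(name):
            func()
            executed.append(name)
    return executed


log = []
steps = [
    ("prune_data", lambda: log.append("pruned")),
    ("add_text", lambda: log.append("added")),
    ("retriever", lambda: log.append("queried")),
]
# Rebuild is disabled, retrieval enabled -- only "retriever" runs.
ran = run_enabled_steps(steps, {"prune_data": False, "retriever": True})
```

Missing keys fall through `dict.get` as `None`, so a step omitted from the mapping is simply skipped.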
@@ -0,0 +1,110 @@
import asyncio
import pathlib
import os

import cognee
from cognee import memify
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.shared.logging_utils import setup_logging, ERROR
from cognee.modules.pipelines.tasks.task import Task
from cognee.tasks.memify.extract_subgraph_chunks import extract_subgraph_chunks
from cognee.tasks.codingagents.coding_rule_associations import add_rule_associations

# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
#    LLM_API_KEY = "your_key_here"


async def main():
    # Create a clean slate for cognee -- reset data and system state
    print("Resetting cognee data...")
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)
    print("Data reset complete.\n")
    print("Adding conversation about rules to cognee:\n")

    coding_rules_chat_from_principal_engineer = """
    We want code to be formatted by PEP8 standards.
    Typing and Docstrings must be added.
    Please also make sure to write NOTE: on all more complex code segments.
    If there is any duplicate code, try to handle it in one function to avoid code duplication.
    Susan should also always review new code changes before merging to main.
    New releases should not happen on Friday so we don't have to fix them during the weekend.
    """
    print(
        f"Coding rules conversation with principal engineer: {coding_rules_chat_from_principal_engineer}"
    )

    coding_rules_chat_from_manager = """
    Susan should always review new code changes before merging to main.
    New releases should not happen on Friday so we don't have to fix them during the weekend.
    """
    print(f"Coding rules conversation with manager: {coding_rules_chat_from_manager}")

    # Add the text, and make it available for cognify
    await cognee.add([coding_rules_chat_from_principal_engineer, coding_rules_chat_from_manager])
    print("Text added successfully.\n")

    # Use LLMs and cognee to create knowledge graph
    await cognee.cognify()
    print("Cognify process complete.\n")

    # Visualize graph after cognification
    file_path = os.path.join(
        pathlib.Path(__file__).parent, ".artifacts", "graph_visualization_only_cognify.html"
    )
    await visualize_graph(file_path)
    print(f"Open file to see graph visualization only after cognification: {file_path}\n")

    # After the graph is created, run a second pipeline that goes through the graph and
    # enhances it with specific coding rule nodes.

    # extract_subgraph_chunks returns all document chunks from the specified subgraphs
    # (if no subgraph is specified, the whole graph is sent through memify).
    subgraph_extraction_tasks = [Task(extract_subgraph_chunks)]

    # add_rule_associations processes coding rules from chunks and keeps track of
    # existing rules so duplicate rules won't be created. As a result, new Rule nodes
    # are created in the graph for the coding rules found in conversations.
    coding_rules_association_tasks = [
        Task(
            add_rule_associations,
            rules_nodeset_name="coding_agent_rules",
            task_config={"batch_size": 1},
        ),
    ]

    # memify accepts these tasks and orchestrates forwarding of graph data through them
    # (if no data is specified). If data is explicitly passed in the arguments, that data
    # is forwarded through the tasks instead.
    await memify(
        extraction_tasks=subgraph_extraction_tasks,
        enrichment_tasks=coding_rules_association_tasks,
    )

    # Find the new coding rules added to the graph through memify
    # (created from the chat conversations between team members).
    coding_rules = await cognee.search(
        query_text="List me the coding rules",
        query_type=cognee.SearchType.CODING_RULES,
        node_name=["coding_agent_rules"],
    )

    print("Coding rules created by memify:")
    for coding_rule in coding_rules:
        print("- " + coding_rule)

    # Visualize new graph with added memify context
    file_path = os.path.join(
        pathlib.Path(__file__).parent, ".artifacts", "graph_visualization_after_memify.html"
    )
    await visualize_graph(file_path)
    print(f"\nOpen file to see graph visualization after memify enhancement: {file_path}")


if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
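`add_rule_associations` keeps track of existing rules so duplicates are not re-created — note that the manager's chat repeats two of the principal engineer's rules. The deduplication idea in isolation (a toy sketch, not the cognee implementation, which uses an LLM over graph chunks):

```python
def merge_rules(existing, new_rules):
    """Append only rules not already present, comparing case-insensitively."""
    seen = {rule.strip().lower() for rule in existing}
    merged = list(existing)
    for rule in new_rules:
        key = rule.strip().lower()
        if key not in seen:
            merged.append(rule)
            seen.add(key)
    return merged


rules = merge_rules(
    ["Susan should always review new code changes before merging to main."],
    [
        "Susan should always review new code changes before merging to main.",
        "New releases should not happen on Friday.",
    ],
)
```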
@@ -0,0 +1,39 @@
[
    {
        "name": "TechNova Inc.",
        "departments": [
            "Engineering",
            "Marketing"
        ]
    },
    {
        "name": "GreenFuture Solutions",
        "departments": [
            "Research & Development",
            "Sales",
            "Customer Support"
        ]
    },
    {
        "name": "Skyline Financials",
        "departments": [
            "Accounting"
        ]
    },
    {
        "name": "MediCare Plus",
        "departments": [
            "Healthcare",
            "Administration"
        ]
    },
    {
        "name": "NextGen Robotics",
        "departments": [
            "AI Development",
            "Manufacturing",
            "HR"
        ]
    }
]
@@ -0,0 +1,53 @@
[
    {
        "name": "John Doe",
        "company": "TechNova Inc.",
        "department": "Engineering"
    },
    {
        "name": "Jane Smith",
        "company": "TechNova Inc.",
        "department": "Marketing"
    },
    {
        "name": "Alice Johnson",
        "company": "GreenFuture Solutions",
        "department": "Sales"
    },
    {
        "name": "Bob Williams",
        "company": "GreenFuture Solutions",
        "department": "Customer Support"
    },
    {
        "name": "Michael Brown",
        "company": "Skyline Financials",
        "department": "Accounting"
    },
    {
        "name": "Emily Davis",
        "company": "MediCare Plus",
        "department": "Healthcare"
    },
    {
        "name": "David Wilson",
        "company": "MediCare Plus",
        "department": "Administration"
    },
    {
        "name": "Emma Thompson",
        "company": "NextGen Robotics",
        "department": "AI Development"
    },
    {
        "name": "Chris Martin",
        "company": "NextGen Robotics",
        "department": "Manufacturing"
    },
    {
        "name": "Sophia White",
        "company": "NextGen Robotics",
        "department": "HR"
    }
]
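The two data files are linked by company name: each person record names the company and department it belongs to. The join itself can be sketched standalone (hypothetical helper, independent of the pipeline code that consumes these files):

```python
def employees_by_company(companies, people):
    """Group person names under each company from companies.json."""
    result = {company["name"]: [] for company in companies}
    for person in people:
        # setdefault tolerates people whose company is not listed
        result.setdefault(person["company"], []).append(person["name"])
    return result


grouped = employees_by_company(
    [{"name": "TechNova Inc.", "departments": ["Engineering", "Marketing"]}],
    [
        {"name": "John Doe", "company": "TechNova Inc.", "department": "Engineering"},
        {"name": "Jane Smith", "company": "TechNova Inc.", "department": "Marketing"},
    ],
)
```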
@@ -0,0 +1,119 @@
import os
import json
import asyncio
from typing import List, Any
from cognee import prune
from cognee import visualize_graph
from cognee.low_level import setup, DataPoint
from cognee.modules.data.methods import load_or_create_datasets
from cognee.modules.users.methods import get_default_user
from cognee.pipelines import run_tasks, Task
from cognee.tasks.storage import add_data_points


class Person(DataPoint):
    name: str
    # Metadata "index_fields" specifies which DataPoint fields should be embedded for vector search
    metadata: dict = {"index_fields": ["name"]}


class Department(DataPoint):
    name: str
    employees: list[Person]
    # Metadata "index_fields" specifies which DataPoint fields should be embedded for vector search
    metadata: dict = {"index_fields": ["name"]}


class CompanyType(DataPoint):
    name: str = "Company"
    # Metadata "index_fields" specifies which DataPoint fields should be embedded for vector search
    metadata: dict = {"index_fields": ["name"]}


class Company(DataPoint):
    name: str
    departments: list[Department]
    is_type: CompanyType
    # Metadata "index_fields" specifies which DataPoint fields should be embedded for vector search
    metadata: dict = {"index_fields": ["name"]}


def ingest_files(data: List[Any]):
    people_data_points = {}
    departments_data_points = {}
    companies_data_points = {}

    for data_item in data:
        people = data_item["people"]
        companies = data_item["companies"]

        for person in people:
            new_person = Person(name=person["name"])
            people_data_points[person["name"]] = new_person

            if person["department"] not in departments_data_points:
                departments_data_points[person["department"]] = Department(
                    name=person["department"], employees=[new_person]
                )
            else:
                departments_data_points[person["department"]].employees.append(new_person)

        # Create a single CompanyType node, so we connect all companies to it.
        companyType = CompanyType()

        for company in companies:
            new_company = Company(name=company["name"], departments=[], is_type=companyType)
            companies_data_points[company["name"]] = new_company

            for department_name in company["departments"]:
                if department_name not in departments_data_points:
                    departments_data_points[department_name] = Department(
                        name=department_name, employees=[]
                    )

                new_company.departments.append(departments_data_points[department_name])

    return list(companies_data_points.values())


async def main():
    await prune.prune_data()
    await prune.prune_system(metadata=True)

    # Create relational database tables
    await setup()

    # If no user is provided, use the default user
    user = await get_default_user()

    # Create a dataset object to keep track of pipeline status
    datasets = await load_or_create_datasets(["test_dataset"], [], user)

    # Prepare data for the pipeline
    companies_file_path = os.path.join(os.path.dirname(__file__), "data", "companies.json")
    companies = json.loads(open(companies_file_path, "r").read())
    people_file_path = os.path.join(os.path.dirname(__file__), "data", "people.json")
    people = json.loads(open(people_file_path, "r").read())

    # run_tasks expects a list of data, even if it is just one document
    data = [{"companies": companies, "people": people}]

    pipeline = run_tasks(
        [Task(ingest_files), Task(add_data_points)],
        dataset_id=datasets[0].id,
        data=data,
        incremental_loading=False,
    )

    async for status in pipeline:
        print(status)

    # Or use our simple graph preview
    graph_file_path = str(
        os.path.join(os.path.dirname(__file__), ".artifacts/graph_visualization.html")
    )
    await visualize_graph(graph_file_path)


if __name__ == "__main__":
    asyncio.run(main())
@@ -0,0 +1,266 @@
"""Cognee demo with simplified structure."""

from __future__ import annotations

import asyncio
import json
import logging
from collections import defaultdict
from pathlib import Path
from typing import Any, Iterable, List, Mapping

from cognee import config, prune, search, SearchType, visualize_graph
from cognee.low_level import setup, DataPoint
from cognee.pipelines import run_tasks, Task
from cognee.tasks.storage import add_data_points
from cognee.tasks.storage.index_graph_edges import index_graph_edges
from cognee.modules.users.methods import get_default_user
from cognee.modules.data.methods import load_or_create_datasets


class Person(DataPoint):
    """Represent a person."""

    name: str
    metadata: dict = {"index_fields": ["name"]}


class Department(DataPoint):
    """Represent a department."""

    name: str
    employees: list[Person]
    metadata: dict = {"index_fields": ["name"]}


class CompanyType(DataPoint):
    """Represent a company type."""

    name: str = "Company"


class Company(DataPoint):
    """Represent a company."""

    name: str
    departments: list[Department]
    is_type: CompanyType
    metadata: dict = {"index_fields": ["name"]}


ROOT = Path(__file__).resolve().parent
DATA_DIR = ROOT / "data"
COGNEE_DIR = ROOT / ".cognee_system"
ARTIFACTS_DIR = ROOT / ".artifacts"
GRAPH_HTML = ARTIFACTS_DIR / "graph_visualization.html"
COMPANIES_JSON = DATA_DIR / "companies.json"
PEOPLE_JSON = DATA_DIR / "people.json"


def load_json_file(path: Path) -> Any:
    """Load a JSON file."""
    if not path.exists():
        raise FileNotFoundError(f"Missing required file: {path}")
    return json.loads(path.read_text(encoding="utf-8"))


def remove_duplicates_preserve_order(seq: Iterable[Any]) -> list[Any]:
    """Return a list with duplicates removed while preserving order."""
    seen = set()
    out = []
    for x in seq:
        if x in seen:
            continue
        seen.add(x)
        out.append(x)
    return out


def collect_people(payloads: Iterable[Mapping[str, Any]]) -> list[Mapping[str, Any]]:
    """Collect people from payloads."""
    people = [person for payload in payloads for person in payload.get("people", [])]
    return people


def collect_companies(payloads: Iterable[Mapping[str, Any]]) -> list[Mapping[str, Any]]:
    """Collect companies from payloads."""
    companies = [company for payload in payloads for company in payload.get("companies", [])]
    return companies


def build_people_nodes(people: Iterable[Mapping[str, Any]]) -> dict:
    """Build person nodes keyed by name."""
    nodes = {p["name"]: Person(name=p["name"]) for p in people if p.get("name")}
    return nodes


def group_people_by_department(people: Iterable[Mapping[str, Any]]) -> dict:
    """Group person names by department."""
    groups = defaultdict(list)
    for person in people:
        name = person.get("name")
        if not name:
            continue
        dept = person.get("department", "Unknown")
        groups[dept].append(name)
    return groups


def collect_declared_departments(
    groups: Mapping[str, list[str]], companies: Iterable[Mapping[str, Any]]
) -> set:
    """Collect department names referenced anywhere."""
    names = set(groups)
    for company in companies:
        for dept in company.get("departments", []):
            names.add(dept)
    return names


def build_department_nodes(dept_names: Iterable[str]) -> dict:
    """Build department nodes keyed by name."""
    nodes = {name: Department(name=name, employees=[]) for name in dept_names}
    return nodes


def build_company_nodes(companies: Iterable[Mapping[str, Any]], company_type: CompanyType) -> dict:
    """Build company nodes keyed by name."""
    nodes = {
        c["name"]: Company(name=c["name"], departments=[], is_type=company_type)
        for c in companies
        if c.get("name")
    }
    return nodes


def iterate_company_department_pairs(companies: Iterable[Mapping[str, Any]]):
    """Yield (company_name, department_name) pairs."""
    for company in companies:
        comp_name = company.get("name")
        if not comp_name:
            continue
        for dept in company.get("departments", []):
            yield comp_name, dept


def attach_departments_to_companies(
    companies: Iterable[Mapping[str, Any]],
    dept_nodes: Mapping[str, Department],
    company_nodes: Mapping[str, Company],
) -> None:
    """Attach department nodes to companies."""
    for comp_name in company_nodes:
        company_nodes[comp_name].departments = []
    for comp_name, dept_name in iterate_company_department_pairs(companies):
        dept = dept_nodes.get(dept_name)
        company = company_nodes.get(comp_name)
        if not dept or not company:
            continue
        company.departments.append(dept)


def attach_employees_to_departments(
    groups: Mapping[str, list[str]],
    people_nodes: Mapping[str, Person],
    dept_nodes: Mapping[str, Department],
) -> None:
    """Attach employees to departments."""
    for dept in dept_nodes.values():
        dept.employees = []
    for dept_name, names in groups.items():
        unique_names = remove_duplicates_preserve_order(names)
        target = dept_nodes.get(dept_name)
        if not target:
            continue
        employees = [people_nodes[n] for n in unique_names if n in people_nodes]
        target.employees = employees


def build_companies(payloads: Iterable[Mapping[str, Any]]) -> list[Company]:
    """Build company nodes from payloads."""
    people = collect_people(payloads)
    companies = collect_companies(payloads)
    people_nodes = build_people_nodes(people)
    groups = group_people_by_department(people)
    dept_names = collect_declared_departments(groups, companies)
    dept_nodes = build_department_nodes(dept_names)
    company_type = CompanyType()
    company_nodes = build_company_nodes(companies, company_type)
    attach_departments_to_companies(companies, dept_nodes, company_nodes)
    attach_employees_to_departments(groups, people_nodes, dept_nodes)
    result = list(company_nodes.values())
    return result


def load_default_payload() -> list[Mapping[str, Any]]:
    """Load the default payload from data files."""
    companies = load_json_file(COMPANIES_JSON)
    people = load_json_file(PEOPLE_JSON)
    payload = [{"companies": companies, "people": people}]
    return payload


def ingest_payloads(data: List[Any] | None) -> list[Company]:
    """Ingest payloads and build company nodes."""
    if not data or data == [None]:
        data = load_default_payload()
    companies = build_companies(data)
    return companies


async def execute_pipeline() -> None:
    """Execute the Cognee pipeline."""

    # Configure system paths
    logging.info("Configuring Cognee directories at %s", COGNEE_DIR)
    config.system_root_directory(str(COGNEE_DIR))
    ARTIFACTS_DIR.mkdir(parents=True, exist_ok=True)

    # Reset state and initialize
    await prune.prune_system(metadata=True)
    await setup()

    # Get user and dataset
    user = await get_default_user()
    datasets = await load_or_create_datasets(["demo_dataset"], [], user)
    dataset_id = datasets[0].id

    # Build and run pipeline
    tasks = [Task(ingest_payloads), Task(add_data_points)]
    pipeline = run_tasks(tasks, dataset_id, None, user, "demo_pipeline")
    async for status in pipeline:
        logging.info("Pipeline status: %s", status)

    # Post-process: index graph edges and visualize
    await index_graph_edges()
    await visualize_graph(str(GRAPH_HTML))

    # Run query against graph
    completion = await search(
        query_text="Who works for GreenFuture Solutions?",
        query_type=SearchType.GRAPH_COMPLETION,
    )
    result = completion
    logging.info("Graph completion result: %s", result)


def configure_logging() -> None:
    """Configure logging."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s | %(levelname)s | %(message)s",
    )


async def main() -> None:
    """Run the main function."""
    configure_logging()
    try:
        await execute_pipeline()
    except Exception:
        logging.exception("Run failed")
        raise


if __name__ == "__main__":
    asyncio.run(main())
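The pure helpers in this demo are testable without cognee; for instance, the department-grouping step, redefined here so the snippet stands alone:

```python
from collections import defaultdict


def group_people_by_department(people):
    """Group person names by department (same logic as the demo helper)."""
    groups = defaultdict(list)
    for person in people:
        name = person.get("name")
        if not name:
            continue
        groups[person.get("department", "Unknown")].append(name)
    return groups


groups = group_people_by_department(
    [
        {"name": "Alice Johnson", "department": "Sales"},
        {"name": "Bob Williams", "department": "Customer Support"},
        {"department": "Sales"},  # skipped: no name
    ]
)
```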
@@ -0,0 +1,109 @@
[{
    "id": "customer_1",
    "name": "John Doe",
    "preferences": [{
        "id": "preference_1",
        "name": "ShoeSize",
        "value": "40.5"
    }, {
        "id": "preference_2",
        "name": "Color",
        "value": "Navy Blue"
    }, {
        "id": "preference_3",
        "name": "Color",
        "value": "White"
    }, {
        "id": "preference_4",
        "name": "ShoeType",
        "value": "Regular Sneakers"
    }],
    "products": [{
        "id": "product_1",
        "name": "Sneakers",
        "price": 79.99,
        "colors": ["Blue", "Brown"],
        "type": "Regular Sneakers",
        "action": "purchased"
    }, {
        "id": "product_2",
        "name": "Shirt",
        "price": 19.99,
        "colors": ["Black"],
        "type": "T-Shirt",
        "action": "liked"
    }, {
        "id": "product_3",
        "name": "Jacket",
        "price": 59.99,
        "colors": ["Gray", "White"],
        "type": "Jacket",
        "action": "purchased"
    }, {
        "id": "product_4",
        "name": "Shoes",
        "price": 129.99,
        "colors": ["Red", "Black"],
        "type": "Formal Shoes",
        "action": "liked"
    }]
}, {
    "id": "customer_2",
    "name": "Jane Smith",
    "preferences": [{
        "id": "preference_5",
        "name": "ShoeSize",
        "value": "38.5"
    }, {
        "id": "preference_6",
        "name": "Color",
        "value": "Black"
    }, {
        "id": "preference_7",
        "name": "ShoeType",
        "value": "Slip-On"
    }],
    "products": [{
        "id": "product_5",
        "name": "Sneakers",
        "price": 69.99,
        "colors": ["Blue", "White"],
        "type": "Slip-On",
        "action": "purchased"
    }, {
        "id": "product_6",
        "name": "Shirt",
        "price": 14.99,
        "colors": ["Red", "Blue"],
        "type": "T-Shirt",
        "action": "purchased"
    }, {
        "id": "product_7",
        "name": "Jacket",
        "price": 49.99,
        "colors": ["Gray", "Black"],
        "type": "Jacket",
        "action": "liked"
    }]
}, {
    "id": "customer_3",
    "name": "Michael Johnson",
    "preferences": [{
        "id": "preference_8",
        "name": "Color",
        "value": "Red"
    }, {
        "id": "preference_9",
        "name": "ShoeType",
        "value": "Boots"
    }],
    "products": [{
        "id": "product_8",
        "name": "Cowboy Boots",
        "price": 299.99,
        "colors": ["Red", "White"],
        "type": "Cowboy Boots",
        "action": "purchased"
    }]
}]
@@ -0,0 +1,193 @@
import os
import json
import asyncio
from neo4j import exceptions

from cognee import prune

# from cognee import visualize_graph
from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.low_level import setup, DataPoint
from cognee.pipelines import run_tasks, Task
from cognee.tasks.storage import add_data_points


class Products(DataPoint):
    name: str = "Products"


products_aggregator_node = Products()


class Product(DataPoint):
    id: str
    name: str
    type: str
    price: float
    colors: list[str]
    is_type: Products = products_aggregator_node


class Preferences(DataPoint):
    name: str = "Preferences"


preferences_aggregator_node = Preferences()


class Preference(DataPoint):
    id: str
    name: str
    value: str
    is_type: Preferences = preferences_aggregator_node


class Customers(DataPoint):
    name: str = "Customers"


customers_aggregator_node = Customers()


class Customer(DataPoint):
    id: str
    name: str
    has_preference: list[Preference]
    purchased: list[Product]
    liked: list[Product]
    is_type: Customers = customers_aggregator_node


def ingest_files():
    customers_file_path = os.path.join(os.path.dirname(__file__), "data/customers.json")
    customers = json.loads(open(customers_file_path, "r").read())

    customers_data_points = {}
    products_data_points = {}
    preferences_data_points = {}

    for customer in customers:
        new_customer = Customer(
            id=customer["id"],
            name=customer["name"],
            liked=[],
            purchased=[],
            has_preference=[],
        )
        customers_data_points[customer["name"]] = new_customer

        for product in customer["products"]:
            if product["id"] not in products_data_points:
                products_data_points[product["id"]] = Product(
                    id=product["id"],
                    type=product["type"],
                    name=product["name"],
                    price=product["price"],
                    colors=product["colors"],
                )

            new_product = products_data_points[product["id"]]

            if product["action"] == "purchased":
                new_customer.purchased.append(new_product)
            elif product["action"] == "liked":
                new_customer.liked.append(new_product)

        for preference in customer["preferences"]:
            if preference["id"] not in preferences_data_points:
                preferences_data_points[preference["id"]] = Preference(
                    id=preference["id"],
                    name=preference["name"],
                    value=preference["value"],
                )

            new_preference = preferences_data_points[preference["id"]]
            new_customer.has_preference.append(new_preference)

    return customers_data_points.values()


async def main():
    await prune.prune_data()
    await prune.prune_system(metadata=True)

    await setup()

    pipeline = run_tasks([Task(ingest_files), Task(add_data_points)])

    async for status in pipeline:
        print(status)

    graph_engine = await get_graph_engine()

    products_results = await graph_engine.query(
        """
        // Step 1: Use the new customer's preferences from input
        UNWIND $preferences AS pref_input

        // Step 2: Find other customers who have these preferences
        MATCH (other_customer:Customer)-[:has_preference]->(preference:Preference)
        WHERE preference.value = pref_input

        WITH other_customer, count(preference) AS similarity_score

        // Step 3: Limit to the top-N most similar customers
        ORDER BY similarity_score DESC
        LIMIT 5

        // Step 4: Get products that these similar customers have purchased
        MATCH (other_customer)-[:purchased]->(product:Product)

        // Step 5: Rank products based on frequency
        RETURN product, count(*) AS recommendation_score
        ORDER BY recommendation_score DESC
        LIMIT 10
        """,
        {
            "preferences": ["White", "Navy Blue", "Regular Sneakers"],
        },
    )

    print("Top 10 recommended products:")
    for result in products_results:
        print(f"{result['product']['id']}: {result['product']['name']}")

    try:
        await graph_engine.query(
            """
            // Match the customer and their stored shoe size preference
            MATCH (customer:Customer {id: $customer_id})
            OPTIONAL MATCH (customer)-[:has_preference]->(preference:Preference {name: 'ShoeSize'})

            // The new shoe size is passed as the parameter $new_size
            WITH customer, preference, $new_size AS new_size

            // If a stored preference exists and it does not match the new value,
            // raise an error using APOC's utility procedure.
            CALL apoc.util.validate(
                preference IS NOT NULL AND preference.value <> new_size,
                "Conflicting shoe size preference: existing size is " + preference.value + " and new size is " + new_size,
                []
            )

            // If there is no conflict, continue with the update or further processing
            // ...
            RETURN customer
            """,
            {
                "customer_id": "customer_1",
                "new_size": "42",
            },
        )
    except exceptions.ClientError as error:
        print(f"Anomaly detected: {str(error.message)}")

    # # Or use our simple graph preview
    # graph_file_path = str(
    #     os.path.join(os.path.dirname(__file__), ".artifacts/graph_visualization.html")
    # )
    # await visualize_graph(graph_file_path)


if __name__ == "__main__":
    asyncio.run(main())
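The Cypher query above ranks products purchased by the customers whose preferences best overlap the input values. The same ranking logic in plain Python over the customers.json shape (a sketch of the algorithm, not the graph query itself):

```python
from collections import Counter


def recommend(customers, preference_values, top_customers=5, top_products=10):
    """Rank products purchased by the customers most similar in preferences."""
    # Steps 1-3: score customers by overlapping preference values, keep top-N.
    scores = {}
    for customer in customers:
        values = {pref["value"] for pref in customer["preferences"]}
        overlap = len(values & set(preference_values))
        if overlap:
            scores[customer["id"]] = overlap
    top = sorted(scores, key=scores.get, reverse=True)[:top_customers]

    # Steps 4-5: count purchases among those customers and rank by frequency.
    by_id = {customer["id"]: customer for customer in customers}
    counts = Counter()
    for customer_id in top:
        for product in by_id[customer_id]["products"]:
            if product["action"] == "purchased":
                counts[product["name"]] += 1
    return [name for name, _ in counts.most_common(top_products)]


customers = [
    {
        "id": "customer_2",
        "preferences": [{"value": "Black"}, {"value": "Slip-On"}],
        "products": [{"name": "Sneakers", "action": "purchased"}],
    },
    {
        "id": "customer_3",
        "preferences": [{"value": "Red"}],
        "products": [{"name": "Cowboy Boots", "action": "purchased"}],
    },
]
recommended = recommend(customers, ["Black"])
```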
@@ -0,0 +1,110 @@
from pathlib import Path
import asyncio
import os

import cognee
from cognee.infrastructure.databases.relational.config import get_migration_config
from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.infrastructure.databases.relational import (
    get_migration_relational_engine,
)
from cognee.modules.search.types import SearchType
from cognee.infrastructure.databases.relational import (
    create_db_and_tables as create_relational_db_and_tables,
)
from cognee.infrastructure.databases.vector.pgvector import (
    create_db_and_tables as create_vector_db_and_tables,
)

# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
#    LLM_API_KEY = "your_key_here"
# 3. Fill in all relevant MIGRATION_DB settings for the database you want to migrate to graph / Cognee.

# NOTE: If you don't have a DB you want to migrate, you can try it out with our
# test database at the following location:
#    MIGRATION_DB_PATH="/{path_to_your_local_cognee}/cognee/tests/test_data"
#    MIGRATION_DB_NAME="migration_database.sqlite"
#    MIGRATION_DB_PROVIDER="sqlite"


async def main():
    # Clean all data stored in Cognee
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Needed to create the appropriate database tables on the Cognee side only
    await create_relational_db_and_tables()
    await create_vector_db_and_tables()

    # In case environment variables are not set, use the example database from the Cognee repo
    migration_db_provider = os.environ.get("MIGRATION_DB_PROVIDER", "sqlite")
    migration_db_path = os.environ.get(
        "MIGRATION_DB_PATH",
        os.path.join(Path(__file__).resolve().parent.parent.parent, "cognee/tests/test_data"),
    )
    migration_db_name = os.environ.get("MIGRATION_DB_NAME", "migration_database.sqlite")

    migration_config = get_migration_config()
    migration_config.migration_db_provider = migration_db_provider
    migration_config.migration_db_path = migration_db_path
    migration_config.migration_db_name = migration_db_name

    engine = get_migration_relational_engine()

    print("\nExtracting schema of the database to migrate.")
    schema = await engine.extract_schema()
    print(f"Migrated database schema:\n{schema}")

    graph = await get_graph_engine()
    print("Migrating relational database to graph database based on schema.")
    from cognee.tasks.ingestion import migrate_relational_database

    await migrate_relational_database(graph, schema=schema)
    print("Relational database migration complete.")

    # Make sure to set top_k to a high value for a broader search; the default value is only 10!
    # top_k is the number of graph triplets supplied to the LLM to answer your question.
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text="What kind of data do you contain?",
|
||||
top_k=200,
|
||||
)
|
||||
print(f"Search results: {search_results}")
|
||||
|
||||
# Having a top_k value set to too high might overwhelm the LLM context when specific questions need to be answered.
|
||||
# For this kind of question we've set the top_k to 50
|
||||
search_results = await cognee.search(
|
||||
query_type=SearchType.GRAPH_COMPLETION,
|
||||
query_text="What invoices are related to Leonie Köhler?",
|
||||
top_k=50,
|
||||
)
|
||||
print(f"Search results: {search_results}")
|
||||
|
||||
search_results = await cognee.search(
|
||||
query_type=SearchType.GRAPH_COMPLETION,
|
||||
query_text="What invoices are related to Luís Gonçalves?",
|
||||
top_k=50,
|
||||
)
|
||||
print(f"Search results: {search_results}")
|
||||
|
||||
# If you check the relational database for this example you can see that the search results successfully found all
|
||||
# the invoices related to the two customers, without any hallucinations or additional information
|
||||
|
||||
# Define location where to store html visualization of graph of the migrated database
|
||||
home_dir = os.path.expanduser("~")
|
||||
destination_file_path = os.path.join(home_dir, "graph_visualization.html")
|
||||
print("Adding html visualization of graph database after migration.")
|
||||
await visualize_graph(destination_file_path)
|
||||
print(f"Visualization can be found at: {destination_file_path}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
loop = asyncio.new_event_loop()
|
||||
asyncio.set_event_loop(loop)
|
||||
try:
|
||||
loop.run_until_complete(main())
|
||||
finally:
|
||||
loop.run_until_complete(loop.shutdown_asyncgens())
|
||||
|
|
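The examples in this PR close their event loop through `loop.shutdown_asyncgens()`. A minimal standalone sketch of why that pattern matters, using only the standard library (no Cognee dependencies; `ticker` is an illustrative name):

```python
import asyncio


async def ticker():
    # An async generator; abandoning it mid-iteration leaves its cleanup pending.
    for i in range(10):
        yield i


async def main():
    gen = ticker()
    first = await gen.__anext__()  # consume one item, then abandon the generator
    return first


loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
    result = loop.run_until_complete(main())
finally:
    # Finalize any abandoned async generators before the loop goes away,
    # so their cleanup runs inside a live loop instead of raising warnings.
    loop.run_until_complete(loop.shutdown_asyncgens())
    loop.close()

print(result)
```

Plain `asyncio.run()` performs this shutdown automatically; the manual loop form is only needed when you want control over the loop's lifetime.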
@ -0,0 +1,98 @@
import asyncio

import cognee
from cognee import visualize_graph
from cognee.memify_pipelines.persist_sessions_in_knowledge_graph import (
    persist_sessions_in_knowledge_graph_pipeline,
)
from cognee.modules.search.types import SearchType
from cognee.modules.users.methods import get_default_user
from cognee.shared.logging_utils import get_logger

logger = get_logger("conversation_session_persistence_example")


async def main():
    # NOTE: Caching has to be enabled for this example to work
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    text_1 = "Cognee is a solution that can build a knowledge graph from text, creating an AI memory system"
    text_2 = "Germany is a country located next to the Netherlands"

    await cognee.add([text_1, text_2])
    await cognee.cognify()

    question = "What can I use to create a knowledge graph?"
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text=question,
    )
    print("\nSession ID: default_session")
    print(f"Question: {question}")
    print(f"Answer: {search_results}\n")

    question = "You sure about that?"
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text=question
    )
    print("\nSession ID: default_session")
    print(f"Question: {question}")
    print(f"Answer: {search_results}\n")

    question = "This is awesome!"
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text=question
    )
    print("\nSession ID: default_session")
    print(f"Question: {question}")
    print(f"Answer: {search_results}\n")

    question = "Where is Germany?"
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text=question,
        session_id="different_session",
    )
    print("\nSession ID: different_session")
    print(f"Question: {question}")
    print(f"Answer: {search_results}\n")

    question = "Next to which country again?"
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text=question,
        session_id="different_session",
    )
    print("\nSession ID: different_session")
    print(f"Question: {question}")
    print(f"Answer: {search_results}\n")

    question = "So you remember everything I asked from you?"
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text=question,
        session_id="different_session",
    )
    print("\nSession ID: different_session")
    print(f"Question: {question}")
    print(f"Answer: {search_results}\n")

    session_ids_to_persist = ["default_session", "different_session"]
    default_user = await get_default_user()

    await persist_sessions_in_knowledge_graph_pipeline(
        user=default_user,
        session_ids=session_ids_to_persist,
    )

    await visualize_graph()


if __name__ == "__main__":
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
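The example above keeps two conversations separate by passing different `session_id` values. A toy illustration of that session-scoping idea, using a plain dict (this is not Cognee's implementation, just the concept distilled):

```python
from collections import defaultdict

# Toy conversation store keyed by session ID. In the example above this role
# is played by cognee.search(..., session_id=...) with caching enabled.
history = defaultdict(list)


def ask(question, session_id="default_session"):
    # Each session only "remembers" its own prior turns.
    history[session_id].append(question)
    return list(history[session_id])


ask("What can I use to create a knowledge graph?")
ask("You sure about that?")
context = ask("Where is Germany?", session_id="different_session")

# The new session starts with no visibility into default_session's turns.
assert context == ["Where is Germany?"]
assert len(history["default_session"]) == 2
```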
@ -0,0 +1,6 @@
"""
Core Features Getting Started Example

Reference: https://colab.research.google.com/drive/12Vi9zID-M3fpKpKiaqDBvkk98ElkRPWy?usp=sharing
"""
@ -0,0 +1,87 @@
import os
import asyncio
import pathlib
from cognee import config, add, cognify, search, SearchType, prune, visualize_graph

from cognee.low_level import DataPoint


async def main():
    data_directory_path = str(
        pathlib.Path(os.path.join(pathlib.Path(__file__).parent, ".data_storage")).resolve()
    )
    # Set up the data directory. Cognee will store files here.
    config.data_root_directory(data_directory_path)

    cognee_directory_path = str(
        pathlib.Path(os.path.join(pathlib.Path(__file__).parent, ".cognee_system")).resolve()
    )
    # Set up the Cognee system directory. Cognee will store system files and databases here.
    config.system_root_directory(cognee_directory_path)

    # Prune data and system metadata before running, only if we want a "fresh" state.
    await prune.prune_data()
    await prune.prune_system(metadata=True)

    text = "The Python programming language is widely used in data analysis, web development, and machine learning."

    # Add the text data to Cognee.
    await add(text)

    # Define a custom graph model for programming languages.
    class FieldType(DataPoint):
        name: str = "Field"

    class Field(DataPoint):
        name: str
        is_type: FieldType
        metadata: dict = {"index_fields": ["name"]}

    class ProgrammingLanguageType(DataPoint):
        name: str = "Programming Language"

    class ProgrammingLanguage(DataPoint):
        name: str
        used_in: list[Field] = []
        is_type: ProgrammingLanguageType
        metadata: dict = {"index_fields": ["name"]}

    # Cognify the text data.
    await cognify(graph_model=ProgrammingLanguage)

    # Or use our simple graph preview
    graph_file_path = str(
        pathlib.Path(
            os.path.join(pathlib.Path(__file__).parent, ".artifacts/graph_visualization.html")
        ).resolve()
    )
    await visualize_graph(graph_file_path)

    # Completion query that uses graph data to form context.
    graph_completion = await search(
        query_text="What is Python?", query_type=SearchType.GRAPH_COMPLETION
    )
    print("Graph completion result is:")
    print(graph_completion)

    # Completion query that uses document chunks to form context.
    rag_completion = await search(
        query_text="What is Python?", query_type=SearchType.RAG_COMPLETION
    )
    print("Completion result is:")
    print(rag_completion)

    # Query all summaries related to the query.
    summaries = await search(query_text="Python", query_type=SearchType.SUMMARIES)
    print("Summary results are:")
    for summary in summaries:
        print(summary)

    chunks = await search(query_text="Python", query_type=SearchType.CHUNKS)
    print("Chunk results are:")
    for chunk in chunks:
        print(chunk)


if __name__ == "__main__":
    asyncio.run(main())
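The nested `DataPoint` classes above form a small typed schema whose relationship fields become graph edges. A rough, dependency-free analogue of that idea (plain dataclasses standing in for `DataPoint`; `UsageField` and the edge-tuple shape are illustrative, not Cognee's internals):

```python
from dataclasses import dataclass, field


@dataclass
class UsageField:
    name: str


@dataclass
class ProgrammingLanguage:
    name: str
    # A typed relationship field, analogous to `used_in: list[Field]` above.
    used_in: list = field(default_factory=list)


python = ProgrammingLanguage(
    name="Python",
    used_in=[
        UsageField("data analysis"),
        UsageField("web development"),
        UsageField("machine learning"),
    ],
)

# Each entry in a relationship field yields one edge of the graph.
edges = [(python.name, "used_in", f.name) for f in python.used_in]
print(edges)
```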
7
new-examples/demos/custom_prompt_guide.py
Normal file
@ -0,0 +1,7 @@
"""
Custom Prompt Example

Reference: https://docs.cognee.ai/guides/custom-prompts
"""
@ -0,0 +1,4 @@
"""
Direct LLM Call for Structured Output Example
Reference: https://docs.cognee.ai/guides/low-level-llm
"""
115
new-examples/demos/dynamic_multiple_weighted_edges_example.py
Normal file
@ -0,0 +1,115 @@
import asyncio
from os import path
from typing import Any
from pydantic import SkipValidation
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.infrastructure.engine import DataPoint
from cognee.infrastructure.engine.models.Edge import Edge
from cognee.tasks.storage import add_data_points
import cognee


class Employee(DataPoint):
    name: str
    role: str


class Company(DataPoint):
    name: str
    industry: str
    employs: SkipValidation[Any]  # Mixed list: employees with/without weights


async def main():
    # Clear the database for a clean state
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create employees
    michael = Employee(name="Michael", role="Regional Manager")
    dwight = Employee(name="Dwight", role="Assistant to the Regional Manager")
    jim = Employee(name="Jim", role="Sales Representative")
    pam = Employee(name="Pam", role="Receptionist")
    kevin = Employee(name="Kevin", role="Accountant")
    angela = Employee(name="Angela", role="Senior Accountant")
    oscar = Employee(name="Oscar", role="Accountant")
    stanley = Employee(name="Stanley", role="Sales Representative")
    phyllis = Employee(name="Phyllis", role="Sales Representative")

    # Create Dunder Mifflin with mixed employee relationships
    dunder_mifflin = Company(
        name="Dunder Mifflin Paper Company",
        industry="Paper Sales",
        employs=[
            # Manager with high authority weight
            (Edge(weight=0.9, relationship_type="manager"), michael),
            # Sales team with performance weights
            (
                Edge(weights={"sales_performance": 0.8, "loyalty": 0.9}, relationship_type="sales"),
                dwight,
            ),
            (
                Edge(
                    weights={"sales_performance": 0.7, "creativity": 0.8}, relationship_type="sales"
                ),
                jim,
            ),
            (
                Edge(
                    weights={"sales_performance": 0.6, "customer_service": 0.9},
                    relationship_type="sales",
                ),
                phyllis,
            ),
            (
                Edge(
                    weights={"sales_performance": 0.5, "experience": 0.8}, relationship_type="sales"
                ),
                stanley,
            ),
            # Accounting department as a group
            (
                Edge(
                    weights={"department_efficiency": 0.8, "team_cohesion": 0.9},
                    relationship_type="accounting",
                ),
                [oscar, kevin, angela],
            ),
            # Admin staff without weights (simple relationships)
            pam,
        ],
    )

    all_data_points = [
        michael,
        dwight,
        jim,
        pam,
        kevin,
        angela,
        oscar,
        stanley,
        phyllis,
        dunder_mifflin,
    ]

    # Add data points to the graph
    await add_data_points(all_data_points)

    # Visualize the graph
    graph_visualization_path = path.join(path.dirname(__file__), "dunder_mifflin_graph.html")
    await visualize_graph(graph_visualization_path)

    print("Dynamic multiple edges graph has been created and visualized!")
    print(f"Visualization saved to: {graph_visualization_path}")
    print("\nTechnical features demonstrated:")
    print("- Mixed list support: weighted and unweighted relationships in a single field")
    print("- Single-weight edges with relationship types")
    print("- Multiple-weight edges with custom metrics")
    print("- Group relationships: single edge connecting multiple nodes")
    print("- Simple relationships without edge metadata")
    print("- Flexible edge extraction from heterogeneous data structures")


if __name__ == "__main__":
    asyncio.run(main())
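The `employs` field above mixes bare nodes, `(Edge, node)` tuples, and `(Edge, [nodes])` groups. A small sketch of how such a heterogeneous list can be flattened into uniform `(relationship_type, node)` pairs — plain Python with dicts standing in for `Edge` objects, not Cognee's actual extractor:

```python
def flatten_relationships(items, default_type="employs"):
    """Normalize a mixed relationship list into (relationship_type, node) pairs."""
    pairs = []
    for item in items:
        if isinstance(item, tuple):
            edge, target = item
            # A group edge points at a list of nodes; fan it out.
            targets = target if isinstance(target, list) else [target]
            for node in targets:
                pairs.append((edge["relationship_type"], node))
        else:
            # Bare node without edge metadata gets the default relationship.
            pairs.append((default_type, item))
    return pairs


employs = [
    ({"relationship_type": "manager"}, "Michael"),
    ({"relationship_type": "sales"}, ["Jim", "Dwight"]),
    "Pam",  # simple relationship without edge metadata
]
print(flatten_relationships(employs))
# pairs: manager/Michael, sales/Jim, sales/Dwight, employs/Pam
```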
81
new-examples/demos/feedback_enrichment_minimal_example.py
Normal file
@ -0,0 +1,81 @@
import asyncio

import cognee
from cognee.api.v1.search import SearchType
from cognee.modules.pipelines.tasks.task import Task
from cognee.tasks.graph import extract_graph_from_data
from cognee.tasks.storage import add_data_points
from cognee.shared.data_models import KnowledgeGraph

from cognee.tasks.feedback.extract_feedback_interactions import extract_feedback_interactions
from cognee.tasks.feedback.generate_improved_answers import generate_improved_answers
from cognee.tasks.feedback.create_enrichments import create_enrichments
from cognee.tasks.feedback.link_enrichments_to_feedback import link_enrichments_to_feedback


CONVERSATION = [
    "Alice: Hey, Bob. Did you talk to Mallory?",
    "Bob: Yeah, I just saw her before coming here.",
    "Alice: Then she told you to bring my documents, right?",
    "Bob: Uh… not exactly. She said you wanted me to bring you donuts. Which sounded kind of odd…",
    "Alice: Ugh, she’s so annoying. Thanks for the donuts anyway!",
]


async def initialize_conversation_and_graph(conversation):
    """Prune data/system, add the conversation, cognify."""
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)
    await cognee.add(conversation)
    await cognee.cognify()


async def run_question_and_submit_feedback(question_text: str) -> bool:
    """Ask a question, submit feedback based on correctness, and return the correctness flag."""
    result = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text=question_text,
        save_interaction=True,
    )
    answer_text = str(result).lower()
    mentions_mallory = "mallory" in answer_text
    feedback_text = (
        "Great answers, very helpful!"
        if mentions_mallory
        else "The answer about Bob and donuts was wrong."
    )
    await cognee.search(
        query_type=SearchType.FEEDBACK,
        query_text=feedback_text,
        last_k=1,
    )
    return mentions_mallory


async def run_feedback_enrichment_memify(last_n: int = 5):
    """Execute memify with extraction, answer improvement, enrichment creation, and graph processing tasks."""
    # Instantiate tasks with their own kwargs
    extraction_tasks = [Task(extract_feedback_interactions, last_n=last_n)]
    enrichment_tasks = [
        Task(generate_improved_answers, top_k=20),
        Task(create_enrichments),
        Task(extract_graph_from_data, graph_model=KnowledgeGraph, task_config={"batch_size": 10}),
        Task(add_data_points, task_config={"batch_size": 10}),
        Task(link_enrichments_to_feedback),
    ]
    await cognee.memify(
        extraction_tasks=extraction_tasks,
        enrichment_tasks=enrichment_tasks,
        data=[{}],  # A placeholder to prevent fetching the entire graph
    )


async def main():
    await initialize_conversation_and_graph(CONVERSATION)
    is_correct = await run_question_and_submit_feedback("Who told Bob to bring the donuts?")
    if not is_correct:
        await run_feedback_enrichment_memify(last_n=5)


if __name__ == "__main__":
    asyncio.run(main())
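The correctness check above keys off whether the answer mentions Mallory, and only wrong answers trigger the enrichment pipeline. That routing logic, distilled into a standalone sketch (`feedback_for` is an illustrative name; the real flow submits the text through `cognee.search` with `SearchType.FEEDBACK`):

```python
def feedback_for(answer: str) -> tuple[bool, str]:
    """Return (is_correct, feedback_text) based on a simple content check."""
    is_correct = "mallory" in answer.lower()
    feedback = (
        "Great answers, very helpful!"
        if is_correct
        else "The answer about Bob and donuts was wrong."
    )
    return is_correct, feedback


ok, text = feedback_for("Bob brought the donuts because Mallory told him to.")
assert ok and text.startswith("Great")

ok, text = feedback_for("Alice asked Bob for donuts.")
assert not ok  # negative feedback; this is the branch that kicks off enrichment
```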
5
new-examples/demos/graph_visualization_example.py
Normal file
@ -0,0 +1,5 @@
"""
Graph Visualization Example

Reference: https://docs.cognee.ai/guides/graph-visualization
"""
BIN
new-examples/demos/multimedia_processing/data/example.png
Normal file
BIN
new-examples/demos/multimedia_processing/data/text_to_speech.mp3
Normal file
@ -0,0 +1,55 @@
import os
import asyncio
import pathlib
from cognee.shared.logging_utils import setup_logging, ERROR

import cognee
from cognee.api.v1.search import SearchType

# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
#    LLM_API_KEY = "your_key_here"


async def main():
    # Create a clean slate for cognee -- reset data and system state
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # The cognee knowledge graph will be created based on the text
    # and description of these files
    mp3_file_path = os.path.join(
        pathlib.Path(__file__).parent,
        "data/text_to_speech.mp3",
    )
    png_file_path = os.path.join(
        pathlib.Path(__file__).parent,
        "data/example.png",
    )

    # Add the files and make them available for cognify
    await cognee.add([mp3_file_path, png_file_path])

    # Use LLMs and cognee to create the knowledge graph
    await cognee.cognify()

    # Query cognee for summaries of the data in the multimedia files
    search_results = await cognee.search(
        query_type=SearchType.SUMMARIES,
        query_text="What is in the multimedia files?",
    )

    # Display search results
    for result_text in search_results:
        print(result_text)


if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
@ -0,0 +1,44 @@
import os
import asyncio
import cognee
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.shared.logging_utils import setup_logging, ERROR

text_a = """
AI is revolutionizing financial services through intelligent fraud detection
and automated customer service platforms.
"""

text_b = """
Advances in AI are enabling smarter systems that learn and adapt over time.
"""

text_c = """
MedTech startups have seen significant growth in recent years, driven by innovation
in digital health and medical devices.
"""

node_set_a = ["AI", "FinTech"]
node_set_b = ["AI"]
node_set_c = ["MedTech"]


async def main():
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    await cognee.add(text_a, node_set=node_set_a)
    await cognee.add(text_b, node_set=node_set_b)
    await cognee.add(text_c, node_set=node_set_c)
    await cognee.cognify()

    visualization_path = os.path.join(
        os.path.dirname(__file__), "./.artifacts/graph_visualization.html"
    )
    await visualize_graph(visualization_path)


if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)
    # `asyncio.run` manages its own event loop, so no manual loop setup is needed.
    asyncio.run(main())
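`node_set` tags each added document so related content can later be grouped or filtered together. A toy illustration of that tagging idea with a plain dict (not Cognee's node-set mechanics; the document names are the example's own):

```python
# Each document carries the node-set tags it was added with.
documents = {
    "text_a": ["AI", "FinTech"],
    "text_b": ["AI"],
    "text_c": ["MedTech"],
}


def docs_tagged(tag):
    # Return every document carrying the given node-set tag.
    return sorted(name for name, tags in documents.items() if tag in tags)


assert docs_tagged("AI") == ["text_a", "text_b"]
assert docs_tagged("MedTech") == ["text_c"]
```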
@ -0,0 +1,313 @@
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:ns1="http://example.org/ontology#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>
  <rdf:Description rdf:about="http://example.org/ontology#Type2Diabetes">
    <rdf:type rdf:resource="http://example.org/ontology#Disease"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>A chronic condition that affects how the body processes glucose.</rdfs:comment>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Obesity"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#PoorDiet"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#SedentaryLifestyle"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Genetics"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#WeightLoss"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#Exercise"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#ModerateCoffeeConsumption"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#IncreasedThirst"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#FrequentUrination"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#Fatigue"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#InsulinResistance">
    <rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#HighBloodSugar">
    <rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#hasPreventiveFactor">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
    <rdfs:domain rdf:resource="http://example.org/ontology#Disease"/>
    <rdfs:range rdf:resource="http://example.org/ontology#PreventiveFactor"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Hypertension">
    <rdf:type rdf:resource="http://example.org/ontology#Disease"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
    <rdfs:comment>A condition where the force of blood against artery walls is too high.</rdfs:comment>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#HighSaltIntake"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Obesity"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Stress"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Genetics"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#LowSodiumDiet"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#Exercise"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#ModerateCoffeeConsumption"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#Headache"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#Dizziness"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#BlurredVision"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Cancer">
    <rdf:type rdf:resource="http://example.org/ontology#Disease"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>A disease of abnormal cell growth with potential to invade or spread.</rdfs:comment>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Smoking"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Radiation"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Infections"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#GeneticMutations"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#Screening"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#HealthyDiet"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#ModerateCoffeeConsumption"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#UnexplainedWeightLoss"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#Fatigue"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#LumpFormation"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#hasRiskFactor">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
    <rdfs:domain rdf:resource="http://example.org/ontology#Disease"/>
    <rdfs:range rdf:resource="http://example.org/ontology#RiskFactor"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#LowSodiumDiet">
    <rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#MetabolicSyndrome">
    <rdf:type rdf:resource="http://example.org/ontology#Disease"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>A cluster of conditions increasing the risk of heart disease and diabetes.</rdfs:comment>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Obesity"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#InsulinResistance"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Hypertension"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#HealthyDiet"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#PhysicalActivity"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#GreenCoffeeBlend"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#IncreasedWaistCircumference"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#HighBloodSugar"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#Fatigue"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#ShortnessofBreath">
    <rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Disease">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
    <rdfs:comment>Disease is a concept used to classify relevant medical terms.</rdfs:comment>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#HeartDisease">
    <rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Screening">
    <rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#AtrialFibrillation">
    <rdf:type rdf:resource="http://example.org/ontology#Disease"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>An irregular and often rapid heart rhythm that may cause blood clots.</rdfs:comment>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#HeartDisease"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#HighBloodPressure"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#AlcoholUse"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#HealthyDiet"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#Exercise"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#ModerateCoffeeConsumption"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#Palpitations"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#Weakness"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#ShortnessofBreath"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Genetics">
    <rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#ChestPain">
    <rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#PhysicalActivity">
    <rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#CardiovascularDisease">
    <rdf:type rdf:resource="http://example.org/ontology#Disease"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>A class of diseases that involve the heart or blood vessels.</rdfs:comment>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Smoking"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#HighBloodPressure"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#HighCholesterol"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Diabetes"/>
    <ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Obesity"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#PhysicalActivity"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#MediterraneanDiet"/>
    <ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#ModerateCoffeeConsumption"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#ChestPain"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#ShortnessofBreath"/>
    <ns1:hasSymptom rdf:resource="http://example.org/ontology#Fatigue"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#HealthyDiet">
    <rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#HighSaltIntake">
    <rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Diabetes">
    <rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Palpitations">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Headache">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#BloodPressureControl">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#PreventiveFactor">
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
|
||||
<rdfs:comment>PreventiveFactor is a concept used to classify relevant medical terms.</rdfs:comment>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Symptom">
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
|
||||
<rdfs:comment>Symptom is a concept used to classify relevant medical terms.</rdfs:comment>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#MediterraneanDiet">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#HighBloodPressure">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#IncreasedThirst">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Swelling">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#BlurredVision">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#HeartFailure">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Disease"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
<rdfs:comment>A condition in which the heart is unable to pump sufficiently.</rdfs:comment>
|
||||
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#CoronaryArteryDisease"/>
|
||||
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Hypertension"/>
|
||||
<ns1:hasRiskFactor rdf:resource="http://example.org/ontology#Diabetes"/>
|
||||
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#BloodPressureControl"/>
|
||||
<ns1:hasPreventiveFactor rdf:resource="http://example.org/ontology#HealthyLifestyle"/>
|
||||
<ns1:hasSymptom rdf:resource="http://example.org/ontology#ShortnessofBreath"/>
|
||||
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Swelling"/>
|
||||
<ns1:hasSymptom rdf:resource="http://example.org/ontology#Fatigue"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#UnexplainedWeightLoss">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Fatigue">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Infections">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#IncreasedWaistCircumference">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#HealthyLifestyle">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#SedentaryLifestyle">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#GreenCoffeeBlend">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#ModerateCoffeeConsumption">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Obesity">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#HighCholesterol">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#GeneticMutations">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#AlcoholUse">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Dizziness">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Radiation">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#LumpFormation">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#PoorDiet">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Smoking">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Stress">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#hasSymptom">
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
|
||||
<rdfs:domain rdf:resource="http://example.org/ontology#Disease"/>
|
||||
<rdfs:range rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#CoronaryArteryDisease">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#RiskFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Exercise">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Weakness">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#WeightLoss">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#PreventiveFactor"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#FrequentUrination">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#Symptom"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#RiskFactor">
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
|
||||
<rdfs:comment>RiskFactor is a concept used to classify relevant medical terms.</rdfs:comment>
|
||||
</rdf:Description>
|
||||
</rdf:RDF>
|
||||
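The ontology above is ordinary RDF/XML, so its relations can be inspected without any triple-store tooling. A minimal sketch using only the standard library's `xml.etree.ElementTree`, run against a trimmed copy of the `AtrialFibrillation` entry (the `DOC` string and `risk_factors` helper are illustrative, not part of the example's API):

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ONT = "http://example.org/ontology#"

# Trimmed copy of the AtrialFibrillation description from the ontology above.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:ns1="{ONT}">
  <rdf:Description rdf:about="{ONT}AtrialFibrillation">
    <ns1:hasRiskFactor rdf:resource="{ONT}HighBloodPressure"/>
    <ns1:hasRiskFactor rdf:resource="{ONT}AlcoholUse"/>
  </rdf:Description>
</rdf:RDF>"""


def risk_factors(xml_text, subject):
    """Collect hasRiskFactor targets (local names) for one subject IRI."""
    root = ET.fromstring(xml_text)
    factors = []
    for desc in root.findall(f"{{{RDF}}}Description"):
        if desc.get(f"{{{RDF}}}about") != subject:
            continue
        for prop in desc.findall(f"{{{ONT}}}hasRiskFactor"):
            # rdf:resource holds the object IRI; keep the fragment only.
            factors.append(prop.get(f"{{{RDF}}}resource").split("#")[-1])
    return factors


print(risk_factors(DOC, ONT + "AtrialFibrillation"))
# -> ['HighBloodPressure', 'AlcoholUse']
```

The same traversal works for `hasPreventiveFactor` and `hasSymptom`, since every relation in the file follows the same `rdf:Description`/`rdf:resource` pattern.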
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,110 @@
import cognee
import asyncio
from cognee.shared.logging_utils import setup_logging
import os
import textwrap
from cognee.api.v1.search import SearchType
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.modules.ontology.rdf_xml.RDFLibOntologyResolver import RDFLibOntologyResolver
from cognee.modules.ontology.ontology_config import Config


async def run_pipeline(ontology_path=None):
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    scientific_papers_dir = os.path.join(
        os.path.dirname(os.path.abspath(__file__)), "data/scientific_papers/"
    )

    await cognee.add(scientific_papers_dir)

    config: Config = {
        "ontology_config": {
            "ontology_resolver": RDFLibOntologyResolver(ontology_file=ontology_path)
        }
    }

    pipeline_run = await cognee.cognify(config=config)

    return pipeline_run


async def query_pipeline(questions):
    answers = []
    for question in questions:
        search_results = await cognee.search(
            query_type=SearchType.GRAPH_COMPLETION,
            query_text=question,
        )
        answers.append(search_results)

    return answers


def print_comparison_table(questions, answers_with, answers_without, col_width=45):
    separator = "-" * (col_width * 3 + 6)

    header = f"{'Question'.ljust(col_width)} | {'WITH Ontology (owl grounded facts)'.ljust(col_width)} | {'WITHOUT Ontology'.ljust(col_width)}"
    logger.info(separator)
    logger.info(header)
    logger.info(separator)

    for q, with_o, without_o in zip(questions, answers_with, answers_without):
        q_wrapped = textwrap.fill(q, width=col_width)
        with_o_wrapped = textwrap.fill(str(with_o), width=col_width)
        without_o_wrapped = textwrap.fill(str(without_o), width=col_width)

        q_lines = q_wrapped.split("\n")
        with_lines = with_o_wrapped.split("\n")
        without_lines = without_o_wrapped.split("\n")

        max_lines = max(len(q_lines), len(with_lines), len(without_lines))

        for i in range(max_lines):
            q_line = q_lines[i] if i < len(q_lines) else ""
            with_line = with_lines[i] if i < len(with_lines) else ""
            without_line = without_lines[i] if i < len(without_lines) else ""
            logger.info(
                f"{q_line.ljust(col_width)} | {with_line.ljust(col_width)} | {without_line.ljust(col_width)}"
            )

    logger.info(separator)


async def main():
    questions = [
        "What are common risk factors for Type 2 Diabetes?",
        "What preventive measures reduce the risk of Hypertension?",
        "What symptoms indicate possible Cardiovascular Disease?",
        "I have blurred vision and a headache. What disease do I have?",
        "What diseases are associated with Obesity?",
    ]

    ontology_path = os.path.join(
        os.path.dirname(os.path.abspath(__file__)),
        "data/enriched_medical_ontology_with_classes.owl",
    )

    logger.info("\n--- Generating answers WITH ontology ---\n")
    await run_pipeline(ontology_path=ontology_path)
    answers_with_ontology = await query_pipeline(questions)

    logger.info("\n--- Generating answers WITHOUT ontology ---\n")
    await run_pipeline()
    answers_without_ontology = await query_pipeline(questions)

    print_comparison_table(questions, answers_with_ontology, answers_without_ontology)

    await visualize_graph()


if __name__ == "__main__":
    logger = setup_logging()

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
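The row alignment in `print_comparison_table` above relies on wrapping each cell with `textwrap.fill` and padding every wrapped line with `ljust` so ragged cells stay column-aligned. The same trick in isolation, as a small standalone sketch (the `two_columns` helper is hypothetical, using `itertools.zip_longest` instead of manual index bounds):

```python
import textwrap
from itertools import zip_longest


def two_columns(left, right, width=20):
    """Wrap two strings and pad them into aligned side-by-side columns."""
    left_lines = textwrap.fill(left, width=width).split("\n")
    right_lines = textwrap.fill(right, width=width).split("\n")
    rows = []
    # zip_longest pads the shorter column with empty cells.
    for l, r in zip_longest(left_lines, right_lines, fillvalue=""):
        rows.append(f"{l.ljust(width)} | {r.ljust(width)}")
    return rows


for row in two_columns("short answer", "a much longer answer that wraps"):
    print(row)
```

Without the `ljust` padding, a cell that wraps onto fewer lines than its neighbor would pull the separator out of alignment.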
@@ -0,0 +1,290 @@
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:ns1="http://example.org/ontology#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>
  <rdf:Description rdf:about="http://example.org/ontology#Volkswagen">
    <rdfs:comment>Created for making cars accessible to everyone.</rdfs:comment>
    <ns1:produces rdf:resource="http://example.org/ontology#VW_Golf"/>
    <ns1:produces rdf:resource="http://example.org/ontology#VW_ID4"/>
    <ns1:produces rdf:resource="http://example.org/ontology#VW_Touareg"/>
    <rdf:type rdf:resource="http://example.org/ontology#CarManufacturer"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Azure">
    <rdf:type rdf:resource="http://example.org/ontology#CloudServiceProvider"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Porsche">
    <ns1:produces rdf:resource="http://example.org/ontology#Porsche_Cayenne"/>
    <ns1:produces rdf:resource="http://example.org/ontology#Porsche_Taycan"/>
    <ns1:produces rdf:resource="http://example.org/ontology#Porsche_911"/>
    <rdf:type rdf:resource="http://example.org/ontology#CarManufacturer"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>Famous for high-performance sports cars.</rdfs:comment>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Meta">
    <rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <ns1:develops rdf:resource="http://example.org/ontology#Instagram"/>
    <ns1:develops rdf:resource="http://example.org/ontology#Facebook"/>
    <ns1:develops rdf:resource="http://example.org/ontology#Oculus"/>
    <ns1:develops rdf:resource="http://example.org/ontology#WhatsApp"/>
    <rdfs:comment>Pioneering social media and virtual reality technology.</rdfs:comment>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#TechnologyCompany">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Apple">
    <rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>Known for its innovative consumer electronics and software.</rdfs:comment>
    <ns1:develops rdf:resource="http://example.org/ontology#iPad"/>
    <ns1:develops rdf:resource="http://example.org/ontology#iPhone"/>
    <ns1:develops rdf:resource="http://example.org/ontology#AppleWatch"/>
    <ns1:develops rdf:resource="http://example.org/ontology#MacBook"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Audi">
    <ns1:produces rdf:resource="http://example.org/ontology#Audi_eTron"/>
    <ns1:produces rdf:resource="http://example.org/ontology#Audi_R8"/>
    <ns1:produces rdf:resource="http://example.org/ontology#Audi_A8"/>
    <rdf:type rdf:resource="http://example.org/ontology#CarManufacturer"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>Known for its modern designs and technology.</rdfs:comment>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#AmazonEcho">
    <rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Porsche_Taycan">
    <rdf:type rdf:resource="http://example.org/ontology#ElectricCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#BMW">
    <ns1:produces rdf:resource="http://example.org/ontology#BMW_7Series"/>
    <ns1:produces rdf:resource="http://example.org/ontology#BMW_M4"/>
    <ns1:produces rdf:resource="http://example.org/ontology#BMW_iX"/>
    <rdf:type rdf:resource="http://example.org/ontology#CarManufacturer"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>Focused on performance and driving pleasure.</rdfs:comment>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#VW_Touareg">
    <rdf:type rdf:resource="http://example.org/ontology#SUV"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#SportsCar">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
    <rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#ElectricCar">
    <rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Google">
    <ns1:develops rdf:resource="http://example.org/ontology#GooglePixel"/>
    <ns1:develops rdf:resource="http://example.org/ontology#GoogleCloud"/>
    <ns1:develops rdf:resource="http://example.org/ontology#Android"/>
    <ns1:develops rdf:resource="http://example.org/ontology#GoogleSearch"/>
    <rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>Started as a search engine and expanded into cloud computing and AI.</rdfs:comment>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#AmazonPrime">
    <rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Car">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#WindowsOS">
    <rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Android">
    <rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Oculus">
    <rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#GoogleCloud">
    <rdf:type rdf:resource="http://example.org/ontology#CloudServiceProvider"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Microsoft">
    <ns1:develops rdf:resource="http://example.org/ontology#Surface"/>
    <ns1:develops rdf:resource="http://example.org/ontology#WindowsOS"/>
    <ns1:develops rdf:resource="http://example.org/ontology#Azure"/>
    <ns1:develops rdf:resource="http://example.org/ontology#Xbox"/>
    <rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <rdfs:comment>Dominant in software, cloud computing, and gaming.</rdfs:comment>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#GoogleSearch">
    <rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Mercedes_SClass">
    <rdf:type rdf:resource="http://example.org/ontology#LuxuryCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Audi_A8">
    <rdf:type rdf:resource="http://example.org/ontology#LuxuryCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Sedan">
    <rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#VW_Golf">
    <rdf:type rdf:resource="http://example.org/ontology#Sedan"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Facebook">
    <rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#WhatsApp">
    <rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#produces">
    <rdfs:domain rdf:resource="http://example.org/ontology#CarManufacturer"/>
    <rdfs:range rdf:resource="http://example.org/ontology#Car"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#BMW_7Series">
    <rdf:type rdf:resource="http://example.org/ontology#LuxuryCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#BMW_M4">
    <rdf:type rdf:resource="http://example.org/ontology#SportsCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Audi_eTron">
    <rdf:type rdf:resource="http://example.org/ontology#ElectricCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Kindle">
    <rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#BMW_iX">
    <rdf:type rdf:resource="http://example.org/ontology#ElectricCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#SoftwareCompany">
    <rdfs:subClassOf rdf:resource="http://example.org/ontology#TechnologyCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Audi_R8">
    <rdf:type rdf:resource="http://example.org/ontology#SportsCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Xbox">
    <rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Technology">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Mercedes_EQS">
    <rdf:type rdf:resource="http://example.org/ontology#ElectricCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Porsche_911">
    <rdf:type rdf:resource="http://example.org/ontology#SportsCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#HardwareCompany">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
    <rdfs:subClassOf rdf:resource="http://example.org/ontology#TechnologyCompany"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#MercedesBenz">
    <ns1:produces rdf:resource="http://example.org/ontology#Mercedes_SClass"/>
    <ns1:produces rdf:resource="http://example.org/ontology#Mercedes_EQS"/>
    <ns1:produces rdf:resource="http://example.org/ontology#Mercedes_AMG_GT"/>
    <rdfs:comment>Synonymous with luxury and quality.</rdfs:comment>
    <rdf:type rdf:resource="http://example.org/ontology#CarManufacturer"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Amazon">
    <rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
    <ns1:develops rdf:resource="http://example.org/ontology#Kindle"/>
    <ns1:develops rdf:resource="http://example.org/ontology#AmazonEcho"/>
    <ns1:develops rdf:resource="http://example.org/ontology#AWS"/>
    <ns1:develops rdf:resource="http://example.org/ontology#AmazonPrime"/>
    <rdfs:comment>From e-commerce to cloud computing giant with AWS.</rdfs:comment>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Instagram">
    <rdf:type rdf:resource="http://example.org/ontology#SoftwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#AWS">
    <rdf:type rdf:resource="http://example.org/ontology#CloudServiceProvider"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#SUV">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
    <rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#VW_ID4">
    <rdf:type rdf:resource="http://example.org/ontology#ElectricCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#CloudServiceProvider">
    <rdfs:subClassOf rdf:resource="http://example.org/ontology#TechnologyCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Surface">
    <rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#iPad">
    <rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#iPhone">
    <rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#Mercedes_AMG_GT">
    <rdf:type rdf:resource="http://example.org/ontology#SportsCar"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#MacBook">
    <rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#develops">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
    <rdfs:range rdf:resource="http://example.org/ontology#Technology"/>
    <rdfs:domain rdf:resource="http://example.org/ontology#TechnologyCompany"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/ontology#LuxuryCar">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
|
||||
<rdfs:subClassOf rdf:resource="http://example.org/ontology#Car"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#AppleWatch">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Porsche_Cayenne">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#SUV"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#GooglePixel">
|
||||
<rdf:type rdf:resource="http://example.org/ontology#HardwareCompany"/>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#Company">
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://example.org/ontology#CarManufacturer">
|
||||
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
|
||||
<rdfs:subClassOf rdf:resource="http://example.org/ontology#Company"/>
|
||||
</rdf:Description>
|
||||
</rdf:RDF>
|
||||
|
|
@@ -0,0 +1,93 @@
import asyncio
import os

import cognee
from cognee.api.v1.search import SearchType
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.shared.logging_utils import setup_logging
from cognee.modules.ontology.rdf_xml.RDFLibOntologyResolver import RDFLibOntologyResolver
from cognee.modules.ontology.ontology_config import Config

text_1 = """
1. Audi
Audi is known for its modern designs and advanced technology. Founded in the early 1900s, the brand has earned a reputation for precision engineering and innovation. With features like the Quattro all-wheel-drive system, Audi offers a range of vehicles from stylish sedans to high-performance sports cars.

2. BMW
BMW, short for Bayerische Motoren Werke, is celebrated for its focus on performance and driving pleasure. The company's vehicles are designed to provide a dynamic and engaging driving experience, and their slogan, "The Ultimate Driving Machine," reflects that commitment. BMW produces a variety of cars that combine luxury with sporty performance.

3. Mercedes-Benz
Mercedes-Benz is synonymous with luxury and quality. With a history dating back to the early 20th century, the brand is known for its elegant designs, innovative safety features, and high-quality engineering. Mercedes-Benz manufactures not only luxury sedans but also SUVs, sports cars, and commercial vehicles, catering to a wide range of needs.

4. Porsche
Porsche is a name that stands for high-performance sports cars. Founded in 1931, the brand has become famous for models like the iconic Porsche 911. Porsche cars are celebrated for their speed, precision, and distinctive design, appealing to car enthusiasts who value both performance and style.

5. Volkswagen
Volkswagen, which means "people's car" in German, was established with the idea of making affordable and reliable vehicles accessible to everyone. Over the years, Volkswagen has produced several iconic models, such as the Beetle and the Golf. Today, it remains one of the largest car manufacturers in the world, offering a wide range of vehicles that balance practicality with quality.

Each of these car manufacturers contributes to Germany's reputation as a leader in the global automotive industry, showcasing a blend of innovation, performance, and design excellence.
"""

text_2 = """
1. Apple
Apple is renowned for its innovative consumer electronics and software. Its product lineup includes the iPhone, iPad, Mac computers, and wearables like the Apple Watch. Known for its emphasis on sleek design and user-friendly interfaces, Apple has built a loyal customer base and created a seamless ecosystem that integrates hardware, software, and services.

2. Google
Founded in 1998, Google started as a search engine and quickly became the go-to resource for finding information online. Over the years, the company has diversified its offerings to include digital advertising, cloud computing, mobile operating systems (Android), and various web services like Gmail and Google Maps. Google's innovations have played a major role in shaping the internet landscape.

3. Microsoft
Microsoft Corporation has been a dominant force in software for decades. Its Windows operating system and Microsoft Office suite are staples in both business and personal computing. In recent years, Microsoft has expanded into cloud computing with Azure, gaming with the Xbox platform, and even hardware through products like the Surface line. This evolution has helped the company maintain its relevance in a rapidly changing tech world.

4. Amazon
What began as an online bookstore has grown into one of the largest e-commerce platforms globally. Amazon is known for its vast online marketplace, but its influence extends far beyond retail. With Amazon Web Services (AWS), the company has become a leader in cloud computing, offering robust solutions that power websites, applications, and businesses around the world. Amazon's constant drive for innovation continues to reshape both retail and technology sectors.

5. Meta
Meta, originally known as Facebook, revolutionized social media by connecting billions of people worldwide. Beyond its core social networking service, Meta is investing in the next generation of digital experiences through virtual and augmented reality technologies, with projects like Oculus. The company's efforts signal a commitment to evolving digital interaction and building the metaverse—a shared virtual space where users can connect and collaborate.

Each of these companies has significantly impacted the technology landscape, driving innovation and transforming everyday life through their groundbreaking products and services.
"""


async def main():
    # Step 1: Reset data and system state
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Step 2: Add text
    text_list = [text_1, text_2]
    await cognee.add(text_list)

    # Step 3: Create knowledge graph
    ontology_path = os.path.join(
        os.path.dirname(os.path.abspath(__file__)), "data/basic_ontology.owl"
    )

    # Create full config structure manually
    config: Config = {
        "ontology_config": {
            "ontology_resolver": RDFLibOntologyResolver(ontology_file=ontology_path)
        }
    }

    await cognee.cognify(config=config)
    print("Knowledge with ontology created.")

    # Step 4: Query insights
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION,
        query_text="What are the exact cars and their types produced by Audi?",
    )
    print(search_results)

    await visualize_graph()


if __name__ == "__main__":
    logger = setup_logging()

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
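The ontology file consumed above is plain RDF/XML. As a minimal stdlib-only sketch (not cognee's RDFLibOntologyResolver, which is built on rdflib), the individuals and their `rdf:type` assertions in such a file can be read out with `xml.etree.ElementTree`; namespaced attributes appear in Clark notation (`{uri}name`):

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# A tiny snippet in the same shape as data/basic_ontology.owl above
snippet = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/ontology#Amazon">
    <rdf:type rdf:resource="http://example.org/ontology#TechnologyCompany"/>
  </rdf:Description>
</rdf:RDF>
"""

root = ET.fromstring(snippet)
types = {}
for desc in root.findall(f"{RDF}Description"):
    # rdf:about names the subject; rdf:type children name its classes
    subject = desc.get(f"{RDF}about")
    types[subject] = [t.get(f"{RDF}resource") for t in desc.findall(f"{RDF}type")]

print(types)
```

Parsing the file this way before running the pipeline is a quick sanity check that the OWL file is well-formed.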
5	new-examples/demos/retrievers_and_search_examples.py	Normal file
@@ -0,0 +1,5 @@
"""
|
||||
Retrievers and Search Examples
|
||||
|
||||
Reference: https://docs.cognee.ai/guides/search-basics
|
||||
"""
|
||||
|
@@ -0,0 +1,70 @@
import asyncio
import cognee
from cognee.shared.logging_utils import setup_logging, ERROR
from cognee.api.v1.search import SearchType

# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
#    LLM_API_KEY = "your_key_here"


async def main():
    # Create a clean slate for cognee -- reset data and system state
    print("Resetting cognee data...")
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)
    print("Data reset complete.\n")

    # cognee knowledge graph will be created based on this text
    text = """
    Natural language processing (NLP) is an interdisciplinary
    subfield of computer science and information retrieval.
    """

    print("Adding text to cognee:")
    print(text.strip())
    # Add the text, and make it available for cognify
    await cognee.add(text)
    print("Text added successfully.\n")

    print("Running cognify to create knowledge graph...\n")
    print("Cognify process steps:")
    print("1. Classifying the document: Determining the type and category of the input text.")
    print(
        "2. Checking permissions: Ensuring the user has the necessary rights to process the text."
    )
    print(
        "3. Extracting text chunks: Breaking down the text into sentences or phrases for analysis."
    )
    print("4. Adding data points: Storing the extracted chunks for processing.")
    print(
        "5. Generating knowledge graph: Extracting entities and relationships to form a knowledge graph."
    )
    print("6. Summarizing text: Creating concise summaries of the content for quick insights.\n")

    # Use LLMs and cognee to create knowledge graph
    await cognee.cognify()
    print("Cognify process complete.\n")

    query_text = "Tell me about NLP"
    print(f"Searching cognee for insights with query: '{query_text}'")
    # Query cognee for insights on the added text
    search_results = await cognee.search(
        query_type=SearchType.GRAPH_COMPLETION, query_text=query_text
    )

    print("Search results:")
    # Display results
    for result_text in search_results:
        print(result_text)


if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
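Several of these examples manage the event loop by hand (`new_event_loop`, then `shutdown_asyncgens` in a `finally` block). For scripts that don't need the explicit loop handle, `asyncio.run()` performs the same lifecycle in one call; a minimal sketch with a stand-in coroutine:

```python
import asyncio


async def pipeline():
    # Stand-in for the add -> cognify -> search sequence above
    await asyncio.sleep(0)
    return "done"


# asyncio.run creates the loop, runs the coroutine, shuts down async
# generators, and closes the loop -- the compact form of the pattern above.
result = asyncio.run(pipeline())
print(result)
```

The explicit-loop form used in the examples is useful when the caller wants to reuse the loop or control shutdown ordering; otherwise the two are interchangeable.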
1772	new-examples/demos/simple_document_qa/data/alice_in_wonderland.txt	Normal file
File diff suppressed because it is too large
@@ -0,0 +1,37 @@
import asyncio
import os

import cognee

# By default cognee uses OpenAI's gpt-5-mini LLM model
# Provide your OpenAI LLM API KEY
os.environ["LLM_API_KEY"] = ""


async def cognee_demo():
    # Get file path to document to process
    from pathlib import Path

    current_directory = Path(__file__).resolve().parent
    file_path = os.path.join(current_directory, "data", "alice_in_wonderland.txt")

    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Call Cognee to process document
    await cognee.add(file_path)
    await cognee.cognify()

    # Query Cognee for information from provided document
    answer = await cognee.search("List me all the important characters in Alice in Wonderland.")
    print(answer)

    answer = await cognee.search("How did Alice end up in Wonderland?")
    print(answer)

    answer = await cognee.search("Tell me about Alice's personality.")
    print(answer)


# Cognee is an async library, it has to be called in an async context
asyncio.run(cognee_demo())
60	new-examples/demos/start_local_ui_frontend_example.py	Normal file
@@ -0,0 +1,60 @@
#!/usr/bin/env python3
"""
Example showing how to use cognee.start_ui() to launch the frontend.

This demonstrates the new UI functionality, which works similarly to DuckDB's start_ui().
"""

import asyncio
import time

import cognee


async def main():
    # First, let's add some data to cognee for the UI to display
    print("Adding sample data to cognee...")
    await cognee.add(
        "Natural language processing (NLP) is an interdisciplinary subfield of computer science and information retrieval."
    )
    await cognee.add(
        "Machine learning (ML) is a subset of artificial intelligence that focuses on algorithms and statistical models."
    )

    # Generate the knowledge graph
    print("Generating knowledge graph...")
    await cognee.cognify()

    print("\n" + "=" * 60)
    print("Starting cognee UI...")
    print("=" * 60)

    # Start the UI server
    def dummy_callback(pid):
        pass

    server = cognee.start_ui(
        pid_callback=dummy_callback,
        port=3000,
        open_browser=True,  # This will automatically open your browser
    )

    if server:
        print("UI server started successfully!")
        print("The interface will be available at: http://localhost:3000")
        print("\nPress Ctrl+C to stop the server when you're done...")

        try:
            # Keep the server running
            while server.poll() is None:  # While the process is still running
                time.sleep(1)
        except KeyboardInterrupt:
            print("\nStopping UI server...")
            server.terminate()
            server.wait()  # Wait for the process to finish
            print("UI server stopped.")
    else:
        print("Failed to start UI server. Check the logs above for details.")


if __name__ == "__main__":
    asyncio.run(main())
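The `poll()`/`terminate()`/`wait()` calls above are the standard `subprocess.Popen` lifecycle, which suggests `start_ui()` hands back a child-process handle. The same pattern can be exercised on a dummy child process with nothing but the stdlib:

```python
import subprocess
import sys

# Spawn a long-running dummy child process (stands in for the UI server)
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# poll() returns None while the process is still running
assert proc.poll() is None

# terminate() asks the process to stop; wait() reaps it so no zombie is left
proc.terminate()
proc.wait(timeout=10)

# After exit, poll() returns the process's return code
assert proc.poll() is not None
print("child stopped, returncode:", proc.returncode)
```

Always pairing `terminate()` with `wait()` (as the example does) avoids leaving zombie processes behind on POSIX systems.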
101	new-examples/demos/temporal_awareness_example.py	Normal file
@@ -0,0 +1,101 @@
import asyncio
import cognee
from cognee.shared.logging_utils import setup_logging, INFO
from cognee.api.v1.search import SearchType


biography_1 = """
Attaphol Buspakom Attaphol Buspakom ( ; ) , nicknamed Tak ( ; ) ; 1 October 1962 – 16 April 2015 ) was a Thai national and football coach . He was given the role at Muangthong United and Buriram United after TTM Samut Sakhon folded after the 2009 season . He played for the Thailand national football team , appearing in several FIFA World Cup qualifying matches .

Club career .
Attaphol began his career as a player at Thai Port FC Authority of Thailand in 1985 . In his first year , he won his first championship with the club . He played for the club until 1989 and in 1987 also won the Queens Cup . He then moved to Malaysia for two seasons for Pahang FA , then return to Thailand to his former club . His time from 1991 to 1994 was marked by less success than in his first stay at Port Authority . From 1994 to 1996 he played for Pahang again and this time he was able to win with the club , the Malaysia Super League and also reached the final of the Malaysia Cup and the Malaysia FA Cup . Both cup finals but lost . Back in Thailand , he let end his playing career at FC Stock Exchange of Thailand , with which he once again runner-up in 1996-97 . In 1998 , he finished his career .

International career .
For the Thailand national football team Attaphol played between 1985 and 1998 a total of 85 games and scored 13 results . In 1992 , he participated with the team in the finals of the Asian Cup . He also stood in various cadres to qualifications to FIFA World Cup .

Coaching career .
Bec Tero Sasana .
In BEC Tero Sasana F.C . began his coaching career in 2001 for him , first as assistant coach . He took over the reigning champions of the Thai League T1 , after his predecessor Pichai Pituwong resigned from his post . It was his first coach station and he had the difficult task of leading the club through the new AFC Champions League . He could accomplish this task with flying colors and even led the club to the finals . The finale , then still played in home and away matches , was lost with 1:2 at the end against Al Ain FC . Attaphol is and was next to Charnwit Polcheewin the only coach who managed a club from Thailand to lead to the final of the AFC Champions League . 2002-03 and 2003-04 he won with the club also two runner-up . In his team , which reached the final of the Champions League , were a number of exceptional players like Therdsak Chaiman , Worrawoot Srimaka , Dusit Chalermsan and Anurak Srikerd .

Geylang United / Krung Thai Bank .
In 2006 , he went to Singapore in the S-League to Geylang United He was released after a few months due to lack of success . In 2008 , he took over as coach at Krung Thai Bank F.C. , where he had almost a similar task , as a few years earlier by BEC-Tero . As vice-champion of the club was also qualified for the AFC Champions League . However , he failed to lead the team through the group stage of the season 2008 and beyond . With the Kashima Antlers of Japan and Beijing Guoan F.C . athletic competition was too great . One of the highlights was put under his leadership , yet the club . In the group match against the Vietnam club Nam Dinh F.C . his team won with 9-1 , but also lost four weeks later with 1-8 against Kashima Antlers . At the end of the National Football League season , he reached the Krung Thai 6th Table space . The Erstligalizenz the club was sold at the end of the season at the Bangkok Glass F.C. . Attaphol finished his coaching career with the club and accepted an offer of TTM Samutsakorn . After only a short time in office

Muangthong United .
In 2009 , he received an offer from Muangthong United F.C. , which he accepted and changed . He can champion Muang Thong United for 2009 Thai Premier League and Attaphol won Coach of The year for Thai Premier League and he was able to lead Muang Thong United to play AFC Champions League qualifying play-off for the first in the clubs history .

Buriram United .
In 2010 Buspakom moved from Muangthong United to Buriram United F.C. . He received Coach of the Month in Thai Premier League 2 time in June and October . In 2011 , he led Buriram United win 2011 Thai Premier League second time for club and set a record with the most points in the Thai League T1 for 85 point and He led Buriram win 2011 Thai FA Cup by beat Muangthong United F.C . 1-0 and he led Buriram win 2011 Thai League Cup by beat Thai Port F.C . 2-0 . In 2012 , he led Buriram United to the 2012 AFC Champions League group stage . Buriram along with Guangzhou Evergrande F.C . from China , Kashiwa Reysol from Japan and Jeonbuk Hyundai Motors which are all champions from their country . In the first match of Buriram they beat Kashiwa 3-2 and Second Match they beat Guangzhou 1-2 at the Tianhe Stadium . Before losing to Jeonbuk 0-2 and 3-2 with lose Kashiwa and Guangzhou 1-0 and 1-2 respectively and Thai Premier League Attaphol lead Buriram end 4th for table with win 2012 Thai FA Cup and 2012 Thai League Cup .

Bangkok Glass .
In 2013 , he moved from Buriram United to Bangkok Glass F.C. .

Individual
- Thai Premier League Coach of the Year ( 3 ) : 2001-02 , 2009 , 2013
"""

biography_2 = """
Arnulf Øverland Ole Peter Arnulf Øverland ( 27 April 1889 – 25 March 1968 ) was a Norwegian poet and artist . He is principally known for his poetry which served to inspire the Norwegian resistance movement during the German occupation of Norway during World War II .

Biography .
Øverland was born in Kristiansund and raised in Bergen . His parents were Peter Anton Øverland ( 1852–1906 ) and Hanna Hage ( 1854–1939 ) . The early death of his father , left the family economically stressed . He was able to attend Bergen Cathedral School and in 1904 Kristiania Cathedral School . He graduated in 1907 and for a time studied philology at University of Kristiania . Øverland published his first collection of poems ( 1911 ) .

Øverland became a communist sympathizer from the early 1920s and became a member of Mot Dag . He also served as chairman of the Norwegian Students Society 1923–28 . He changed his stand in 1937 , partly as an expression of dissent against the ongoing Moscow Trials . He was an avid opponent of Nazism and in 1936 he wrote the poem Du må ikke sove which was printed in the journal Samtiden . It ends with . ( I thought: : Something is imminent . Our era is over – Europe’s on fire! ) . Probably the most famous line of the poem is ( You mustnt endure so well the injustice that doesnt affect you yourself! )

During the German occupation of Norway from 1940 in World War II , he wrote to inspire the Norwegian resistance movement . He wrote a series of poems which were clandestinely distributed , leading to the arrest of both him and his future wife Margrete Aamot Øverland in 1941 . Arnulf Øverland was held first in the prison camp of Grini before being transferred to Sachsenhausen concentration camp in Germany . He spent a four-year imprisonment until the liberation of Norway in 1945 . His poems were later collected in Vi overlever alt and published in 1945 .

Øverland played an important role in the Norwegian language struggle in the post-war era . He became a noted supporter for the conservative written form of Norwegian called Riksmål , he was president of Riksmålsforbundet ( an organization in support of Riksmål ) from 1947 to 1956 . In addition , Øverland adhered to the traditionalist style of writing , criticising modernist poetry on several occasions . His speech Tungetale fra parnasset , published in Arbeiderbladet in 1954 , initiated the so-called Glossolalia debate .

Personal life .
In 1918 he had married the singer Hildur Arntzen ( 1888–1957 ) . Their marriage was dissolved in 1939 . In 1940 , he married Bartholine Eufemia Leganger ( 1903–1995 ) . They separated shortly after , and were officially divorced in 1945 . Øverland was married to journalist Margrete Aamot Øverland ( 1913–1978 ) during June 1945 . In 1946 , the Norwegian Parliament arranged for Arnulf and Margrete Aamot Øverland to reside at the Grotten . He lived there until his death in 1968 and she lived there for another ten years until her death in 1978 . Arnulf Øverland was buried at Vår Frelsers Gravlund in Oslo . Joseph Grimeland designed the bust of Arnulf Øverland ( bronze , 1970 ) at his grave site .

Selected Works .
- Den ensomme fest ( 1911 )
- Berget det blå ( 1927 )
- En Hustavle ( 1929 )
- Den røde front ( 1937 )
- Vi overlever alt ( 1945 )
- Sverdet bak døren ( 1956 )
- Livets minutter ( 1965 )

Awards .
- Gyldendals Endowment ( 1935 )
- Dobloug Prize ( 1951 )
- Mads Wiel Nygaards legat ( 1961 )
"""


async def main():
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    await cognee.add([biography_1, biography_2])
    await cognee.cognify(temporal_cognify=True)

    queries = [
        "What happened before 1980?",
        "What happened after 2010?",
        "What happened between 2000 and 2006?",
        "What happened between 1903 and 1995, I am interested in the Selected Works of Arnulf Øverland Ole Peter Arnulf Øverland?",
        "Who is Attaphol Buspakom Attaphol Buspakom?",
        "Who was Arnulf Øverland?",
    ]

    for query_text in queries:
        search_results = await cognee.search(
            query_type=SearchType.TEMPORAL,
            query_text=query_text,
            top_k=15,
        )
        print(f"Query: {query_text}")
        print(f"Results: {search_results}\n")


if __name__ == "__main__":
    logger = setup_logging(log_level=INFO)

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
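The `SearchType.TEMPORAL` queries above all anchor on time expressions like "before 1980" or "between 2000 and 2006". As an illustration only (this is not cognee's actual query parser), the "between YEAR and YEAR" form can be pulled out of a query string with a short regex:

```python
import re


def year_range(query: str):
    # Hypothetical helper: extract an inclusive (start, end) year range
    # from queries of the form "... between YYYY and YYYY ..."
    m = re.search(r"between (\d{4}) and (\d{4})", query)
    return (int(m.group(1)), int(m.group(2))) if m else None


print(year_range("What happened between 2000 and 2006?"))  # (2000, 2006)
print(year_range("Who was Arnulf Øverland?"))  # None
```

Once a range like this is known, a temporal retriever can restrict candidate events to those whose timestamps fall inside it before ranking.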
37	new-examples/demos/web_url_content_ingestion_example.py	Normal file
@@ -0,0 +1,37 @@
import asyncio

import cognee


async def main():
    await cognee.prune.prune_data()
    print("Data pruned.")

    await cognee.prune.prune_system(metadata=True)

    extraction_rules = {
        "title": {"selector": "title"},
        "headings": {"selector": "h1, h2, h3", "all": True},
        "links": {
            "selector": "a",
            "attr": "href",
            "all": True,
        },
        "paragraphs": {"selector": "p", "all": True},
    }

    await cognee.add(
        "https://en.wikipedia.org/wiki/Large_language_model",
        incremental_loading=False,
        preferred_loaders={"beautiful_soup_loader": {"extraction_rules": extraction_rules}},
    )

    await cognee.cognify()
    print("Knowledge graph created.")

    await cognee.visualize_graph()
    print("Data visualized")


if __name__ == "__main__":
    asyncio.run(main())
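The `extraction_rules` above map field names to CSS selectors and are consumed by the `beautiful_soup_loader`. To show the idea behind the simplest case, a flat tag selector such as `"title"` or `"p"`, here is a stdlib-only sketch (no BeautifulSoup) that collects the text inside every occurrence of one tag:

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the text content of every occurrence of a single tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag, self.depth, self.out = tag, 0, []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1  # entering a matching element (handles nesting)

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:  # only keep text while inside a matching element
            self.out.append(data)


def extract(html, tag):
    parser = TagCollector(tag)
    parser.feed(html)
    return [t.strip() for t in parser.out if t.strip()]


html = "<html><head><title>LLMs</title></head><body><p>one</p><p>two</p></body></html>"
print(extract(html, "title"), extract(html, "p"))
```

Real selector support ("h1, h2, h3", attribute extraction via `"attr": "href"`) is what BeautifulSoup adds on top; this sketch only mirrors the single-tag case to make the rule format concrete.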
137	new-examples/demos/weighted_edges_relationships_example.py	Normal file
@@ -0,0 +1,137 @@
import asyncio
from os import path
from typing import Any

from pydantic import SkipValidation

from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.infrastructure.engine import DataPoint
from cognee.infrastructure.engine.models.Edge import Edge
from cognee.tasks.storage import add_data_points
import cognee


class Clothes(DataPoint):
    name: str
    description: str


class Object(DataPoint):
    name: str
    description: str
    has_clothes: list[Clothes]


class Person(DataPoint):
    name: str
    description: str
    has_items: SkipValidation[Any]  # (Edge, list[Clothes])
    has_objects: SkipValidation[Any]  # (Edge, list[Object])
    knows: SkipValidation[Any]  # (Edge, list["Person"])


async def main():
    # Clear the database for a clean state
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # Create clothes items
    item1 = Clothes(name="Shirt", description="A blue shirt")
    item2 = Clothes(name="Pants", description="Black pants")
    item3 = Clothes(name="Jacket", description="Leather jacket")

    # Create object with simple relationship to clothes
    object1 = Object(
        name="Closet", description="A wooden closet", has_clothes=[item1, item2, item3]
    )

    # Create people with various weighted relationships
    person1 = Person(
        name="John",
        description="A software engineer",
        # Single weight (backward compatible)
        has_items=(Edge(weight=0.8, relationship_type="owns"), [item1, item2]),
        # Simple relationship without weights
        has_objects=(Edge(relationship_type="stores_in"), [object1]),
        knows=[],
    )

    person2 = Person(
        name="Alice",
        description="A designer",
        # Multiple weights on edge
        has_items=(
            Edge(
                weights={
                    "ownership": 0.9,
                    "frequency_of_use": 0.7,
                    "emotional_attachment": 0.8,
                    "monetary_value": 0.6,
                },
                relationship_type="owns",
            ),
            [item3],
        ),
        has_objects=(Edge(relationship_type="uses"), [object1]),
        knows=[],
    )

    person3 = Person(
        name="Bob",
        description="A friend",
        # Mixed: single weight + multiple weights
        has_items=(
            Edge(
                weight=0.5,  # Default weight
                weights={"trust_level": 0.9, "communication_frequency": 0.6},
                relationship_type="borrows",
            ),
            [item1],
        ),
        has_objects=[],
        knows=[],
    )

    # Create relationships between people with multiple weights
    person1.knows = (
        Edge(
            weights={
                "friendship_strength": 0.9,
                "trust_level": 0.8,
                "years_known": 0.7,
                "shared_interests": 0.6,
            },
            relationship_type="friend",
        ),
        [person2, person3],
    )

    person2.knows = (
        Edge(
            weights={"professional_collaboration": 0.8, "personal_friendship": 0.6},
            relationship_type="colleague",
        ),
        [person1],
    )

    all_data_points = [item1, item2, item3, object1, person1, person2, person3]

    # Add data points to the graph
    await add_data_points(all_data_points)

    # Visualize the graph
    graph_visualization_path = path.join(
        path.dirname(__file__), "weighted_graph_visualization.html"
    )
    await visualize_graph(graph_visualization_path)

    print("Graph with multiple weighted edges has been created and visualized!")
    print(f"Visualization saved to: {graph_visualization_path}")
    print("\nFeatures demonstrated:")
    print("- Single weight edges (backward compatible)")
    print("- Multiple weights on single edges")
    print("- Mixed single + multiple weights")
    print("- Hover over edges to see all weight information")
    print("- Different visual styling for single vs. multiple weighted edges")


if __name__ == "__main__":
    asyncio.run(main())
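Each weighted relationship above is a `(Edge, [targets])` tuple stored on the source `DataPoint`. As a sketch of the underlying data shape (plain dataclasses here, not cognee's actual `Edge` model or storage code), such a tuple flattens into one edge record per target, with the single `weight` and the named `weights` merged into one dict:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Edge:
    # Mirrors the fields the example uses; an assumption, not cognee's class
    relationship_type: str
    weight: Optional[float] = None
    weights: dict = field(default_factory=dict)


def flatten(source: str, rel):
    """Expand a (Edge, [targets]) tuple into per-target edge records."""
    edge, targets = rel
    merged = dict(edge.weights)
    if edge.weight is not None:
        merged.setdefault("weight", edge.weight)  # single weight kept alongside named ones
    return [(source, edge.relationship_type, target, merged) for target in targets]


# Bob's "mixed" relationship from the example: one default weight plus named weights
rows = flatten("Bob", (Edge("borrows", weight=0.5, weights={"trust_level": 0.9}), ["Shirt"]))
print(rows)
```

This is why the example can mix the backward-compatible single `weight` with the newer `weights` dict on the same edge: both end up as key/value pairs on the stored relationship.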