cognee/new-examples/custom_pipelines/memify_coding_agent_rule_extraction_example.py
Hande 5f8a3e24bd
refactor: restructure examples and starter kit into new-examples (#1862)

## Summary by CodeRabbit

* **Documentation**
  * Deprecated legacy examples and added a migration guide mapping old
    paths to new locations
  * Added a comprehensive new-examples README detailing configurations,
    pipelines, demos, and migration notes

* **New Features**
  * Added many runnable examples and demos: database configs,
    embedding/LLM setups, permissions and access-control, custom pipelines
    (organizational, product recommendation, code analysis, procurement),
    multimedia, visualization, temporal/ontology demos, and a local UI
    starter

* **Chores**
  * Updated CI/test entrypoints to use the new-examples layout


---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
2025-12-20 02:07:28 +01:00

110 lines
4.6 KiB
Python

import asyncio
import pathlib
import os

import cognee
from cognee import memify
from cognee.api.v1.visualize.visualize import visualize_graph
from cognee.shared.logging_utils import setup_logging, ERROR
from cognee.modules.pipelines.tasks.task import Task
from cognee.tasks.memify.extract_subgraph_chunks import extract_subgraph_chunks
from cognee.tasks.codingagents.coding_rule_associations import add_rule_associations

# Prerequisites:
# 1. Copy `.env.template` and rename it to `.env`.
# 2. Add your OpenAI API key to the `.env` file in the `LLM_API_KEY` field:
#    LLM_API_KEY = "your_key_here"


async def main():
    # Create a clean slate for cognee: reset data and system state
    print("Resetting cognee data...")
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)
    print("Data reset complete.\n")

    print("Adding conversation about rules to cognee:\n")

    coding_rules_chat_from_principal_engineer = """
    We want code to be formatted by PEP8 standards.
    Typing and Docstrings must be added.
    Please also make sure to write NOTE: on all more complex code segments.
    If there is any duplicate code, try to handle it in one function to avoid code duplication.
    Susan should also always review new code changes before merging to main.
    New releases should not happen on Friday so we don't have to fix them during the weekend.
    """
    print(
        f"Coding rules conversation with principal engineer: {coding_rules_chat_from_principal_engineer}"
    )

    coding_rules_chat_from_manager = """
    Susan should always review new code changes before merging to main.
    New releases should not happen on Friday so we don't have to fix them during the weekend.
    """
    print(f"Coding rules conversation with manager: {coding_rules_chat_from_manager}")

    # Add the text, and make it available for cognify
    await cognee.add([coding_rules_chat_from_principal_engineer, coding_rules_chat_from_manager])
    print("Text added successfully.\n")

    # Use LLMs and cognee to create the knowledge graph
    await cognee.cognify()
    print("Cognify process complete.\n")

    # Visualize the graph after cognification
    file_path = os.path.join(
        pathlib.Path(__file__).parent, ".artifacts", "graph_visualization_only_cognify.html"
    )
    await visualize_graph(file_path)
    print(f"Open file to see graph visualization only after cognification: {file_path}\n")

    # After the graph is created, run a second pipeline that goes through the graph and
    # enhances it with specific coding rule nodes.

    # extract_subgraph_chunks returns all document chunks from the specified subgraphs
    # (if no subgraph is specified, the whole graph is sent through memify).
    subgraph_extraction_tasks = [Task(extract_subgraph_chunks)]

    # add_rule_associations processes coding rules from chunks and keeps track of existing
    # rules so that duplicate rules are not created. As a result of this processing, new Rule
    # nodes are created in the graph that specify the coding rules found in conversations.
    coding_rules_association_tasks = [
        Task(
            add_rule_associations,
            rules_nodeset_name="coding_agent_rules",
            task_config={"batch_size": 1},
        ),
    ]

    # memify accepts these tasks and forwards graph data through them when no data is specified.
    # If data is explicitly passed as an argument, that data is forwarded through the tasks instead.
    await memify(
        extraction_tasks=subgraph_extraction_tasks,
        enrichment_tasks=coding_rules_association_tasks,
    )

    # Find the new coding rules added to the graph through memify
    # (created from the chat conversations between team members)
    coding_rules = await cognee.search(
        query_text="List me the coding rules",
        query_type=cognee.SearchType.CODING_RULES,
        node_name=["coding_agent_rules"],
    )

    print("Coding rules created by memify:")
    for coding_rule in coding_rules:
        print(f"- {coding_rule}")

    # Visualize the new graph with the added memify context
    file_path = os.path.join(
        pathlib.Path(__file__).parent, ".artifacts", "graph_visualization_after_memify.html"
    )
    await visualize_graph(file_path)
    print(f"\nOpen file to see graph visualization after memify enhancement: {file_path}")


if __name__ == "__main__":
    logger = setup_logging(log_level=ERROR)

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
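
The comments in the example describe memify as forwarding data through extraction tasks and then enrichment tasks, with `add_rule_associations` keeping track of existing rules so duplicates are not created. The following is a minimal standalone sketch of that two-stage idea, not cognee's implementation. The names `extract_chunks`, `add_rules`, and `run_pipeline` are hypothetical, and duplicates are detected by exact text match after case and whitespace normalization, which is an assumption made for illustration; the source does not say how the real task matches rules.

```python
import asyncio

# Hypothetical stand-ins for illustration only; none of these are cognee APIs.


async def extract_chunks(conversations: list[str]) -> list[str]:
    """Extraction stage: return one candidate rule per non-empty line."""
    return [
        line.strip()
        for text in conversations
        for line in text.splitlines()
        if line.strip()
    ]


async def add_rules(chunks: list[str], known_rules: dict[str, str]) -> list[str]:
    """Enrichment stage: store rules not seen before and return the new ones."""
    added = []
    for chunk in chunks:
        # Assumption: two rules are duplicates when they match exactly
        # after lowercasing and collapsing whitespace.
        key = " ".join(chunk.lower().split())
        if key not in known_rules:
            known_rules[key] = chunk
            added.append(chunk)
    return added


async def run_pipeline(conversations: list[str], known_rules: dict[str, str]) -> list[str]:
    """Run extraction first, then pass its output to enrichment."""
    chunks = await extract_chunks(conversations)
    return await add_rules(chunks, known_rules)


principal_engineer_chat = """
We want code to be formatted by PEP8 standards.
New releases should not happen on Friday so we don't have to fix them during the weekend.
"""
manager_chat = """
New releases should not happen on Friday so we don't have to fix them during the weekend.
"""

known_rules: dict[str, str] = {}
new_rules = asyncio.run(run_pipeline([principal_engineer_chat, manager_chat], known_rules))
for rule in new_rules:
    print(f"- {rule}")
```

The Friday-release rule appears in both conversations but is stored once. Near-duplicates with different wording, such as the two phrasings of the review rule in the example above, would not be caught by this exact-match sketch.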