alekszievr 4efdb29187
Summarize retrieved edges to compact string [COG-1181] (#522)

## Description
Summarize retrieved edges to compact string with no redundancies.
Example:
**Before summarization:**


CV example:

visual innovations -- employs -- visual innovations
---

CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
 -- contains -- creativeworks agency
---

CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
 -- contains -- visual innovations
---

CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
 -- contains -- rhode island school of design
---
Experienced Graphic Designer with over 8 years in visual design and
branding, specializing in Adobe Creative Suite and enthusiastic about
producing engaging visuals. -- made_from --
CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography

**After summarization:**

David Thompson is a Creative Graphic Designer with over 8 years of
experience in visual design and branding, proficient in Adobe Creative
Suite and passionate about creating compelling visuals. He holds a
B.F.A. in Graphic Design from the Rhode Island School of Design (2012).
His experience includes working as a Senior Graphic Designer at
CreativeWorks Agency (2015 – Present), where he led design projects and
created branding materials that increased client engagement by 30%, and
as a Graphic Designer at Visual Innovations (2012 – 2015), where he
designed marketing collateral and collaborated with the marketing team
to develop cohesive brand strategies. His skills include design software
such as Adobe Photoshop, Illustrator, and InDesign, as well as web
design in HTML and CSS, with specialties in Branding and Identity and
Typography.

1. David Thompson employs his skills in visual design and branding.
2. David Thompson contains experience from CreativeWorks Agency.
3. David Thompson contains experience from Visual Innovations.
4. David Thompson made his qualifications from the Rhode Island School
of Design.
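
The redundancy in the "before" output comes from repeating the full node text on every edge. As a rough illustration only (this is not the PR's implementation, and `edges_to_compact_string` is a hypothetical helper name), the sketch below shows one way to render edges so that each long node text appears once and duplicate edges are dropped before any further summarization:

```python
def edges_to_compact_string(edges):
    """Render (source, relationship, target) triples without repeating node texts.

    Hypothetical helper for illustration; not part of cognee's API.
    """
    node_labels = {}  # node text -> short label such as "N1" (insertion-ordered)

    def label(text):
        key = text.strip()
        if key not in node_labels:
            node_labels[key] = f"N{len(node_labels) + 1}"
        return node_labels[key]

    seen = set()
    edge_lines = []
    for source, relationship, target in edges:
        triple = (label(source), relationship.strip(), label(target))
        if triple not in seen:  # skip exact duplicate edges
            seen.add(triple)
            edge_lines.append(" -- ".join(triple))

    node_lines = [f"{short}: {text}" for text, short in node_labels.items()]
    return "Nodes:\n" + "\n".join(node_lines) + "\n\nEdges:\n" + "\n".join(edge_lines)


if __name__ == "__main__":
    cv = "CV 4: Not Relevant\nName: David Thompson\n(rest of the CV text)"
    edges = [
        (cv, "contains", "creativeworks agency"),
        (cv, "contains", "visual innovations"),
        (cv, "contains", "rhode island school of design"),
        (cv, "contains", "visual innovations"),  # duplicate edge
    ]
    print(edges_to_compact_string(edges))
```

With this layout the CV text is emitted a single time under `Nodes:` and every edge refers to it by its short label.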

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **New Features**
  - Introduced a summarization engine that converts relationship-based
    inputs into concise, natural sentences.
  - Expanded search capabilities with a new query option that generates
    graph summaries, providing insightful and aggregated results from graph
    data.
  - Enhanced asynchronous processing for improved performance in handling
    graph data queries and summarization.
  - Added flexibility in specifying string conversion methods for graph
    edge retrieval.

---------

Co-authored-by: Boris <boris@topoteretes.com>
2025-02-18 17:29:55 +01:00
.data Remove files 2024-12-11 15:34:29 +01:00
.dlt fix: remove obsolete code 2024-03-13 10:19:03 +01:00
.github GraphRAG vs RAG notebook (#503) 2025-02-18 01:21:14 +01:00
alembic Version 0.1.21 (#431) 2025-01-10 19:37:50 +01:00
assets update architecture diagram in readme (#506) 2025-02-07 17:17:07 +01:00
bin chore: enable all origins in cors settings 2024-09-25 14:34:14 +02:00
cognee Summarize retrieved edges to compact string [COG-1181] (#522) 2025-02-18 17:29:55 +01:00
cognee-frontend Switch to gpt-4o-mini by default (#233) 2024-11-18 17:38:54 +01:00
cognee-mcp fix: consolidate api/sdk/mcp search 2025-02-13 13:15:39 +01:00
evals Cog 1293 corpus builder custom cognify tasks (#527) 2025-02-12 16:44:08 +01:00
examples Feat: log pipeline status and pass it through pipeline [COG-1214] (#501) 2025-02-11 16:41:40 +01:00
licenses add NOTICE file, reference CoC in contribution guidelines, add licenses folder for external licenses 2024-12-06 13:27:55 +00:00
notebooks GraphRAG vs RAG notebook (#503) 2025-02-18 01:21:14 +01:00
profiling Fix linter issues 2025-01-05 19:48:35 +01:00
tests Version 0.1.21 (#431) 2025-01-10 19:37:50 +01:00
tools Version 0.1.21 (#431) 2025-01-10 19:37:50 +01:00
.dockerignore chore: add vanilla docker config 2024-06-23 00:36:34 +02:00
.env.template Comment out the postgres configuration from .env.template (#502) 2025-02-06 21:35:40 +01:00
.gitignore fix: custom model pipeline (#508) 2025-02-08 02:00:15 +01:00
.pre-commit-config.yaml Feat: log pipeline status and pass it through pipeline [COG-1214] (#501) 2025-02-11 16:41:40 +01:00
.pylintrc fix: enable sqlalchemy adapter 2024-08-04 22:23:28 +02:00
.python-version chore: update python version to 3.11 2024-03-29 14:10:20 +01:00
alembic.ini feat: migrate search to tasks (#144) 2024-10-07 14:41:35 +02:00
CODE_OF_CONDUCT.md Update CODE_OF_CONDUCT.md 2024-12-13 11:30:16 +01:00
cognee-gui.py Cognee gui add visualziation[COG-1334] (#538) 2025-02-15 03:08:26 +01:00
CONTRIBUTING.md Update CONTRIBUTING.md 2025-01-16 20:09:43 +01:00
DCO.md Create DCO.md 2024-12-13 11:28:44 +01:00
docker-compose.yml Comment out the postgres configuration from docker-compose.yml (#504) 2025-02-10 13:45:45 +01:00
Dockerfile Fix linter issues 2025-01-05 20:01:30 +01:00
Dockerfile_modal feat: implements modal wrapper + dockerfile for modal containers 2025-01-23 18:06:09 +01:00
entrypoint-old.sh fix: run frontend in a container 2024-06-23 13:24:58 +02:00
entrypoint.sh fix: fixes cognee backend on windows 2025-01-17 09:52:05 +01:00
LICENSE Update LICENSE 2024-03-30 11:57:07 +01:00
modal_deployment.py refactor: Refactor search so graph completion is used by default (#505) 2025-02-07 17:16:34 +01:00
mypy.ini Improve processing, update networkx client, and Neo4j, and dspy (#69) 2024-04-20 19:05:40 +02:00
NOTICE.md add NOTICE file, reference CoC in contribution guidelines, add licenses folder for external licenses 2024-12-06 13:27:55 +00:00
poetry.lock Cognee gui [COG-1307] (#530) 2025-02-14 15:51:33 +01:00
pyproject.toml Cognee gui [COG-1307] (#530) 2025-02-14 15:51:33 +01:00
README.md Update README.md 2025-02-18 03:10:17 +01:00


cognee - memory layer for AI apps and Agents

We build for developers who need a reliable, production-ready data layer for AI applications

What is cognee?

Cognee implements scalable, modular ECL (Extract, Cognify, Load) pipelines that allow you to interconnect and retrieve past conversations, documents, and audio transcriptions while reducing hallucinations, developer effort, and cost.

Cognee merges graph and vector databases to uncover hidden relationships and new patterns in your data. You can automatically model, load and retrieve entities and objects representing your business domain and analyze their relationships, uncovering insights that neither vector stores nor graph stores alone can provide. Learn more about use-cases here.

Try it in a Google Colab notebook or have a look at our documentation.

If you have questions, join our Discord community.

Have you seen cognee's starter repo? Check it out!

Contributing

Your contributions are at the core of making this a true open source project. Any contributions you make are greatly appreciated. See CONTRIBUTING.md for more information.

Code of Conduct

We are committed to making open source an enjoyable and respectful experience for our community. See CODE_OF_CONDUCT for more information.

📦 Installation

You can install Cognee using either pip or poetry. Support for various databases and vector stores is available through extras.

With pip

pip install cognee

With poetry

If adding to your project

poetry add cognee

If installing inside cloned repository

poetry config virtualenvs.in-project true
poetry self add poetry-plugin-shell
poetry install
poetry shell

With pip with specific database support

To install Cognee with support for a specific database, use the command below, replacing <database> with the name of the database you need.

pip install 'cognee[<database>]'

Replace <database> with any of the following databases:

  • postgres
  • weaviate
  • qdrant
  • neo4j
  • milvus

Example: installing Cognee with PostgreSQL and Neo4j support:

pip install 'cognee[postgres, neo4j]'

With poetry with specific database support

To install Cognee with support for a specific database, use the command below, replacing <database> with the name of the database you need.

poetry add cognee -E <database>

Replace <database> with any of the following databases:

  • postgres
  • weaviate
  • qdrant
  • neo4j
  • milvus

Example: installing Cognee with PostgreSQL and Neo4j support:

poetry add cognee -E postgres -E neo4j

💻 Basic Usage

Setup

import os

os.environ["LLM_API_KEY"] = "YOUR_OPENAI_API_KEY"

or

import cognee
cognee.config.set_llm_api_key("YOUR_OPENAI_API_KEY")

You can also set the variables by creating a .env file; the repository's .env.template serves as a starting point. To use a different LLM provider, check out our documentation.
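
Whichever way the key is supplied, it can help to fail early when it is missing. The following is a minimal sketch using only the standard library; `require_llm_api_key` is a hypothetical helper, not part of cognee, and the placeholder check simply assumes the placeholder text starts with "YOUR" as in the snippets above.

```python
import os


def require_llm_api_key(env=None):
    """Return the LLM API key, or raise if it is missing or still a placeholder.

    Hypothetical helper for illustration; not part of cognee's API.
    """
    env = os.environ if env is None else env
    key = env.get("LLM_API_KEY", "").strip()
    if not key or key.startswith("YOUR"):
        raise RuntimeError(
            "LLM_API_KEY is not set; export it or add it to your .env file."
        )
    return key


if __name__ == "__main__":
    try:
        require_llm_api_key()
        print("LLM_API_KEY found.")
    except RuntimeError as error:
        print(error)
```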

Simple example

First, copy .env.template to .env and add your OpenAI API key to the LLM_API_KEY field.

This script will run the default pipeline:

import cognee
import asyncio
from cognee.modules.search.types import SearchType

async def main():
    # Create a clean slate for cognee -- reset data and system state
    print("Resetting cognee data...")
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)
    print("Data reset complete.\n")

    # cognee knowledge graph will be created based on this text
    text = """
    Natural language processing (NLP) is an interdisciplinary
    subfield of computer science and information retrieval.
    """

    print("Adding text to cognee:")
    print(text.strip())
    # Add the text, and make it available for cognify
    await cognee.add(text)
    print("Text added successfully.\n")


    print("Running cognify to create knowledge graph...\n")
    print("Cognify process steps:")
    print("1. Classifying the document: Determining the type and category of the input text.")
    print("2. Checking permissions: Ensuring the user has the necessary rights to process the text.")
    print("3. Extracting text chunks: Breaking down the text into sentences or phrases for analysis.")
    print("4. Adding data points: Storing the extracted chunks for processing.")
    print("5. Generating knowledge graph: Extracting entities and relationships to form a knowledge graph.")
    print("6. Summarizing text: Creating concise summaries of the content for quick insights.\n")

    # Use LLMs and cognee to create knowledge graph
    await cognee.cognify()
    print("Cognify process complete.\n")


    query_text = 'Tell me about NLP'
    print(f"Searching cognee for insights with query: '{query_text}'")
    # Query cognee for insights on the added text
    search_results = await cognee.search(
        query_text=query_text, query_type=SearchType.INSIGHTS
    )

    print("Search results:")
    # Display results
    for result_text in search_results:
        print(result_text)

    # Example output:
    # ({'id': UUID('bc338a39-64d6-549a-acec-da60846dd90d'), 'updated_at': datetime.datetime(2024, 11, 21, 12, 23, 1, 211808, tzinfo=datetime.timezone.utc), 'name': 'natural language processing', 'description': 'An interdisciplinary subfield of computer science and information retrieval.'}, {'relationship_name': 'is_a_subfield_of', 'source_node_id': UUID('bc338a39-64d6-549a-acec-da60846dd90d'), 'target_node_id': UUID('6218dbab-eb6a-5759-a864-b3419755ffe0'), 'updated_at': datetime.datetime(2024, 11, 21, 12, 23, 15, 473137, tzinfo=datetime.timezone.utc)}, {'id': UUID('6218dbab-eb6a-5759-a864-b3419755ffe0'), 'updated_at': datetime.datetime(2024, 11, 21, 12, 23, 1, 211808, tzinfo=datetime.timezone.utc), 'name': 'computer science', 'description': 'The study of computation and information processing.'})
    # (...)
    #
    # It represents nodes and relationships in the knowledge graph:
    # - The first element is the source node (e.g., 'natural language processing').
    # - The second element is the relationship between nodes (e.g., 'is_a_subfield_of').
    # - The third element is the target node (e.g., 'computer science').

if __name__ == '__main__':
    asyncio.run(main())

When you run this script, you will see step-by-step messages in the console that help you trace the execution flow and understand what the script is doing at each stage. A version of this example is here: examples/python/simple_example.py
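
The raw result tuples are verbose. As a small sketch, assuming each result has the three-element shape shown in the example output above (source node dict, relationship dict, target node dict), a helper like the hypothetical `format_insight` below can print them as compact triples:

```python
def format_insight(triplet):
    """Format one (source node, relationship, target node) result as a short string.

    Assumes the dict keys shown in the example output ('name', 'relationship_name').
    Hypothetical helper for illustration; not part of cognee's API.
    """
    source, relationship, target = triplet
    return " -- ".join(
        [
            source.get("name", "?"),
            relationship.get("relationship_name", "?"),
            target.get("name", "?"),
        ]
    )


if __name__ == "__main__":
    example = (
        {"name": "natural language processing"},
        {"relationship_name": "is_a_subfield_of"},
        {"name": "computer science"},
    )
    print(format_insight(example))
    # prints: natural language processing -- is_a_subfield_of -- computer science
```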

Understand our architecture

The cognee framework consists of tasks that can be grouped into pipelines. Each task can be an independent piece of business logic that is chained with other tasks to form a pipeline. These tasks persist data into your memory store, enabling you to search for relevant context from past conversations, documents, or any other data you have stored.
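
To make the task and pipeline idea concrete, here is a generic sketch in plain Python. It does not use cognee's own task or pipeline classes; `extract_chunks`, `add_lengths`, and `run_pipeline` are hypothetical names, and the only assumption is that each task is an async function whose output feeds the next task.

```python
import asyncio


async def extract_chunks(text):
    """Task 1: split the text into sentence-like chunks."""
    return [part.strip() for part in text.split(".") if part.strip()]


async def add_lengths(chunks):
    """Task 2: turn each chunk into a small record with its length."""
    return [{"text": chunk, "length": len(chunk)} for chunk in chunks]


async def run_pipeline(tasks, data):
    """Run the tasks in order, passing each task's output to the next one."""
    for task in tasks:
        data = await task(data)
    return data


if __name__ == "__main__":
    records = asyncio.run(
        run_pipeline(
            [extract_chunks, add_lengths],
            "Cognee builds graphs. Tasks form pipelines.",
        )
    )
    for record in records:
        print(record)
```

Because each task only depends on the previous task's output, tasks can be reordered, replaced, or reused across pipelines.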

Vector retrieval, Graphs and LLMs

Cognee supports a variety of tools and services for different operations:

  • Modular: Cognee is modular by nature, using tasks grouped into pipelines.

  • Local Setup: By default, LanceDB runs locally with NetworkX and OpenAI.

  • Vector Stores: Cognee supports LanceDB, Qdrant, PGVector and Weaviate for vector storage.

  • Language Models (LLMs): Besides the default OpenAI setup, you can use either Anyscale or Ollama as your LLM provider.

  • Graph Stores: In addition to NetworkX, Neo4j is also supported for graph storage.

  • User management: Create individual user graphs and manage permissions.

Demo

Check out our demo notebook or watch the YouTube video.

Get Started

Install Server

Please see the cognee Quick Start Guide for important configuration information.

docker compose up

Install SDK

Please see the cognee Development Guide for important beta information and usage instructions.

pip install cognee

Vector & Graph Databases Implementation State

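| Name     | Type         | Current state (Mac/Linux) | Known Issues | Current state (Windows) | Known Issues |
|----------|--------------|---------------------------|--------------|-------------------------|--------------|
| Qdrant   | Vector       | Stable                    |              | Unstable                |              |
| Weaviate | Vector       | Stable                    |              | Unstable                |              |
| LanceDB  | Vector       | Stable                    |              | Stable                  |              |
| Neo4j    | Graph        | Stable                    |              | Stable                  |              |
| NetworkX | Graph        | Stable                    |              | Stable                  |              |
| FalkorDB | Vector/Graph | Stable                    |              | Unstable                |              |
| PGVector | Vector       | Stable                    |              | Unstable                |              |
| Milvus   | Vector       | Stable                    |              | Unstable                |              |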