cognee

hajdul88 d4d190ac2b feature: adds triplet embedding via memify (#1832)	2025-12-02 18:27:08 +01:00

Description

This PR introduces triplet embeddings via a new create_triplet_embeddings memify pipeline. The pipeline reads the graph in batches, extracts properties from graph elements based on their datapoint types, and generates combined triplet embeddings. These embeddings are stored in the vector database as a new collection.

Changes in this PR:

- Added a new create_triplet_embeddings memify pipeline.
- Added a new get_triplet_datapoints memify task.
- Introduced a new triplet_completion search type.
- Added full test coverage:
  - Unit tests: memify task, pipeline, and retriever
  - Integration tests: memify task, pipeline, and retriever
  - End-to-end tests: updated session history tests and multi-DB search tests; added tests for triplet_completion and memify pipeline execution

Acceptance criteria and testing

Scenario 1:

- Run the default add and cognify pipelines.
- Run the create triplet embeddings memify pipeline.
- Verify the vector DB contains a non-empty Triplet_text collection.
- Use the new triplet_completion search type and confirm it works correctly.

Scenario 2:

- Run the default add and cognify pipelines.
- Do not run the triplet embeddings memify pipeline.
- Attempt to use the triplet_completion search type.
- You should receive an error indicating that the triplet embeddings memify pipeline must be executed first.

Type of change: new feature (non-breaking change that adds functionality)

Summary by CodeRabbit

- New features
  - Triplet-based search with LLM-powered completions (TRIPLET_COMPLETION)
  - Batch triplet retrieval and a triplet embeddings pipeline for extraction, indexing, and optional background processing
  - Context retrieval from triplet embeddings with optional caching and conversation-history support
  - New Triplet data type exposed for indexing and search
- Examples
  - End-to-end example demonstrating triplet embeddings extraction and TRIPLET_COMPLETION search
- Tests
  - Unit and integration tests covering triplet extraction, retrieval, embedding pipeline, and completion flows

Co-authored-by: Pavel Zorin <pazonec@yandex.ru>
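The batching and text-assembly step described in the commit above can be pictured with a small, self-contained sketch. This is not cognee's implementation: the `Edge` type, the property-selection rule in `node_text`, and the `source -> relation -> target` text format are all assumptions made for illustration, and the embedding call itself is left out. The real task may select properties differently for each datapoint type.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Edge:
    """Hypothetical stand-in for one graph edge with its two endpoint nodes."""
    source: dict   # properties of the source node
    relation: str  # relationship name
    target: dict   # properties of the target node


def node_text(props: dict) -> str:
    # Assumed rule: use the first non-empty property among name, text, id.
    for key in ("name", "text", "id"):
        if props.get(key):
            return str(props[key])
    return ""


def triplet_text(edge: Edge) -> str:
    # Combine both endpoints and the relation into one string to embed.
    return f"{node_text(edge.source)} -> {edge.relation} -> {node_text(edge.target)}"


def batched(items: Iterable[Edge], size: int) -> Iterator[list[Edge]]:
    # Yield consecutive lists of at most `size` edges.
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def triplet_texts(edges: Iterable[Edge], batch_size: int = 100) -> Iterator[list[str]]:
    # One list of triplet strings per batch; each list would be sent to an
    # embedding model and stored in the vector database in a real pipeline.
    for batch in batched(edges, batch_size):
        yield [triplet_text(edge) for edge in batch]
```

Processing the graph in fixed-size batches keeps memory use bounded regardless of graph size, which is presumably why the pipeline reads the graph this way.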
.github	Increase the machine size	2025-12-01 15:23:59 +01:00
alembic	feat: enable multi user for falkor (#1689)	2025-11-11 17:03:48 +01:00
assets	chore: update cognee ui on readme	2025-09-11 11:05:18 +02:00
bin	Revert "Clean up core cognee repo"	2025-05-15 10:46:01 +02:00
cognee	feature: adds triplet embedding via memify (#1832)	2025-12-02 18:27:08 +01:00
cognee-frontend	COG-3050 - remove insights search (#1506)	2025-10-11 09:09:56 +02:00
cognee-mcp	Fix: MCP remove cognee.add() preprequisite from the doc	2025-11-13 17:35:16 +01:00
cognee-starter-kit	improve structure, readability	2025-09-04 16:20:36 +02:00
deployment	Fix/add async lock to all vector databases (#1244)	2025-08-14 15:57:34 +02:00
distributed	chore: deletes toml and lock files from distributed directory	2025-10-14 09:55:02 +02:00
evals	Deprecate `SearchType.INSIGHTS`, replace all references to default search type - `SearchType.GRAPH_COMPLETION`	2025-10-08 12:13:59 +01:00
examples	feature: adds triplet embedding via memify (#1832)	2025-12-02 18:27:08 +01:00
licenses	Revert "Clean up core cognee repo"	2025-05-15 10:46:01 +02:00
logs	refactor: Return logs folder	2025-10-29 16:31:42 +01:00
notebooks	rerun and update notebooks with latest cognee	2025-10-22 19:05:01 +01:00
tools	Revert "Clean up core cognee repo"	2025-05-15 10:46:01 +02:00
working_dir_error_replication	feat: Redis lock integration and Kuzu agentic access fix (#1504)	2025-10-16 15:48:20 +02:00
.coderabbit.yaml	coderabbit fix	2025-11-25 18:09:43 +01:00
.dockerignore	Revert "Clean up core cognee repo"	2025-05-15 10:46:01 +02:00
.env.template	fix: PR comment changes	2025-11-21 16:20:19 +01:00
.gitattributes	Merge dev with main (#921)	2025-06-07 07:48:47 -07:00
.gitguardian.yml	fix: Mcp improvements (#1114)	2025-07-24 21:52:16 +02:00
.gitignore	feat: add welcome tutorial notebook for new users (#1425)	2025-09-18 18:07:05 +02:00
.pre-commit-config.yaml	Feat: log pipeline status and pass it through pipeline [COG-1214] (#501)	2025-02-11 16:41:40 +01:00
.pylintrc	fix: enable sqlalchemy adapter	2024-08-04 22:23:28 +02:00
AGENTS.md	Add repository guidelines to AGENTS.md	2025-10-26 11:18:17 +01:00
alembic.ini	fix: Logger suppresion and database logs (#1041)	2025-07-03 20:08:27 +02:00
CODE_OF_CONDUCT.md	Update CODE_OF_CONDUCT.md	2024-12-13 11:30:16 +01:00
CONTRIBUTING.md	Merge main vol 4 (#1200)	2025-08-05 12:48:24 +02:00
CONTRIBUTORS.md	Merge with main (#892)	2025-05-30 23:13:04 +02:00
DCO.md	Create DCO.md	2024-12-13 11:28:44 +01:00
docker-compose.yml	added logs	2025-10-25 10:26:46 +02:00
Dockerfile	fix: Resolve issue with Kuzu graph database persistence on our local … (#1490)	2025-10-07 20:38:43 +02:00
entrypoint.sh	added logs	2025-10-25 10:26:46 +02:00
LICENSE	Update LICENSE	2024-03-30 11:57:07 +01:00
mypy.ini	fix: Remove weaviate (#1139)	2025-07-23 19:34:35 +02:00
NOTICE.md	add NOTICE file, reference CoC in contribution guidelines, add licenses folder for external licenses	2024-12-06 13:27:55 +00:00
poetry.lock	Fix distributed issues with latest pydantic version (#1859)	2025-12-02 16:08:26 +01:00
pyproject.toml	Fix distributed issues with latest pydantic version (#1859)	2025-12-02 16:08:26 +01:00
README.md	Correct typo in installation section of README	2025-10-25 13:25:16 +02:00
SECURITY.md	Merge main vol 2 (#967)	2025-06-11 09:28:41 -04:00
uv.lock	Fix distributed issues with latest pydantic version (#1859)	2025-12-02 16:08:26 +01:00

README.md

cognee - Memory for AI Agents in 6 lines of code

Demo · Learn more · Join Discord · Join r/AIMemory · Docs · cognee community repo

Build dynamic memory for Agents and replace RAG using scalable, modular ECL (Extract, Cognify, Load) pipelines.

Get Started

Get started quickly with a Google Colab notebook, a Deepnote notebook, or the starter repo.

About cognee

cognee works locally and stores your data on your device. Our hosted solution is just our deployment of OSS cognee on Modal, with the goal of making development and productionization easier.

Self-hosted package:

Interconnects any kind of documents: past conversations, files, images, and audio transcriptions
Replaces RAG systems with a memory layer based on graphs and vectors
Reduces developer effort and cost, while increasing quality and precision
Provides Pythonic data pipelines that manage data ingestion from 30+ data sources
Is highly customizable with custom tasks, pipelines, and a set of built-in search endpoints

Hosted platform:

Includes a managed UI and a hosted solution

Self-Hosted (Open Source)

📦 Installation

You can install Cognee using pip, poetry, uv, or any other Python package manager.

Cognee supports Python 3.10 to 3.12.
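If you want to confirm that your interpreter falls within the supported range before installing, a small check like the following can help. It is only a sketch: the `is_supported` helper is a hypothetical name, and the bounds are copied from the statement above rather than read from the package metadata.

```python
import sys

# Bounds taken from the README statement above (inclusive on both ends).
SUPPORTED_MIN = (3, 10)
SUPPORTED_MAX = (3, 12)


def is_supported(version=None) -> bool:
    """Return True if the (major, minor) version lies in the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if not is_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is outside 3.10 to 3.12")
```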

With uv

uv pip install cognee

Detailed instructions can be found in our docs

💻 Basic Usage

Setup

import os
os.environ["LLM_API_KEY"] = "YOUR OPENAI_API_KEY"

You can also set the variables by creating a .env file based on our template. To use a different LLM provider, see our documentation.
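To illustrate what a .env file amounts to, here is a minimal loader that reads KEY=VALUE lines into the process environment. It is a sketch for explanation only: `load_env_file` is a hypothetical helper, not part of cognee, and it ignores edge cases (multi-line values, escapes, inline comments) that a maintained library such as python-dotenv handles.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env", override: bool = False) -> dict:
    """Read KEY=VALUE lines from `path` into os.environ and return what was set."""
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        # Existing environment variables win unless override is requested.
        if override or key not in os.environ:
            os.environ[key] = value
            loaded[key] = value
    return loaded
```

A file containing the single line `LLM_API_KEY="your-key"` would then have the same effect as the `os.environ` assignment shown above.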

Simple example

Python

This script will run the default pipeline:

import cognee
import asyncio


async def main():
    # Add text to cognee
    await cognee.add("Cognee turns documents into AI memory.")

    # Generate the knowledge graph
    await cognee.cognify()

    # Add memory algorithms to the graph
    await cognee.memify()

    # Query the knowledge graph
    results = await cognee.search("What does cognee do?")

    # Display the results
    for result in results:
        print(result)


if __name__ == '__main__':
    asyncio.run(main())

Example output:

  Cognee turns documents into AI memory.
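The search call above relies on the default search type. A variation that selects the search type explicitly might look like the sketch below. It assumes that `SearchType` is importable from the `cognee` package and that `cognee.search` accepts `query_text` and `query_type` keyword arguments; check the documentation for your installed version. The `ask` and `ask_sync` helpers are hypothetical names used only here.

```python
import asyncio


async def ask(question: str, search_type_name: str = "GRAPH_COMPLETION"):
    # Imported lazily so this module can be loaded without cognee installed.
    import cognee
    from cognee import SearchType  # assumed import path

    # Look up the enum member by name, e.g. "GRAPH_COMPLETION".
    search_type = SearchType[search_type_name]
    # Keyword names are assumptions about the cognee.search signature.
    return await cognee.search(query_text=question, query_type=search_type)


def ask_sync(question: str, search_type_name: str = "GRAPH_COMPLETION"):
    # Convenience wrapper for scripts that are not already running an event loop.
    return asyncio.run(ask(question, search_type_name))
```

According to the commit description at the top of this page, the triplet-based search type raises an error unless the triplet embeddings memify pipeline has been run first, so it would not work as a drop-in value for `search_type_name` after `add` and `cognify` alone.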

Via CLI

Let's get the basics covered

cognee-cli add "Cognee turns documents into AI memory."

cognee-cli cognify

cognee-cli search "What does cognee do?"
cognee-cli delete --all

or run

cognee-cli -ui

Hosted Platform

Get up and running in minutes with automatic updates, analytics, and enterprise security.

Sign up on Cogwit
Add your API key to the local UI and sync your data to Cogwit

Demos

Cogwit Beta demo

Simple GraphRAG demo

cognee with Ollama (local models)

Contributing

Your contributions are at the core of making this a true open source project. Any contributions you make are greatly appreciated. See CONTRIBUTING.md for more information.

Code of Conduct

We are committed to making open source an enjoyable and respectful experience for our community. See CODE_OF_CONDUCT for more information.

Citation

We now have a paper you can cite:

@misc{markovic2025optimizinginterfaceknowledgegraphs,
      title={Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning},
      author={Vasilije Markovic and Lazar Obradovic and Laszlo Hajdu and Jovan Pavlovic},
      year={2025},
      eprint={2505.24478},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2505.24478},
}