Main merge vol4 (#1856)

<!-- .github/pull_request_template.md -->

## Description
Merge main changes into dev branch

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
* Overhauled README: renamed tagline, clarified product positioning,
reorganized Get Started into Open Source and Cloud paths, streamlined
Quickstart, refreshed demos, navigation, and citation/contributing
sections.

* **New Features**
* Added MCP tools for developer rules management, interaction logging,
and enhanced search modes (SUMMARIES, CYPHER, FEELING_LUCKY).

* **Bug Fixes**
  * Improved embedding handling to support alternate response formats.

* **Chores**
  * Updated test/dev dependency versions.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
This commit is contained in:
Igor Ilic 2025-12-04 17:48:11 +01:00 committed by GitHub
commit 4afde917a9
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
5 changed files with 117 additions and 95 deletions

20
.github/release-drafter.yml vendored Normal file
View file

@ -0,0 +1,20 @@
name-template: 'v$NEXT_PATCH_VERSION'
tag-template: 'v$NEXT_PATCH_VERSION'
categories:
- title: 'Features'
labels: ['feature', 'enhancement']
- title: 'Bug Fixes'
labels: ['bug', 'fix']
- title: 'Maintenance'
labels: ['chore', 'refactor', 'ci']
change-template: '- $TITLE (#$NUMBER) @$AUTHOR'
template: |
## Whats Changed
$CHANGES
## Contributors
$CONTRIBUTORS

9
.mergify.yml Normal file
View file

@ -0,0 +1,9 @@
pull_request_rules:
- name: Backport to main when backport_main label is set
conditions:
- label=backport_main
- base=dev
actions:
backport:
branches:
- main

164
README.md
View file

@ -5,27 +5,27 @@
<br />
cognee - Memory for AI Agents in 6 lines of code
Cognee - Accurate and Persistent AI Memory
<p align="center">
<a href="https://www.youtube.com/watch?v=1bezuvLwJmw&t=2s">Demo</a>
.
<a href="https://cognee.ai">Learn more</a>
<a href="https://docs.cognee.ai/">Docs</a>
.
<a href="https://cognee.ai">Learn More</a>
·
<a href="https://discord.gg/NQPKmU5CCg">Join Discord</a>
·
<a href="https://www.reddit.com/r/AIMemory/">Join r/AIMemory</a>
.
<a href="https://docs.cognee.ai/">Docs</a>
.
<a href="https://github.com/topoteretes/cognee-community">cognee community repo</a>
<a href="https://github.com/topoteretes/cognee-community">Community Plugins & Add-ons</a>
</p>
[![GitHub forks](https://img.shields.io/github/forks/topoteretes/cognee.svg?style=social&label=Fork&maxAge=2592000)](https://GitHub.com/topoteretes/cognee/network/)
[![GitHub stars](https://img.shields.io/github/stars/topoteretes/cognee.svg?style=social&label=Star&maxAge=2592000)](https://GitHub.com/topoteretes/cognee/stargazers/)
[![GitHub commits](https://badgen.net/github/commits/topoteretes/cognee)](https://GitHub.com/topoteretes/cognee/commit/)
[![Github tag](https://badgen.net/github/tag/topoteretes/cognee)](https://github.com/topoteretes/cognee/tags/)
[![GitHub tag](https://badgen.net/github/tag/topoteretes/cognee)](https://github.com/topoteretes/cognee/tags/)
[![Downloads](https://static.pepy.tech/badge/cognee)](https://pepy.tech/project/cognee)
[![License](https://img.shields.io/github/license/topoteretes/cognee?colorA=00C586&colorB=000000)](https://github.com/topoteretes/cognee/blob/main/LICENSE)
[![Contributors](https://img.shields.io/github/contributors/topoteretes/cognee?colorA=00C586&colorB=000000)](https://github.com/topoteretes/cognee/graphs/contributors)
@ -41,11 +41,7 @@
</a>
</p>
Build dynamic memory for Agents and replace RAG using scalable, modular ECL (Extract, Cognify, Load) pipelines.
Use your data to build personalized and dynamic memory for AI Agents. Cognee lets you replace RAG with scalable and modular ECL (Extract, Cognify, Load) pipelines.
<p align="center">
🌐 Available Languages
@ -53,7 +49,7 @@ Build dynamic memory for Agents and replace RAG using scalable, modular ECL (Ext
<!-- Keep these links. Translations will automatically update with the README. -->
<a href="https://www.readme-i18n.com/topoteretes/cognee?lang=de">Deutsch</a> |
<a href="https://www.readme-i18n.com/topoteretes/cognee?lang=es">Español</a> |
<a href="https://www.readme-i18n.com/topoteretes/cognee?lang=fr">français</a> |
<a href="https://www.readme-i18n.com/topoteretes/cognee?lang=fr">Français</a> |
<a href="https://www.readme-i18n.com/topoteretes/cognee?lang=ja">日本語</a> |
<a href="https://www.readme-i18n.com/topoteretes/cognee?lang=ko">한국어</a> |
<a href="https://www.readme-i18n.com/topoteretes/cognee?lang=pt">Português</a> |
@ -67,69 +63,65 @@ Build dynamic memory for Agents and replace RAG using scalable, modular ECL (Ext
</div>
</div>
## About Cognee
Cognee is an open-source tool and platform that transforms your raw data into persistent and dynamic AI memory for Agents. It combines vector search with graph databases to make your documents both searchable by meaning and connected by relationships.
## Get Started
You can use Cognee in two ways:
Get started quickly with a Google Colab <a href="https://colab.research.google.com/drive/12Vi9zID-M3fpKpKiaqDBvkk98ElkRPWy?usp=sharing">notebook</a> , <a href="https://deepnote.com/workspace/cognee-382213d0-0444-4c89-8265-13770e333c02/project/cognee-demo-78ffacb9-5832-4611-bb1a-560386068b30/notebook/Notebook-1-75b24cda566d4c24ab348f7150792601?utm_source=share-modal&utm_medium=product-shared-content&utm_campaign=notebook&utm_content=78ffacb9-5832-4611-bb1a-560386068b30">Deepnote notebook</a> or <a href="https://github.com/topoteretes/cognee/tree/main/cognee-starter-kit">starter repo</a>
1. [Self-host Cognee Open Source](https://docs.cognee.ai/getting-started/installation), which stores all data locally by default.
2. [Connect to Cognee Cloud](https://platform.cognee.ai/), and get the same OSS stack on managed infrastructure for easier development and productionization.
### Cognee Open Source (self-hosted):
## About cognee
- Interconnects any type of data — including past conversations, files, images, and audio transcriptions
- Replaces traditional RAG systems with a unified memory layer built on graphs and vectors
- Reduces developer effort and infrastructure cost while improving quality and precision
- Provides Pythonic data pipelines for ingestion from 30+ data sources
- Offers high customizability through user-defined tasks, modular pipelines, and built-in search endpoints
cognee works locally and stores your data on your device.
Our hosted solution is just our deployment of OSS cognee on Modal, with the goal of making development and productionization easier.
### Cognee Cloud (managed):
- Hosted web UI dashboard
- Automatic version updates
- Resource usage analytics
- GDPR compliant, enterprise-grade security
Self-hosted package:
## Basic Usage & Feature Guide
- Interconnects any kind of documents: past conversations, files, images, and audio transcriptions
- Replaces RAG systems with a memory layer based on graphs and vectors
- Reduces developer effort and cost, while increasing quality and precision
- Provides Pythonic data pipelines that manage data ingestion from 30+ data sources
- Is highly customizable with custom tasks, pipelines, and a set of built-in search endpoints
To learn more, [check out this short, end-to-end Colab walkthrough](https://colab.research.google.com/drive/12Vi9zID-M3fpKpKiaqDBvkk98ElkRPWy?usp=sharing) of Cognee's core features.
Hosted platform:
- Includes a managed UI and a [hosted solution](https://www.cognee.ai)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/12Vi9zID-M3fpKpKiaqDBvkk98ElkRPWy?usp=sharing)
## Quickstart
Lets try Cognee in just a few lines of code. For detailed setup and configuration, see the [Cognee Docs](https://docs.cognee.ai/getting-started/installation#environment-configuration).
## Self-Hosted (Open Source)
### Prerequisites
- Python 3.10 to 3.13
### 📦 Installation
### Step 1: Install Cognee
You can install Cognee using either **pip**, **poetry**, **uv** or any other python package manager..
Cognee supports Python 3.10 to 3.12
#### With uv
You can install Cognee with **pip**, **poetry**, **uv**, or your preferred Python package manager.
```bash
uv pip install cognee
```
Detailed instructions can be found in our [docs](https://docs.cognee.ai/getting-started/installation#environment-configuration)
### 💻 Basic Usage
#### Setup
```
### Step 2: Configure the LLM
```python
import os
os.environ["LLM_API_KEY"] = "YOUR OPENAI_API_KEY"
```
Alternatively, create a `.env` file using our [template](https://github.com/topoteretes/cognee/blob/main/.env.template).
You can also set the variables by creating .env file, using our <a href="https://github.com/topoteretes/cognee/blob/main/.env.template">template.</a>
To use different LLM providers, for more info check out our <a href="https://docs.cognee.ai/setup-configuration/llm-providers">documentation</a>
To integrate other LLM providers, see our [LLM Provider Documentation](https://docs.cognee.ai/setup-configuration/llm-providers).
### Step 3: Run the Pipeline
#### Simple example
Cognee will take your documents, generate a knowledge graph from them and then query the graph based on combined relationships.
##### Python
This script will run the default pipeline:
Now, run a minimal pipeline:
```python
import cognee
@ -147,7 +139,7 @@ async def main():
await cognee.memify()
# Query the knowledge graph
results = await cognee.search("What does cognee do?")
results = await cognee.search("What does Cognee do?")
# Display the results
for result in results:
@ -158,69 +150,61 @@ if __name__ == '__main__':
asyncio.run(main())
```
Example output:
```
As you can see, the output is generated from the document we previously stored in Cognee:
```bash
Cognee turns documents into AI memory.
```
##### Via CLI
Let's get the basics covered
### Use the Cognee CLI
```
As an alternative, you can get started with these essential commands:
```bash
cognee-cli add "Cognee turns documents into AI memory."
cognee-cli cognify
cognee-cli search "What does cognee do?"
cognee-cli search "What does Cognee do?"
cognee-cli delete --all
```
or run
```
To open the local UI, run:
```bash
cognee-cli -ui
```
## Demos & Examples
</div>
See Cognee in action:
### Persistent Agent Memory
[Cognee Memory for LangGraph Agents](https://github.com/user-attachments/assets/e113b628-7212-4a2b-b288-0be39a93a1c3)
### Simple GraphRAG
[Watch Demo](https://github.com/user-attachments/assets/f2186b2e-305a-42b0-9c2d-9f4473f15df8)
### Cognee with Ollama
[Watch Demo](https://github.com/user-attachments/assets/39672858-f774-4136-b957-1e2de67b8981)
### Hosted Platform
## Community & Support
Get up and running in minutes with automatic updates, analytics, and enterprise security.
### Contributing
We welcome contributions from the community! Your input helps make Cognee better for everyone. See [`CONTRIBUTING.md`](CONTRIBUTING.md) to get started.
1. Sign up on [cogwit](https://www.cognee.ai)
2. Add your API key to local UI and sync your data to Cogwit
### Code of Conduct
We're committed to fostering an inclusive and respectful community. Read our [Code of Conduct](https://github.com/topoteretes/cognee/blob/main/CODE_OF_CONDUCT.md) for guidelines.
## Research & Citation
## Demos
1. Cogwit Beta demo:
[Cogwit Beta](https://github.com/user-attachments/assets/fa520cd2-2913-4246-a444-902ea5242cb0)
2. Simple GraphRAG demo
[Simple GraphRAG demo](https://github.com/user-attachments/assets/d80b0776-4eb9-4b8e-aa22-3691e2d44b8f)
3. cognee with Ollama
[cognee with local models](https://github.com/user-attachments/assets/8621d3e8-ecb8-4860-afb2-5594f2ee17db)
## Contributing
Your contributions are at the core of making this a true open source project. Any contributions you make are **greatly appreciated**. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for more information.
## Code of Conduct
We are committed to making open source an enjoyable and respectful experience for our community. See <a href="https://github.com/topoteretes/cognee/blob/main/CODE_OF_CONDUCT.md"><code>CODE_OF_CONDUCT</code></a> for more information.
## Citation
We now have a paper you can cite:
We recently published a research paper on optimizing knowledge graphs for LLM reasoning:
```bibtex
@misc{markovic2025optimizinginterfaceknowledgegraphs,

View file

@ -445,16 +445,22 @@ The MCP server exposes its functionality through tools. Call them from any MCP c
- **cognify**: Turns your data into a structured knowledge graph and stores it in memory
- **cognee_add_developer_rules**: Ingest core developer rule files into memory
- **codify**: Analyse a code repository, build a code graph, stores it in memory
- **search**: Query memory supports GRAPH_COMPLETION, RAG_COMPLETION, CODE, CHUNKS
- **list_data**: List all datasets and their data items with IDs for deletion operations
- **delete**: Delete specific data from a dataset (supports soft/hard deletion modes)
- **get_developer_rules**: Retrieve all developer rules that were generated based on previous interactions
- **list_data**: List all datasets and their data items with IDs for deletion operations
- **save_interaction**: Logs user-agent interactions and query-answer pairs
- **prune**: Reset cognee for a fresh start (removes all data)
- **search**: Query memory supports GRAPH_COMPLETION, RAG_COMPLETION, CODE, CHUNKS, SUMMARIES, CYPHER, and FEELING_LUCKY
- **cognify_status / codify_status**: Track pipeline progress
**Data Management Examples:**

View file

@ -124,7 +124,10 @@ class OllamaEmbeddingEngine(EmbeddingEngine):
self.endpoint, json=payload, headers=headers, timeout=60.0
) as response:
data = await response.json()
return data["embeddings"][0]
if "embeddings" in data:
return data["embeddings"][0]
else:
return data["data"][0]["embedding"]
def get_vector_size(self) -> int:
"""