Compare commits

...
Sign in to create a new pull request.

79 commits

Author SHA1 Message Date
Vasilije
a505eef95b
Enhance README with Cognee description
Added a brief description of Cognee's features and capabilities.
2026-01-17 21:00:31 +00:00
Vasilije
8cd3aab1ef
Clarify Cognee Open Source and Cloud descriptions
Updated the README to clarify the differences between Cognee Open Source and Cognee Cloud offerings.
2026-01-17 20:58:55 +00:00
Vasilije
8a96a351e2
chore(deps): bump the npm_and_yarn group across 1 directory with 2 updates (#1974)
Bumps the npm_and_yarn group with 2 updates in the /cognee-frontend
directory: [next](https://github.com/vercel/next.js) and
[preact](https://github.com/preactjs/preact).

Updates `next` from 16.0.4 to 16.1.1
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/vercel/next.js/releases">next's
releases</a>.</em></p>
<blockquote>
<h2>v16.1.1</h2>
<blockquote>
<p>[!NOTE]
This release is backporting bug fixes. It does <strong>not</strong>
include all pending features/changes on canary.</p>
</blockquote>
<h3>Core Changes</h3>
<ul>
<li>Turbopack: Create junction points instead of symlinks on Windows (<a
href="https://redirect.github.com/vercel/next.js/issues/87606">#87606</a>)</li>
</ul>
<h3>Credits</h3>
<p>Huge thanks to <a
href="https://github.com/sokra"><code>@​sokra</code></a> and <a
href="https://github.com/ztanner"><code>@​ztanner</code></a> for
helping!</p>
<h2>v16.1.1-canary.16</h2>
<h3>Core Changes</h3>
<ul>
<li>Add maximum size limit for postponed body parsing: <a
href="https://redirect.github.com/vercel/next.js/issues/88175">#88175</a></li>
<li>metadata: use fixed segment in dynamic routes with static metadata
files: <a
href="https://redirect.github.com/vercel/next.js/issues/88113">#88113</a></li>
<li>feat: add --experimental-cpu-prof flag for dev, build, and start: <a
href="https://redirect.github.com/vercel/next.js/issues/87946">#87946</a></li>
<li>Add experimental option to use no-cache instead of no-store in dev:
<a
href="https://redirect.github.com/vercel/next.js/issues/88182">#88182</a></li>
</ul>
<h3>Misc Changes</h3>
<ul>
<li>fix: move conductor.json to repo root for proper detection: <a
href="https://redirect.github.com/vercel/next.js/issues/88184">#88184</a></li>
<li>Turbopack: Update to swc_core v50.2.3: <a
href="https://redirect.github.com/vercel/next.js/issues/87841">#87841</a></li>
<li>Update generateMetadata in client component error: <a
href="https://redirect.github.com/vercel/next.js/issues/88172">#88172</a></li>
<li>ci: run stats on canary pushes for historical trend tracking: <a
href="https://redirect.github.com/vercel/next.js/issues/88157">#88157</a></li>
<li>perf: improve stats thresholds to reduce CI noise: <a
href="https://redirect.github.com/vercel/next.js/issues/88158">#88158</a></li>
</ul>
<h3>Credits</h3>
<p>Huge thanks to <a
href="https://github.com/wyattjoh"><code>@​wyattjoh</code></a>, <a
href="https://github.com/bgw"><code>@​bgw</code></a>, <a
href="https://github.com/timneutkens"><code>@​timneutkens</code></a>, <a
href="https://github.com/feedthejim"><code>@​feedthejim</code></a>, and
<a href="https://github.com/huozhi"><code>@​huozhi</code></a> for
helping!</p>
<h2>v16.1.1-canary.15</h2>
<h3>Core Changes</h3>
<ul>
<li>add compilation error for taint when not enabled: <a
href="https://redirect.github.com/vercel/next.js/issues/88173">#88173</a></li>
<li>feat(next/image)!: add <code>images.maximumResponseBody</code>
config: <a
href="https://redirect.github.com/vercel/next.js/issues/88183">#88183</a></li>
</ul>
<h3>Misc Changes</h3>
<ul>
<li>Update Rspack production test manifest: <a
href="https://redirect.github.com/vercel/next.js/issues/88137">#88137</a></li>
<li>Update Rspack development test manifest: <a
href="https://redirect.github.com/vercel/next.js/issues/88138">#88138</a></li>
<li>Fix compile error when running next-custom-transform tests: <a
href="https://redirect.github.com/vercel/next.js/issues/83715">#83715</a></li>
<li>chore: add Conductor configuration for parallel development: <a
href="https://redirect.github.com/vercel/next.js/issues/88116">#88116</a></li>
<li>[docs] add get_routes in mcp available tools: <a
href="https://redirect.github.com/vercel/next.js/issues/88181">#88181</a></li>
</ul>
<h3>Credits</h3>
<p>Huge thanks to <a
href="https://github.com/vercel-release-bot"><code>@​vercel-release-bot</code></a>,
<a href="https://github.com/ztanner"><code>@​ztanner</code></a>, <a
href="https://github.com/mischnic"><code>@​mischnic</code></a>, <a
href="https://github.com/wyattjoh"><code>@​wyattjoh</code></a>, <a
href="https://github.com/huozhi"><code>@​huozhi</code></a>, and <a
href="https://github.com/styfle"><code>@​styfle</code></a> for
helping!</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="3aa53984e9"><code>3aa5398</code></a>
v16.1.1</li>
<li><a
href="d1bd5b5810"><code>d1bd5b5</code></a>
Turbopack: Create junction points instead of symlinks on Windows (<a
href="https://redirect.github.com/vercel/next.js/issues/87606">#87606</a>)</li>
<li><a
href="a67ee72788"><code>a67ee72</code></a>
setup release branch</li>
<li><a
href="34916762cd"><code>3491676</code></a>
v16.1.0</li>
<li><a
href="58e8f8c7e5"><code>58e8f8c</code></a>
v16.1.0-canary.34</li>
<li><a
href="8a8a00d5d0"><code>8a8a00d</code></a>
Revert &quot;Move next-env.d.ts to dist dir&quot; (<a
href="https://redirect.github.com/vercel/next.js/issues/87311">#87311</a>)</li>
<li><a
href="3284587f8e"><code>3284587</code></a>
v16.1.0-canary.33</li>
<li><a
href="25da5f0426"><code>25da5f0</code></a>
Move next-env.d.ts to dist dir (<a
href="https://redirect.github.com/vercel/next.js/issues/86752">#86752</a>)</li>
<li><a
href="aa8a243e72"><code>aa8a243</code></a>
feat: use Rspack persistent cache by default (<a
href="https://redirect.github.com/vercel/next.js/issues/81399">#81399</a>)</li>
<li><a
href="754db28e52"><code>754db28</code></a>
bundle analyzer: remove geist font in favor of system ui fonts (<a
href="https://redirect.github.com/vercel/next.js/issues/87292">#87292</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/vercel/next.js/compare/v16.0.4...v16.1.1">compare
view</a></li>
</ul>
</details>
<br />

Updates `preact` from 10.27.2 to 10.28.2
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/preactjs/preact/releases">preact's
releases</a>.</em></p>
<blockquote>
<h2>10.28.2</h2>
<h2>Fixes</h2>
<ul>
<li>Enforce strict equality for VNode object constructors</li>
</ul>
<h2>10.28.1</h2>
<h2>Fixes</h2>
<ul>
<li>Fix erroneous diffing w/ growing list (<a
href="https://redirect.github.com/preactjs/preact/issues/4975">#4975</a>,
thanks <a
href="https://github.com/JoviDeCroock"><code>@​JoviDeCroock</code></a>)</li>
</ul>
<h2>10.28.0</h2>
<h2>Types</h2>
<ul>
<li>Updates dangerouslySetInnerHTML type so future TS will accept
Trusted… (<a
href="https://redirect.github.com/preactjs/preact/issues/4931">#4931</a>,
thanks <a
href="https://github.com/lukewarlow"><code>@​lukewarlow</code></a>)</li>
<li>Adds snap events (<a
href="https://redirect.github.com/preactjs/preact/issues/4947">#4947</a>,
thanks <a
href="https://github.com/argyleink"><code>@​argyleink</code></a>)</li>
<li>Remove missed jsx duplicates (<a
href="https://redirect.github.com/preactjs/preact/issues/4950">#4950</a>,
thanks <a
href="https://github.com/rschristian"><code>@​rschristian</code></a>)</li>
<li>Fix scroll events (<a
href="https://redirect.github.com/preactjs/preact/issues/4949">#4949</a>,
thanks <a
href="https://github.com/rschristian"><code>@​rschristian</code></a>)</li>
</ul>
<h2>Fixes</h2>
<ul>
<li>Fix cascading renders with signals (<a
href="https://redirect.github.com/preactjs/preact/issues/4966">#4966</a>,
thanks <a
href="https://github.com/JoviDeCroock"><code>@​JoviDeCroock</code></a>)</li>
<li>add <code>commpat/server.browser</code> entry (<a
href="https://redirect.github.com/preactjs/preact/issues/4941">#4941</a>
&amp; <a
href="https://redirect.github.com/preactjs/preact/issues/4940">#4940</a>,
thanks <a
href="https://github.com/marvinhagemeister"><code>@​marvinhagemeister</code></a>)</li>
<li>Avoid lazy components without result going in throw loop (<a
href="https://redirect.github.com/preactjs/preact/issues/4937">#4937</a>,
thanks <a
href="https://github.com/JoviDeCroock"><code>@​JoviDeCroock</code></a>)</li>
</ul>
<h2>Performance</h2>
<ul>
<li>Backport some v11 optimizations (<a
href="https://redirect.github.com/preactjs/preact/issues/4967">#4967</a>,
thanks <a
href="https://github.com/JoviDeCroock"><code>@​JoviDeCroock</code></a>)</li>
</ul>
<h2>10.27.3</h2>
<h2>Fixes</h2>
<ul>
<li>Enforce strict equality for VNode object constructors</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="6f914464b3"><code>6f91446</code></a>
10.28.2</li>
<li><a
href="37c3e030ab"><code>37c3e03</code></a>
Strict equality check on constructor (<a
href="https://redirect.github.com/preactjs/preact/issues/4985">#4985</a>)</li>
<li><a
href="6670a4a70b"><code>6670a4a</code></a>
chore: Adjust TS linting setup (<a
href="https://redirect.github.com/preactjs/preact/issues/4982">#4982</a>)</li>
<li><a
href="2af522b2c2"><code>2af522b</code></a>
10.28.1 (<a
href="https://redirect.github.com/preactjs/preact/issues/4978">#4978</a>)</li>
<li><a
href="f7693b72ec"><code>f7693b7</code></a>
Fix erroneous diffing w/ growing list (<a
href="https://redirect.github.com/preactjs/preact/issues/4975">#4975</a>)</li>
<li><a
href="b36b6a7148"><code>b36b6a7</code></a>
10.28.0 (<a
href="https://redirect.github.com/preactjs/preact/issues/4968">#4968</a>)</li>
<li><a
href="4d40e96f43"><code>4d40e96</code></a>
Backport some v11 optimizations (<a
href="https://redirect.github.com/preactjs/preact/issues/4967">#4967</a>)</li>
<li><a
href="7b74b406e2"><code>7b74b40</code></a>
Fix cascading renders with signals (<a
href="https://redirect.github.com/preactjs/preact/issues/4966">#4966</a>)</li>
<li><a
href="3ab5c6fbbb"><code>3ab5c6f</code></a>
Updates dangerouslySetInnerHTML type so future TS will accept Trusted…
(<a
href="https://redirect.github.com/preactjs/preact/issues/4931">#4931</a>)</li>
<li><a
href="ff30c2b5c4"><code>ff30c2b</code></a>
Adds snap events (<a
href="https://redirect.github.com/preactjs/preact/issues/4947">#4947</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/preactjs/preact/compare/10.27.2...10.28.2">compare
view</a></li>
</ul>
</details>
<br />


Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/topoteretes/cognee/network/alerts).

</details>
2026-01-08 15:55:56 +01:00
Vasilije
39613997d6
docs: clarify dev branching and fix contributing text (#1976)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
* Include instructions on how to verify the changes. Describe how to
test it locally;
* Proof that it's sufficiently tested.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [x] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2026-01-08 15:53:18 +01:00
Vasilije
a5fc6165c1
refactor: Use same default_k value in MCP as for Cognee (#1977)
<!-- .github/pull_request_template.md -->

Set default top_k value for MCP to be the same as the Cognee default
top_k value

## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
* Include instructions on how to verify the changes. Describe how to
test it locally;
* Proof that it's sufficiently tested.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Improvements**
* Search operations now return 10 results by default instead of 5,
providing more comprehensive search results.

* **Style**
* Minor internal formatting cleanup in the client path with no
user-visible behavior changes.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-08 15:52:15 +01:00
Igor Ilic
69fe35bdee refactor: add ruff formatting 2026-01-08 13:32:15 +01:00
Igor Ilic
be738df88a refactor: Use same default_k value in MCP as for Cognee 2026-01-08 12:47:42 +01:00
Babar Ali
01a39dff22 docs: clarify dev branching and fix contributing text
Signed-off-by: Babar Ali <148423037+Babarali2k21@users.noreply.github.com>
2026-01-08 10:15:42 +01:00
dependabot[bot]
53f96f3e29
chore(deps): bump the npm_and_yarn group across 1 directory with 2 updates
Bumps the npm_and_yarn group with 2 updates in the /cognee-frontend directory: [next](https://github.com/vercel/next.js) and [preact](https://github.com/preactjs/preact).


Updates `next` from 16.0.4 to 16.1.1
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/compare/v16.0.4...v16.1.1)

Updates `preact` from 10.27.2 to 10.28.2
- [Release notes](https://github.com/preactjs/preact/releases)
- [Commits](https://github.com/preactjs/preact/compare/10.27.2...10.28.2)

---
updated-dependencies:
- dependency-name: next
  dependency-version: 16.1.1
  dependency-type: direct:production
  dependency-group: npm_and_yarn
- dependency-name: preact
  dependency-version: 10.28.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-01-07 19:36:40 +00:00
Vasilije
b339529621
Fix: Add top_k parameter support to MCP search tool (#1954)
## Problem

The MCP search wrapper doesn't expose the `top_k` parameter, causing
critical performance and usability issues:

- **Unlimited result returns**: CHUNKS search returns 113KB+ responses
with hundreds of text chunks
- **Extreme performance degradation**: GRAPH_COMPLETION takes 30+
seconds to complete
- **Context window exhaustion**: Responses quickly consume the entire
context budget
- **Production unusability**: Search functionality is impractical for
real-world MCP client usage

### Root Cause

The MCP tool definition in `server.py` doesn't expose the `top_k`
parameter that exists in the underlying `cognee.search()` API.
Additionally, `cognee_client.py` ignores the parameter in direct mode
(line 194).

## Solution

This PR adds proper `top_k` parameter support throughout the MCP call
chain:

### Changes

1. **server.py (line 319)**: Add `top_k: int = 5` parameter to MCP
`search` tool
2. **server.py (line 428)**: Update `search_task` signature to accept
`top_k`
3. **server.py (line 433)**: Pass `top_k` to `cognee_client.search()`
4. **server.py (line 468)**: Pass `top_k` to `search_task` call
5. **cognee_client.py (line 194)**: Forward `top_k` parameter to
`cognee.search()`

### Parameter Flow
```
MCP Client (Claude Code, etc.)
    ↓ search(query, type, top_k=5)
server.py::search()
    ↓
server.py::search_task()
    ↓
cognee_client.search()
    ↓
cognee.search()  ← Core library
```

## Impact

### Performance Improvements

| Metric | Before | After (top_k=5) | Improvement |
|--------|--------|-----------------|-------------|
| Response Size (CHUNKS) | 113KB+ | ~3KB | 97% reduction |
| Response Size (GRAPH_COMPLETION) | 100KB+ | ~5KB | 95% reduction |
| Latency (GRAPH_COMPLETION) | 30+ seconds | 2-5 seconds | 80-90% faster
|
| Context Window Usage | Rapidly exhausted | Sustainable | Dramatic
improvement |

### User Control

Users can now control result granularity:
- `top_k=3` - Quick answers, minimal context
- `top_k=5` - Balanced (default)
- `top_k=10` - More comprehensive
- `top_k=20` - Maximum context (still reasonable)

### Backward Compatibility

 **Fully backward compatible**
- Default `top_k=5` maintains sensible behavior
- Existing MCP clients work without changes
- No breaking API changes

## Testing

### Code Review
-  All function signatures updated correctly
-  Parameter properly threaded through call chain
-  Default value provides sensible behavior
-  No syntax errors or type issues

### Production Usage
-  Patches in production use since issue discovery
-  Confirmed dramatic performance improvements
-  Successfully tested with CHUNKS, GRAPH_COMPLETION, and
RAG_COMPLETION search types
-  Vertex AI backend compatibility validated

## Additional Context

This issue particularly affects users of:
- Non-OpenAI LLM backends (Vertex AI, Claude, etc.)
- Production MCP deployments
- Context-sensitive applications (Claude Code, etc.)

The fix enables Cognee MCP to be practically usable in production
environments where context window management and response latency are
critical.

---

**Generated with** [Claude Code](https://claude.com/claude-code)

**Co-Authored-By**: Claude Sonnet 4.5 <noreply@anthropic.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Search queries now support a configurable results limit (defaults to
5), letting users control how many results are returned.
* The results limit is consistently applied across search modes so
returned results match the requested maximum.

* **Documentation**
* Clarified description of the results limit and its impact on result
sets and context usage.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-03 10:57:10 +01:00
AnveshJarabani
6a5ba70ced
docs: Add comprehensive docstrings and fix default top_k consistency
Address PR feedback from CodeRabbit AI:
- Add detailed docstring for search_task internal function
- Document top_k parameter in main search function docstring
- Fix default top_k inconsistency (was 10 in client, now 5 everywhere)
- Clarify performance implications of different top_k values

Changes:
- server.py: Add top_k parameter documentation and search_task docstring
- cognee_client.py: Change default top_k from 10 to 5 for consistency

This ensures consistent behavior across the MCP call chain and
provides clear guidance for users on choosing appropriate top_k values.
2026-01-03 01:33:13 -06:00
AnveshJarabani
7ee36f883b
Fix: Add top_k parameter support to MCP search tool
## Problem
The MCP search wrapper doesn't expose the top_k parameter, causing:
- Unlimited result returns (113KB+ responses)
- Extremely slow search performance (30+ seconds for GRAPH_COMPLETION)
- Context window exhaustion in production use

## Solution
1. Add top_k parameter (default=5) to MCP search tool in server.py
2. Thread parameter through search_task internal function
3. Forward top_k to cognee_client.search() call
4. Update cognee_client.py to pass top_k to core cognee.search()

## Impact
- **Performance**: 97% reduction in response size (113KB → 3KB)
- **Latency**: 80-90% faster (30s → 2-5s for GRAPH_COMPLETION)
- **Backward Compatible**: Default top_k=5 maintains existing behavior
- **User Control**: Configurable from top_k=3 (quick) to top_k=20 (comprehensive)

## Testing
-  Code review validates proper parameter threading
-  Backward compatible (default value ensures no breaking changes)
-  Production usage confirms performance improvements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-03 01:27:16 -06:00
Vasilije
5b42b21af5
Enhance CONTRIBUTING.md with example setup instructions
Added instructions for running a simple example and setting up the environment.
2025-12-29 18:00:08 +01:00
Vasilije
1061258fde
Fix Python 3.12 SyntaxError caused by JS regex escape sequences (#1934)
### Fix Python 3.12 SyntaxError caused by invalid escape sequences

#### Problem
Importing `cognee` on Python 3.12+ raises:

SyntaxError: invalid escape sequence '\s'

This occurs because JavaScript regex patterns (e.g. `/\s+/g`) are
embedded
inside a regular Python triple-quoted string, which Python 3.12 strictly
validates.

#### Solution
Converted the HTML template string to a raw string (`r"""`) so Python
does not
interpret JavaScript regex escape sequences.

#### Files Changed
- cognee/modules/visualization/cognee_network_visualization.py

Fixes #1929


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Improved stability of the network visualization module through
enhanced HTML template handling.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-23 14:23:25 +01:00
Uday Gupta
7019a91f7c Fix Python 3.12 SyntaxError caused by JS regex escape sequences 2025-12-23 15:51:07 +05:30
Vasilije
e0644285d4
fix: Resolve issue with migrations for docker (#1932)
<!-- .github/pull_request_template.md -->

## Description
Try database creation if migrations can't run in Cognee

Until we rework the way migrations work in Cognee we can't create the
database first and then run migrations so this is the best we can do
until then

This will unfortunately eat all potential migrations issues, we should
make it work without eating exceptions when we rework migrations

## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
* Include instructions on how to verify the changes. Describe how to
test it locally;
* Proof that it's sufficiently tested.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Improved database migration resilience with automatic fallback
initialization when migrations encounter unexpected errors
* Enhanced startup error handling to ensure graceful recovery during
database setup
* Database initialization now attempts automatic recovery before
reporting startup failures
* Added explicit success messaging when migrations complete without
issues

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-22 15:41:55 +01:00
Igor Ilic
f1526a6660 fix: Resolve issue with migrations for docker 2025-12-22 14:54:11 +01:00
Vasilije
d8d3844805
refactor: Update examples to use pprint (#1921)
<!-- .github/pull_request_template.md -->

## Description
Update examples to use pprint

## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
* Include instructions on how to verify the changes. Describe how to
test it locally;
* Proof that it's sufficiently tested.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Style**
* Updated Python examples to use prettier, more readable output
formatting for search and example results.
* **Documentation**
* README updated to reflect improved example output presentation, making
demonstrations easier to read and verify.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-18 17:59:24 +01:00
Pavel Zorin
23d55a45d4 Release v0.5.1 2025-12-18 16:14:47 +01:00
Igor Ilic
1724997683 docs: Update README.md 2025-12-18 14:46:21 +01:00
Igor Ilic
eda9f26b2b
Merge branch 'main' into human-readable-search 2025-12-18 14:24:09 +01:00
Igor Ilic
cc41ef853c refactor: Update examples to use pprint 2025-12-18 14:17:24 +01:00
Vasilije
4caac4b8f0
refactor: Make graphs return optional in search (#1919)
<!-- .github/pull_request_template.md -->

## Description
- Have search results be more human readable by making graphs return
information optional

## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
* Include instructions on how to verify the changes. Describe how to
test it locally;
* Proof that it's sufficiently tested.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added an optional verbose mode to search. When enabled, results
include additional graph details; disabled by default for cleaner
responses.

* **Tests**
* Added unit tests verifying access-controlled search returns correctly
shaped results for both verbose and non-verbose modes, including
presence/absence of graph details.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-18 13:50:33 +01:00
Igor Ilic
b5949580de refactor: add note about verbose in combined context search 2025-12-18 13:45:20 +01:00
Igor Ilic
986b93fee4 docs: add docstring update for search 2025-12-18 13:24:39 +01:00
Igor Ilic
31e491bc88 test: Add test for verbose search 2025-12-18 13:04:17 +01:00
Igor Ilic
f2bc7ca992 refactor: change comment 2025-12-18 12:00:06 +01:00
Igor Ilic
dd9aad90cb refactor: Make graphs return optional 2025-12-18 11:57:40 +01:00
Vasilije
c7e94e296e
fix: Fix MCP on main branch (#1906)
<!-- .github/pull_request_template.md -->

## Description
Fixes MCP on main branch

## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
* Include instructions on how to verify the changes. Describe how to
test it locally;
* Proof that it's sufficiently tested.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Chores**
  * Updated core dependency to a newer version.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-16 12:22:21 +01:00
Igor Ilic
109b889b68 fix: Fix MCP on main branch 2025-12-16 10:47:10 +01:00
Pavel Zorin
fdbee66985 Release: add push to main docker workflow 2025-12-15 20:46:20 +01:00
Pavel Zorin
077e0a4937
Release 0.5.0: uv lock (#1902)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
* Include instructions on how to verify the changes. Describe how to
test it locally;
* Proof that it's sufficiently tested.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-12-15 20:35:48 +01:00
Pavel Zorin
40e1b9bfed mcp: uv lock 2025-12-15 20:34:51 +01:00
Pavel Zorin
4424ebf3c9 Release 0.5.0: uv lock 2025-12-15 20:33:02 +01:00
Pavel Zorin
96107b9f2f
Bump cognee and MCP versions to 0.5.0 (#1901)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
* Include instructions on how to verify the changes. Describe how to
test it locally;
* Proof that it's sufficiently tested.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-12-15 20:30:19 +01:00
Pavel Zorin
b42ea166a8 Release: Bump cognee and MCP versions to 0.5.0 2025-12-15 20:29:29 +01:00
Pavel Zorin
4936551a90 CI: Update release workflow 2025-12-15 20:18:51 +01:00
Pavel Zorin
405c925b70
Release 0.5.0 Merge dev to main (#1900)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-12-15 19:57:39 +01:00
Pavel Zorin
78028b819f update dev uv.lock 2025-12-15 18:42:02 +01:00
Pavel Zorin
67af8a7cb4
Bump version from 0.5.0.dev0 to 0.5.0.dev1 2025-12-15 18:36:15 +01:00
hajdul88
622f8fa79e
chore: introduces 1 file upload in ontology endpoint (#1899)
<!-- .github/pull_request_template.md -->

## Description
This PR fixes the ontology upload endpoint by forcing 1 file upload at
the time. Tests are adjusted in both server start and ontology endpoint
unit test. API was tested.

Do not merge it together with
https://github.com/topoteretes/cognee/pull/1898 its either that or this
one.



## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **API Changes**
* Ontology upload now accepts exactly one file per request; field
renamed from "descriptions" to "description" and validated as a plain
string.
* Stricter form validation and tighter 400/500 error handling for
malformed submissions.

* **Tests**
* Tests converted to real HTTP-style interactions using a shared test
client and dependency overrides.
* Payloads now use plain string fields; added coverage for single-file
constraints and specific error responses.

* **Style**
  * Minor formatting cleanups with no functional impact.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-15 18:30:35 +01:00
Igor Ilic
14d9540d1b
feat: Add database deletion on dataset delete (#1893)
<!-- .github/pull_request_template.md -->

## Description
- Add support for database deletion when dataset is deleted
- Simplify dataset handler usage in Cognee

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Improved dataset deletion: stronger authorization checks and reliable
removal of associated graph and vector storage.

* **Tests**
* Added end-to-end test to verify complete dataset deletion and cleanup
of all related storage components.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-15 18:15:48 +01:00
Vasilije
6b86f423ff
COG-3546: Initial release pipeline (#1883)
<!-- .github/pull_request_template.md -->

## Description

### Inputs: 
`flavour`: `dev` or `main`
`test_mode`: `Boolean`. Aka Dry Run. If true, it won't affect public
indices or repositories

### Jobs
* Create GitHub Release
* Release PyPI Package
* Release Docker Image

[Test
run](https://github.com/topoteretes/cognee/actions/runs/20167985120)

<img width="551" height="149" alt="Screenshot 2025-12-12 at 14 31 06"
src="https://github.com/user-attachments/assets/37648a35-0af8-4474-b051-4a81f8f8cfe7"
/>

#### Create GitHub Release
Gets the `version` from `pyproject.toml`
Creates a tag and release based on the version. The version in
`pyproject.toml` **must** be already correct!
If the version not updated and tag already exists - the process will
take no effect and be failed.

Test release was deleted. Here is the screenshot:
<img width="818" height="838" alt="Screenshot 2025-12-12 at 14 19 06"
src="https://github.com/user-attachments/assets/c2009f9f-1411-494e-8268-6489f6bdd959"
/>

#### Release PyPI Package

Publishes the `dist` artifacts to pypi.org. If `test_mode` enabled - it
will publish it to test.pypi.org
([example](https://test.pypi.org/project/cognee/0.5.0.dev0/)). <<<
That's how I tested it.

#### Release Docker Image
**IMPORTANT!**
Builds the image and tags it with the version from `pyproject.toml`. For
example:
* `cognee/cognee:0.5.1.dev0`
* `cognee/cognee:0.5.1`
ONLY **main** flavor adds the `latest` tag!. If a user does `docker pull
cognee/cognee:latest` - the latest `main` release image will be pulled

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [x] Other (please specify): CI

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Added an automated release pipeline that creates versioned GitHub
releases, publishes Python packages to TestPyPI/PyPI, and builds/pushes
Docker images.
* Supports selectable dev/main flavors, includes flavor-specific image
tagging and labels, wires version/tag outputs for downstream steps, and
offers a test-mode dry run that skips external publishing.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-15 17:13:40 +01:00
Vasilije
821852f2c3
feat: add support to pass custom parameters in llm adapters during cognify (#1802)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->
A user raised this issue/feature request:
https://github.com/topoteretes/cognee/issues/1784
This PR implements this in a way, but I may have misunderstood the
request. If I'm correct, I think it makes sense to support this, if our
users need it.

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-12-15 17:09:13 +01:00
Andrej Milicevic
433170fe09 merge dev 2025-12-15 17:06:20 +01:00
hajdul88
bad22ba26b
chore: adds id generation to memify triplet embedding pipeline (#1895)
<!-- .github/pull_request_template.md -->

## Description
This PR adds id generation to the Triplet objects in triplet embedding
memify pipeline. In some edge cases duplicated elements could have been
ingested into the collection



## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **Enhancements**
* Relationship data now includes unique identifiers for improved
tracking and data management capabilities.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-15 15:45:35 +01:00
Vasilije
69e36cc834
feat: add bedrock as supported llm provider (#1830)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->
Added support for AWS Bedrock, and the models that are available there.
This was a contributor PR that was never finished, so now I polished it
up and made it work.

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added AWS Bedrock as a new LLM provider with support for multiple
authentication methods.
* Integrated three new AI models: Claude 4.5 Sonnet, Claude 4.5 Haiku,
and Amazon Nova Lite.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-15 14:33:57 +01:00
Vasilije
d5bd75c627
test: remove anything codify related from mcp test (#1890)
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->
After removing codify, I didn't remove codify-related stuff from mcp
test. Hopefully now I've removed everything that was failing.

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
  * Removed test coverage for codify and developer rules functionality.
* Updated search functionality test to exclude additional search type
filtering.

* **Chores**
  * Increased default logging verbosity from ERROR to INFO level.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-15 14:33:22 +01:00
Igor Ilic
c94225f505
fix: make ontology key an optional param in cognify (#1894)
<!-- .github/pull_request_template.md -->

## Description
Make ontology key optional in Swagger and None by default (it was
"string" by default before change which was causing issues when running
cognify endpoint)

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
* Enhanced API documentation with additional examples and validation
metadata to improve request clarity and validation guidance.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-15 14:30:22 +01:00
Pavel Zorin
a6bc27afaa Cleanup 2025-12-12 17:31:54 +01:00
Pavel Zorin
14ff94f269 Initial release pipeline 2025-12-12 17:09:10 +01:00
Andrej Milicevic
116b6f1eeb chore: formatting 2025-12-12 13:46:16 +01:00
Andrej Milicevic
a225d7fc61 test: revert some changes 2025-12-12 13:44:58 +01:00
Andrej Milicevic
a337f4e54c test: testing logger 2025-12-12 13:02:55 +01:00
Andrej Milicevic
bce6094010 test: change logger 2025-12-12 12:43:54 +01:00
Andrej Milicevic
c48b274571 test: remove delete error from mcp test 2025-12-12 11:53:40 +01:00
Andrej Milicevic
3b8a607b5f test: fix errors in mcp test 2025-12-12 11:37:27 +01:00
Andrej Milicevic
e211e66275 chore: remove quick option to isolate mcp CI test 2025-12-11 18:29:17 +01:00
Andrej Milicevic
0f50c993ac chore: add quick option to isolate mcp CI test 2025-12-11 18:20:07 +01:00
Andrej Milicevic
248ba74592 test: remove codify-related stuff from mcp test 2025-12-11 18:18:42 +01:00
Andrej Milicevic
af8c5bedcc feat: add kwargs to other adapters 2025-12-11 17:47:23 +01:00
Boris
8cad9ef225
Merge branch 'dev' into feature/cog-3409-add-bedrock-as-supported-llm-provider 2025-12-03 14:58:00 +01:00
Andrej Milicevic
aa8afefe8a feat: add kwargs to cognify and related tasks 2025-11-27 17:05:37 +01:00
Andrej Milicevic
c649900042 Merge branch 'dev' into feature/cog-3396-add-support-to-pass-custom-parameters-in-openai-adapter 2025-11-27 16:59:43 +01:00
Andrej Milicevic
700362a233 fix: fix model names and test names 2025-11-26 13:35:56 +01:00
Andrej Milicevic
02b9fa485c fix: remove random addition to pyproject file 2025-11-26 12:33:22 +01:00
Andrej Milicevic
7c5a17ecb5 test: add extra dependency to bedrock ci test 2025-11-26 11:02:36 +01:00
Andrej Milicevic
9e652a3a93 fix: use uv instead of poetry in CI tests 2025-11-25 13:34:18 +01:00
Andrej Milicevic
4c6bed885e chore: ruff format 2025-11-25 13:02:26 +01:00
Andrej Milicevic
f22330a7b6 Merge branch 'dev' into feature/bedrock-llm-provider 2025-11-25 13:02:07 +01:00
Andrej Milicevic
e0d48c043a fix: fixes to adapter and tests 2025-11-25 12:58:07 +01:00
Andrej Milicevic
d97acba78e Merge branch 'dev' into feature/bedrock-llm-provider 2025-11-24 16:48:45 +01:00
Andrej Milicevic
f732fbf55f merge dev 2025-11-24 16:47:59 +01:00
Andrej Milicevic
3b78eb88bd fix: use s3 config 2025-11-24 16:38:23 +01:00
Andrej Milicevic
0a4b1068a2 feat: add kwargs to openai adapter functions 2025-11-17 17:42:22 +01:00
xdurawa
c91d1ff0ae Remove documentation changes as requested by reviewers
- Reverted README.md to original state
- Reverted cognee-starter-kit/README.md to original state
- Documentation will be updated separately by maintainers
2025-09-03 01:34:35 -04:00
xavierdurawa
06a3458982
Merge branch 'dev' into feature/bedrock-llm-provider 2025-09-02 23:08:38 -04:00
xdurawa
3821966d89 Merge feature/bedrock_llm_client into feature/bedrock-llm-provider
- Resolved conflicts in test_suites.yml to include both OpenRouter and Bedrock tests
- Updated add.py to include bedrock provider with latest dev branch model defaults
- Resolved uv.lock conflicts by using dev branch version
2025-09-01 20:55:20 -04:00
xdurawa
2a46208569 Adding AWS Bedrock support as a LLM provider
Signed-off-by: xdurawa <xavierdurawa@gmail.com>
2025-09-01 20:47:26 -04:00
58 changed files with 5000 additions and 4156 deletions

View file

@ -237,6 +237,31 @@ jobs:
EMBEDDING_API_VERSION: ${{ secrets.EMBEDDING_API_VERSION }}
run: uv run python ./cognee/tests/test_dataset_database_handler.py
test-dataset-database-deletion:
name: Test dataset database deletion in Cognee
runs-on: ubuntu-22.04
steps:
- name: Check out repository
uses: actions/checkout@v4
- name: Cognee Setup
uses: ./.github/actions/cognee_setup
with:
python-version: '3.11.x'
- name: Run dataset databases deletion test
env:
ENV: 'dev'
LLM_MODEL: ${{ secrets.LLM_MODEL }}
LLM_ENDPOINT: ${{ secrets.LLM_ENDPOINT }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_API_VERSION: ${{ secrets.LLM_API_VERSION }}
EMBEDDING_MODEL: ${{ secrets.EMBEDDING_MODEL }}
EMBEDDING_ENDPOINT: ${{ secrets.EMBEDDING_ENDPOINT }}
EMBEDDING_API_KEY: ${{ secrets.EMBEDDING_API_KEY }}
EMBEDDING_API_VERSION: ${{ secrets.EMBEDDING_API_VERSION }}
run: uv run python ./cognee/tests/test_dataset_delete.py
test-permissions:
name: Test permissions with different situations in Cognee
runs-on: ubuntu-22.04

138
.github/workflows/release.yml vendored Normal file
View file

@ -0,0 +1,138 @@
name: release.yml
on:
workflow_dispatch:
inputs:
flavour:
required: true
default: dev
type: choice
options:
- dev
- main
description: Dev or Main release
jobs:
release-github:
name: Create GitHub Release from ${{ inputs.flavour }}
outputs:
tag: ${{ steps.create_tag.outputs.tag }}
version: ${{ steps.create_tag.outputs.version }}
permissions:
contents: write
runs-on: ubuntu-latest
steps:
- name: Check out ${{ inputs.flavour }}
uses: actions/checkout@v4
with:
ref: ${{ inputs.flavour }}
- name: Install uv
uses: astral-sh/setup-uv@v7
- name: Create and push git tag
id: create_tag
run: |
VERSION="$(uv version --short)"
TAG="v${VERSION}"
echo "Tag to create: ${TAG}"
git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
echo "tag=${TAG}" >> "$GITHUB_OUTPUT"
echo "version=${VERSION}" >> "$GITHUB_OUTPUT"
git tag "${TAG}"
git push origin "${TAG}"
- name: Create GitHub Release
uses: softprops/action-gh-release@v2
with:
tag_name: ${{ steps.create_tag.outputs.tag }}
generate_release_notes: true
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
release-pypi-package:
needs: release-github
name: Release PyPI Package from ${{ inputs.flavour }}
permissions:
contents: read
runs-on: ubuntu-latest
steps:
- name: Check out ${{ inputs.flavour }}
uses: actions/checkout@v4
with:
ref: ${{ inputs.flavour }}
- name: Install uv
uses: astral-sh/setup-uv@v7
- name: Install Python
run: uv python install
- name: Install dependencies
run: uv sync --locked --all-extras
- name: Build distributions
run: uv build
- name: Publish ${{ inputs.flavour }} release to PyPI
env:
UV_PUBLISH_TOKEN: ${{ secrets.PYPI_TOKEN }}
run: uv publish
release-docker-image:
needs: release-github
name: Release Docker Image from ${{ inputs.flavour }}
permissions:
contents: read
runs-on: ubuntu-latest
steps:
- name: Check out ${{ inputs.flavour }}
uses: actions/checkout@v4
with:
ref: ${{ inputs.flavour }}
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Build and push Dev Docker Image
if: ${{ inputs.flavour == 'dev' }}
uses: docker/build-push-action@v5
with:
context: .
platforms: linux/amd64,linux/arm64
push: true
tags: cognee/cognee:${{ needs.release-github.outputs.version }}
labels: |
version=${{ needs.release-github.outputs.version }}
flavour=${{ inputs.flavour }}
cache-from: type=registry,ref=cognee/cognee:buildcache
cache-to: type=registry,ref=cognee/cognee:buildcache,mode=max
- name: Build and push Main Docker Image
if: ${{ inputs.flavour == 'main' }}
uses: docker/build-push-action@v5
with:
context: .
platforms: linux/amd64,linux/arm64
push: true
tags: |
cognee/cognee:${{ needs.release-github.outputs.version }}
cognee/cognee:latest
labels: |
version=${{ needs.release-github.outputs.version }}
flavour=${{ inputs.flavour }}
cache-from: type=registry,ref=cognee/cognee:buildcache
cache-to: type=registry,ref=cognee/cognee:buildcache,mode=max

View file

@ -84,3 +84,93 @@ jobs:
EMBEDDING_DIMENSIONS: "3072"
EMBEDDING_MAX_TOKENS: "8191"
run: uv run python ./examples/python/simple_example.py
test-bedrock-api-key:
name: Run Bedrock API Key Test
runs-on: ubuntu-22.04
steps:
- name: Check out repository
uses: actions/checkout@v4
- name: Cognee Setup
uses: ./.github/actions/cognee_setup
with:
python-version: '3.11.x'
extra-dependencies: "aws"
- name: Run Bedrock API Key Simple Example
env:
LLM_PROVIDER: "bedrock"
LLM_API_KEY: ${{ secrets.BEDROCK_API_KEY }}
LLM_MODEL: "eu.anthropic.claude-sonnet-4-5-20250929-v1:0"
LLM_MAX_TOKENS: "16384"
AWS_REGION_NAME: "eu-west-1"
EMBEDDING_PROVIDER: "bedrock"
EMBEDDING_API_KEY: ${{ secrets.BEDROCK_API_KEY }}
EMBEDDING_MODEL: "amazon.titan-embed-text-v2:0"
EMBEDDING_DIMENSIONS: "1024"
EMBEDDING_MAX_TOKENS: "8191"
run: uv run python ./examples/python/simple_example.py
test-bedrock-aws-credentials:
name: Run Bedrock AWS Credentials Test
runs-on: ubuntu-22.04
steps:
- name: Check out repository
uses: actions/checkout@v4
- name: Cognee Setup
uses: ./.github/actions/cognee_setup
with:
python-version: '3.11.x'
extra-dependencies: "aws"
- name: Run Bedrock AWS Credentials Simple Example
env:
LLM_PROVIDER: "bedrock"
LLM_MODEL: "eu.anthropic.claude-sonnet-4-5-20250929-v1:0"
LLM_MAX_TOKENS: "16384"
AWS_REGION_NAME: "eu-west-1"
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
EMBEDDING_PROVIDER: "bedrock"
EMBEDDING_API_KEY: ${{ secrets.BEDROCK_API_KEY }}
EMBEDDING_MODEL: "amazon.titan-embed-text-v2:0"
EMBEDDING_DIMENSIONS: "1024"
EMBEDDING_MAX_TOKENS: "8191"
run: uv run python ./examples/python/simple_example.py
test-bedrock-aws-profile:
name: Run Bedrock AWS Profile Test
runs-on: ubuntu-22.04
steps:
- name: Check out repository
uses: actions/checkout@v4
- name: Cognee Setup
uses: ./.github/actions/cognee_setup
with:
python-version: '3.11.x'
extra-dependencies: "aws"
- name: Configure AWS Profile
run: |
mkdir -p ~/.aws
cat > ~/.aws/credentials << EOF
[bedrock-test]
aws_access_key_id = ${{ secrets.AWS_ACCESS_KEY_ID }}
aws_secret_access_key = ${{ secrets.AWS_SECRET_ACCESS_KEY }}
EOF
- name: Run Bedrock AWS Profile Simple Example
env:
LLM_PROVIDER: "bedrock"
LLM_MODEL: "eu.anthropic.claude-sonnet-4-5-20250929-v1:0"
LLM_MAX_TOKENS: "16384"
AWS_PROFILE_NAME: "bedrock-test"
AWS_REGION_NAME: "eu-west-1"
EMBEDDING_PROVIDER: "bedrock"
EMBEDDING_MODEL: "amazon.titan-embed-text-v2:0"
EMBEDDING_DIMENSIONS: "1024"
EMBEDDING_MAX_TOKENS: "8191"
run: uv run python ./examples/python/simple_example.py

View file

@ -71,7 +71,7 @@ git clone https://github.com/<your-github-username>/cognee.git
cd cognee
```
In case you are working on Vector and Graph Adapters
1. Fork the [**cognee**](https://github.com/topoteretes/cognee-community) repository
1. Fork the [**cognee-community**](https://github.com/topoteretes/cognee-community) repository
2. Clone your fork:
```shell
git clone https://github.com/<your-github-username>/cognee-community.git
@ -97,6 +97,21 @@ git checkout -b feature/your-feature-name
python cognee/cognee/tests/test_library.py
```
### Running Simple Example
Change .env.example into .env and provide your OPENAI_API_KEY as LLM_API_KEY
Make sure to run ```shell uv sync ``` in the root cloned folder or set up a virtual environment to run cognee
```shell
python cognee/cognee/examples/python/simple_example.py
```
or
```shell
uv run python cognee/cognee/examples/python/simple_example.py
```
## 4. 📤 Submitting Changes
1. Install ruff on your system

View file

@ -66,13 +66,10 @@ Use your data to build personalized and dynamic memory for AI Agents. Cognee let
## About Cognee
Cognee is an open-source tool and platform that transforms your raw data into persistent and dynamic AI memory for Agents. It combines vector search with graph databases to make your documents both searchable by meaning and connected by relationships.
Cognee offers default memory creation and search which we describe bellow. But with Cognee you can build your own!
You can use Cognee in two ways:
1. [Self-host Cognee Open Source](https://docs.cognee.ai/getting-started/installation), which stores all data locally by default.
2. [Connect to Cognee Cloud](https://platform.cognee.ai/), and get the same OSS stack on managed infrastructure for easier development and productionization.
### Cognee Open Source (self-hosted):
### Cognee Open Source:
- Interconnects any type of data — including past conversations, files, images, and audio transcriptions
- Replaces traditional RAG systems with a unified memory layer built on graphs and vectors
@ -80,11 +77,6 @@ You can use Cognee in two ways:
- Provides Pythonic data pipelines for ingestion from 30+ data sources
- Offers high customizability through user-defined tasks, modular pipelines, and built-in search endpoints
### Cognee Cloud (managed):
- Hosted web UI dashboard
- Automatic version updates
- Resource usage analytics
- GDPR compliant, enterprise-grade security
## Basic Usage & Feature Guide
@ -126,6 +118,7 @@ Now, run a minimal pipeline:
```python
import cognee
import asyncio
from pprint import pprint
async def main():
@ -143,7 +136,7 @@ async def main():
# Display the results
for result in results:
print(result)
pprint(result)
if __name__ == '__main__':

View file

@ -12,7 +12,7 @@
"classnames": "^2.5.1",
"culori": "^4.0.1",
"d3-force-3d": "^3.0.6",
"next": "16.0.4",
"next": "16.1.1",
"react": "^19.2.0",
"react-dom": "^19.2.0",
"react-force-graph-2d": "^1.27.1",
@ -96,7 +96,6 @@
"integrity": "sha512-e7jT4DxYvIDLk1ZHmU/m/mB19rex9sv0c2ftBtjSBv+kVM/902eh0fINUzD7UwLLNR+jU585GxUJ8/EBfAM5fw==",
"dev": true,
"license": "MIT",
"peer": true,
"dependencies": {
"@babel/code-frame": "^7.27.1",
"@babel/generator": "^7.28.5",
@ -1074,9 +1073,9 @@
}
},
"node_modules/@next/env": {
"version": "16.0.4",
"resolved": "https://registry.npmjs.org/@next/env/-/env-16.0.4.tgz",
"integrity": "sha512-FDPaVoB1kYhtOz6Le0Jn2QV7RZJ3Ngxzqri7YX4yu3Ini+l5lciR7nA9eNDpKTmDm7LWZtxSju+/CQnwRBn2pA==",
"version": "16.1.1",
"resolved": "https://registry.npmjs.org/@next/env/-/env-16.1.1.tgz",
"integrity": "sha512-3oxyM97Sr2PqiVyMyrZUtrtM3jqqFxOQJVuKclDsgj/L728iZt/GyslkN4NwarledZATCenbk4Offjk1hQmaAA==",
"license": "MIT"
},
"node_modules/@next/eslint-plugin-next": {
@ -1090,9 +1089,9 @@
}
},
"node_modules/@next/swc-darwin-arm64": {
"version": "16.0.4",
"resolved": "https://registry.npmjs.org/@next/swc-darwin-arm64/-/swc-darwin-arm64-16.0.4.tgz",
"integrity": "sha512-TN0cfB4HT2YyEio9fLwZY33J+s+vMIgC84gQCOLZOYusW7ptgjIn8RwxQt0BUpoo9XRRVVWEHLld0uhyux1ZcA==",
"version": "16.1.1",
"resolved": "https://registry.npmjs.org/@next/swc-darwin-arm64/-/swc-darwin-arm64-16.1.1.tgz",
"integrity": "sha512-JS3m42ifsVSJjSTzh27nW+Igfha3NdBOFScr9C80hHGrWx55pTrVL23RJbqir7k7/15SKlrLHhh/MQzqBBYrQA==",
"cpu": [
"arm64"
],
@ -1106,9 +1105,9 @@
}
},
"node_modules/@next/swc-darwin-x64": {
"version": "16.0.4",
"resolved": "https://registry.npmjs.org/@next/swc-darwin-x64/-/swc-darwin-x64-16.0.4.tgz",
"integrity": "sha512-XsfI23jvimCaA7e+9f3yMCoVjrny2D11G6H8NCcgv+Ina/TQhKPXB9P4q0WjTuEoyZmcNvPdrZ+XtTh3uPfH7Q==",
"version": "16.1.1",
"resolved": "https://registry.npmjs.org/@next/swc-darwin-x64/-/swc-darwin-x64-16.1.1.tgz",
"integrity": "sha512-hbyKtrDGUkgkyQi1m1IyD3q4I/3m9ngr+V93z4oKHrPcmxwNL5iMWORvLSGAf2YujL+6HxgVvZuCYZfLfb4bGw==",
"cpu": [
"x64"
],
@ -1122,9 +1121,9 @@
}
},
"node_modules/@next/swc-linux-arm64-gnu": {
"version": "16.0.4",
"resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-gnu/-/swc-linux-arm64-gnu-16.0.4.tgz",
"integrity": "sha512-uo8X7qHDy4YdJUhaoJDMAbL8VT5Ed3lijip2DdBHIB4tfKAvB1XBih6INH2L4qIi4jA0Qq1J0ErxcOocBmUSwg==",
"version": "16.1.1",
"resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-gnu/-/swc-linux-arm64-gnu-16.1.1.tgz",
"integrity": "sha512-/fvHet+EYckFvRLQ0jPHJCUI5/B56+2DpI1xDSvi80r/3Ez+Eaa2Yq4tJcRTaB1kqj/HrYKn8Yplm9bNoMJpwQ==",
"cpu": [
"arm64"
],
@ -1138,9 +1137,9 @@
}
},
"node_modules/@next/swc-linux-arm64-musl": {
"version": "16.0.4",
"resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-musl/-/swc-linux-arm64-musl-16.0.4.tgz",
"integrity": "sha512-pvR/AjNIAxsIz0PCNcZYpH+WmNIKNLcL4XYEfo+ArDi7GsxKWFO5BvVBLXbhti8Coyv3DE983NsitzUsGH5yTw==",
"version": "16.1.1",
"resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-musl/-/swc-linux-arm64-musl-16.1.1.tgz",
"integrity": "sha512-MFHrgL4TXNQbBPzkKKur4Fb5ICEJa87HM7fczFs2+HWblM7mMLdco3dvyTI+QmLBU9xgns/EeeINSZD6Ar+oLg==",
"cpu": [
"arm64"
],
@ -1154,9 +1153,9 @@
}
},
"node_modules/@next/swc-linux-x64-gnu": {
"version": "16.0.4",
"resolved": "https://registry.npmjs.org/@next/swc-linux-x64-gnu/-/swc-linux-x64-gnu-16.0.4.tgz",
"integrity": "sha512-2hebpsd5MRRtgqmT7Jj/Wze+wG+ZEXUK2KFFL4IlZ0amEEFADo4ywsifJNeFTQGsamH3/aXkKWymDvgEi+pc2Q==",
"version": "16.1.1",
"resolved": "https://registry.npmjs.org/@next/swc-linux-x64-gnu/-/swc-linux-x64-gnu-16.1.1.tgz",
"integrity": "sha512-20bYDfgOQAPUkkKBnyP9PTuHiJGM7HzNBbuqmD0jiFVZ0aOldz+VnJhbxzjcSabYsnNjMPsE0cyzEudpYxsrUQ==",
"cpu": [
"x64"
],
@ -1170,9 +1169,9 @@
}
},
"node_modules/@next/swc-linux-x64-musl": {
"version": "16.0.4",
"resolved": "https://registry.npmjs.org/@next/swc-linux-x64-musl/-/swc-linux-x64-musl-16.0.4.tgz",
"integrity": "sha512-pzRXf0LZZ8zMljH78j8SeLncg9ifIOp3ugAFka+Bq8qMzw6hPXOc7wydY7ardIELlczzzreahyTpwsim/WL3Sg==",
"version": "16.1.1",
"resolved": "https://registry.npmjs.org/@next/swc-linux-x64-musl/-/swc-linux-x64-musl-16.1.1.tgz",
"integrity": "sha512-9pRbK3M4asAHQRkwaXwu601oPZHghuSC8IXNENgbBSyImHv/zY4K5udBusgdHkvJ/Tcr96jJwQYOll0qU8+fPA==",
"cpu": [
"x64"
],
@ -1186,9 +1185,9 @@
}
},
"node_modules/@next/swc-win32-arm64-msvc": {
"version": "16.0.4",
"resolved": "https://registry.npmjs.org/@next/swc-win32-arm64-msvc/-/swc-win32-arm64-msvc-16.0.4.tgz",
"integrity": "sha512-7G/yJVzum52B5HOqqbQYX9bJHkN+c4YyZ2AIvEssMHQlbAWOn3iIJjD4sM6ihWsBxuljiTKJovEYlD1K8lCUHw==",
"version": "16.1.1",
"resolved": "https://registry.npmjs.org/@next/swc-win32-arm64-msvc/-/swc-win32-arm64-msvc-16.1.1.tgz",
"integrity": "sha512-bdfQkggaLgnmYrFkSQfsHfOhk/mCYmjnrbRCGgkMcoOBZ4n+TRRSLmT/CU5SATzlBJ9TpioUyBW/vWFXTqQRiA==",
"cpu": [
"arm64"
],
@ -1202,9 +1201,9 @@
}
},
"node_modules/@next/swc-win32-x64-msvc": {
"version": "16.0.4",
"resolved": "https://registry.npmjs.org/@next/swc-win32-x64-msvc/-/swc-win32-x64-msvc-16.0.4.tgz",
"integrity": "sha512-0Vy4g8SSeVkuU89g2OFHqGKM4rxsQtihGfenjx2tRckPrge5+gtFnRWGAAwvGXr0ty3twQvcnYjEyOrLHJ4JWA==",
"version": "16.1.1",
"resolved": "https://registry.npmjs.org/@next/swc-win32-x64-msvc/-/swc-win32-x64-msvc-16.1.1.tgz",
"integrity": "sha512-Ncwbw2WJ57Al5OX0k4chM68DKhEPlrXBaSXDCi2kPi5f4d8b3ejr3RRJGfKBLrn2YJL5ezNS7w2TZLHSti8CMw==",
"cpu": [
"x64"
],
@ -1513,6 +1512,66 @@
"node": ">=14.0.0"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/@emnapi/core": {
"version": "1.6.0",
"dev": true,
"inBundle": true,
"license": "MIT",
"optional": true,
"dependencies": {
"@emnapi/wasi-threads": "1.1.0",
"tslib": "^2.4.0"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/@emnapi/runtime": {
"version": "1.6.0",
"dev": true,
"inBundle": true,
"license": "MIT",
"optional": true,
"dependencies": {
"tslib": "^2.4.0"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/@emnapi/wasi-threads": {
"version": "1.1.0",
"dev": true,
"inBundle": true,
"license": "MIT",
"optional": true,
"dependencies": {
"tslib": "^2.4.0"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/@napi-rs/wasm-runtime": {
"version": "1.0.7",
"dev": true,
"inBundle": true,
"license": "MIT",
"optional": true,
"dependencies": {
"@emnapi/core": "^1.5.0",
"@emnapi/runtime": "^1.5.0",
"@tybys/wasm-util": "^0.10.1"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/@tybys/wasm-util": {
"version": "0.10.1",
"dev": true,
"inBundle": true,
"license": "MIT",
"optional": true,
"dependencies": {
"tslib": "^2.4.0"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/tslib": {
"version": "2.8.1",
"dev": true,
"inBundle": true,
"license": "0BSD",
"optional": true
},
"node_modules/@tailwindcss/oxide-win32-arm64-msvc": {
"version": "4.1.17",
"resolved": "https://registry.npmjs.org/@tailwindcss/oxide-win32-arm64-msvc/-/oxide-win32-arm64-msvc-4.1.17.tgz",
@ -1622,7 +1681,6 @@
"integrity": "sha512-MWtvHrGZLFttgeEj28VXHxpmwYbor/ATPYbBfSFZEIRK0ecCFLl2Qo55z52Hss+UV9CRN7trSeq1zbgx7YDWWg==",
"dev": true,
"license": "MIT",
"peer": true,
"dependencies": {
"csstype": "^3.2.2"
}
@ -1690,7 +1748,6 @@
"integrity": "sha512-jCzKdm/QK0Kg4V4IK/oMlRZlY+QOcdjv89U2NgKHZk1CYTj82/RVSx1mV/0gqCVMJ/DA+Zf/S4NBWNF8GQ+eqQ==",
"dev": true,
"license": "MIT",
"peer": true,
"dependencies": {
"@typescript-eslint/scope-manager": "8.48.0",
"@typescript-eslint/types": "8.48.0",
@ -2199,7 +2256,6 @@
"integrity": "sha512-NZyJarBfL7nWwIq+FDL6Zp/yHEhePMNnnJ0y3qfieCrmNvYct8uvtiV41UvlSe6apAfk0fY1FbWx+NwfmpvtTg==",
"dev": true,
"license": "MIT",
"peer": true,
"bin": {
"acorn": "bin/acorn"
},
@ -2491,7 +2547,6 @@
"version": "2.8.31",
"resolved": "https://registry.npmjs.org/baseline-browser-mapping/-/baseline-browser-mapping-2.8.31.tgz",
"integrity": "sha512-a28v2eWrrRWPpJSzxc+mKwm0ZtVx/G8SepdQZDArnXYU/XS+IF6mp8aB/4E+hH1tyGCoDo3KlUCdlSxGDsRkAw==",
"dev": true,
"license": "Apache-2.0",
"bin": {
"baseline-browser-mapping": "dist/cli.js"
@ -2551,7 +2606,6 @@
}
],
"license": "MIT",
"peer": true,
"dependencies": {
"baseline-browser-mapping": "^2.8.25",
"caniuse-lite": "^1.0.30001754",
@ -2896,7 +2950,6 @@
"resolved": "https://registry.npmjs.org/d3-selection/-/d3-selection-3.0.0.tgz",
"integrity": "sha512-fmTRWbNMmsmWq6xJV8D19U/gw/bwrHfNXxrIN+HfZgnzqTHp9jOmKMhsTUjXOJnZOdZY9Q28y4yebKzqDKlxlQ==",
"license": "ISC",
"peer": true,
"engines": {
"node": ">=12"
}
@ -3372,7 +3425,6 @@
"integrity": "sha512-BhHmn2yNOFA9H9JmmIVKJmd288g9hrVRDkdoIgRCRuSySRUHH7r/DI6aAXW9T1WwUuY3DFgrcaqB+deURBLR5g==",
"dev": true,
"license": "MIT",
"peer": true,
"dependencies": {
"@eslint-community/eslint-utils": "^4.8.0",
"@eslint-community/regexpp": "^4.12.1",
@ -5411,14 +5463,14 @@
"license": "MIT"
},
"node_modules/next": {
"version": "16.0.4",
"resolved": "https://registry.npmjs.org/next/-/next-16.0.4.tgz",
"integrity": "sha512-vICcxKusY8qW7QFOzTvnRL1ejz2ClTqDKtm1AcUjm2mPv/lVAdgpGNsftsPRIDJOXOjRQO68i1dM8Lp8GZnqoA==",
"version": "16.1.1",
"resolved": "https://registry.npmjs.org/next/-/next-16.1.1.tgz",
"integrity": "sha512-QI+T7xrxt1pF6SQ/JYFz95ro/mg/1Znk5vBebsWwbpejj1T0A23hO7GYEaVac9QUOT2BIMiuzm0L99ooq7k0/w==",
"license": "MIT",
"peer": true,
"dependencies": {
"@next/env": "16.0.4",
"@next/env": "16.1.1",
"@swc/helpers": "0.5.15",
"baseline-browser-mapping": "^2.8.3",
"caniuse-lite": "^1.0.30001579",
"postcss": "8.4.31",
"styled-jsx": "5.1.6"
@ -5430,14 +5482,14 @@
"node": ">=20.9.0"
},
"optionalDependencies": {
"@next/swc-darwin-arm64": "16.0.4",
"@next/swc-darwin-x64": "16.0.4",
"@next/swc-linux-arm64-gnu": "16.0.4",
"@next/swc-linux-arm64-musl": "16.0.4",
"@next/swc-linux-x64-gnu": "16.0.4",
"@next/swc-linux-x64-musl": "16.0.4",
"@next/swc-win32-arm64-msvc": "16.0.4",
"@next/swc-win32-x64-msvc": "16.0.4",
"@next/swc-darwin-arm64": "16.1.1",
"@next/swc-darwin-x64": "16.1.1",
"@next/swc-linux-arm64-gnu": "16.1.1",
"@next/swc-linux-arm64-musl": "16.1.1",
"@next/swc-linux-x64-gnu": "16.1.1",
"@next/swc-linux-x64-musl": "16.1.1",
"@next/swc-win32-arm64-msvc": "16.1.1",
"@next/swc-win32-x64-msvc": "16.1.1",
"sharp": "^0.34.4"
},
"peerDependencies": {
@ -5809,9 +5861,9 @@
}
},
"node_modules/preact": {
"version": "10.27.2",
"resolved": "https://registry.npmjs.org/preact/-/preact-10.27.2.tgz",
"integrity": "sha512-5SYSgFKSyhCbk6SrXyMpqjb5+MQBgfvEKE/OC+PujcY34sOpqtr+0AZQtPYx5IA6VxynQ7rUPCtKzyovpj9Bpg==",
"version": "10.28.2",
"resolved": "https://registry.npmjs.org/preact/-/preact-10.28.2.tgz",
"integrity": "sha512-lbteaWGzGHdlIuiJ0l2Jq454m6kcpI1zNje6d8MlGAFlYvP2GO4ibnat7P74Esfz4sPTdM6UxtTwh/d3pwM9JA==",
"license": "MIT",
"funding": {
"type": "opencollective",
@ -5875,7 +5927,6 @@
"resolved": "https://registry.npmjs.org/react/-/react-19.2.0.tgz",
"integrity": "sha512-tmbWg6W31tQLeB5cdIBOicJDJRR2KzXsV7uSK9iNfLWQ5bIZfxuPEHp7M8wiHyHnn0DD1i7w3Zmin0FtkrwoCQ==",
"license": "MIT",
"peer": true,
"engines": {
"node": ">=0.10.0"
}
@ -5885,7 +5936,6 @@
"resolved": "https://registry.npmjs.org/react-dom/-/react-dom-19.2.0.tgz",
"integrity": "sha512-UlbRu4cAiGaIewkPyiRGJk0imDN2T3JjieT6spoL2UeSf5od4n5LB/mQ4ejmxhCFT1tYe8IvaFulzynWovsEFQ==",
"license": "MIT",
"peer": true,
"dependencies": {
"scheduler": "^0.27.0"
},
@ -6624,7 +6674,6 @@
"integrity": "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q==",
"dev": true,
"license": "MIT",
"peer": true,
"engines": {
"node": ">=12"
},
@ -6787,7 +6836,6 @@
"integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==",
"dev": true,
"license": "Apache-2.0",
"peer": true,
"bin": {
"tsc": "bin/tsc",
"tsserver": "bin/tsserver"
@ -7085,7 +7133,6 @@
"integrity": "sha512-AvvthqfqrAhNH9dnfmrfKzX5upOdjUVJYFqNSlkmGf64gRaTzlPwz99IHYnVs28qYAybvAlBV+H7pn0saFY4Ig==",
"dev": true,
"license": "MIT",
"peer": true,
"funding": {
"url": "https://github.com/sponsors/colinhacks"
}

View file

@ -13,7 +13,7 @@
"classnames": "^2.5.1",
"culori": "^4.0.1",
"d3-force-3d": "^3.0.6",
"next": "16.0.4",
"next": "16.1.1",
"react": "^19.2.0",
"react-dom": "^19.2.0",
"react-force-graph-2d": "^1.27.1",

View file

@ -1,6 +1,6 @@
[project]
name = "cognee-mcp"
version = "0.4.0"
version = "0.5.0"
description = "Cognee MCP server"
readme = "README.md"
requires-python = ">=3.10"
@ -9,7 +9,7 @@ dependencies = [
# For local cognee repo usage remove comment bellow and add absolute path to cognee. Then run `uv sync --reinstall` in the mcp folder on local cognee changes.
#"cognee[postgres,codegraph,gemini,huggingface,docs,neo4j] @ file:/Users/igorilic/Desktop/cognee",
# TODO: Remove gemini from optional dependecnies for new Cognee version after 0.3.4
"cognee[postgres,docs,neo4j]==0.3.7",
"cognee[postgres,docs,neo4j]==0.5.0",
"fastmcp>=2.10.0,<3.0.0",
"mcp>=1.12.0,<2.0.0",
"uv>=0.6.3,<1.0.0",

View file

@ -151,7 +151,7 @@ class CogneeClient:
query_type: str,
datasets: Optional[List[str]] = None,
system_prompt: Optional[str] = None,
top_k: int = 10,
top_k: int = 5,
) -> Any:
"""
Search the knowledge graph.
@ -192,7 +192,7 @@ class CogneeClient:
with redirect_stdout(sys.stderr):
results = await self.cognee.search(
query_type=SearchType[query_type.upper()], query_text=query_text
query_type=SearchType[query_type.upper()], query_text=query_text, top_k=top_k
)
return results

View file

@ -316,7 +316,7 @@ async def save_interaction(data: str) -> list:
@mcp.tool()
async def search(search_query: str, search_type: str) -> list:
async def search(search_query: str, search_type: str, top_k: int = 10) -> list:
"""
Search and query the knowledge graph for insights, information, and connections.
@ -389,6 +389,13 @@ async def search(search_query: str, search_type: str) -> list:
The search_type is case-insensitive and will be converted to uppercase.
top_k : int, optional
Maximum number of results to return (default: 10).
Controls the amount of context retrieved from the knowledge graph.
- Lower values (3-5): Faster, more focused results
- Higher values (10-20): More comprehensive, but slower and more context-heavy
Helps manage response size and context window usage in MCP clients.
Returns
-------
list
@ -425,13 +432,32 @@ async def search(search_query: str, search_type: str) -> list:
"""
async def search_task(search_query: str, search_type: str) -> str:
"""Search the knowledge graph"""
async def search_task(search_query: str, search_type: str, top_k: int) -> str:
"""
Internal task to execute knowledge graph search with result formatting.
Handles the actual search execution and formats results appropriately
for MCP clients based on the search type and execution mode (API vs direct).
Parameters
----------
search_query : str
The search query in natural language
search_type : str
Type of search to perform (GRAPH_COMPLETION, CHUNKS, etc.)
top_k : int
Maximum number of results to return
Returns
-------
str
Formatted search results as a string, with format depending on search_type
"""
# NOTE: MCP uses stdout to communicate, we must redirect all output
# going to stdout ( like the print function ) to stderr.
with redirect_stdout(sys.stderr):
search_results = await cognee_client.search(
query_text=search_query, query_type=search_type
query_text=search_query, query_type=search_type, top_k=top_k
)
# Handle different result formats based on API vs direct mode
@ -465,7 +491,7 @@ async def search(search_query: str, search_type: str) -> list:
else:
return str(search_results)
search_results = await search_task(search_query, search_type)
search_results = await search_task(search_query, search_type, top_k)
return [types.TextContent(type="text", text=search_results)]

View file

@ -3,7 +3,7 @@
Test client for Cognee MCP Server functionality.
This script tests all the tools and functions available in the Cognee MCP server,
including cognify, codify, search, prune, status checks, and utility functions.
including cognify, search, prune, status checks, and utility functions.
Usage:
# Set your OpenAI API key first
@ -23,6 +23,7 @@ import tempfile
import time
from contextlib import asynccontextmanager
from cognee.shared.logging_utils import setup_logging
from logging import ERROR, INFO
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
@ -35,7 +36,7 @@ from src.server import (
load_class,
)
# Set timeout for cognify/codify to complete in
# Set timeout for cognify to complete in
TIMEOUT = 5 * 60 # 5 min in seconds
@ -151,12 +152,9 @@ DEBUG = True
expected_tools = {
"cognify",
"codify",
"search",
"prune",
"cognify_status",
"codify_status",
"cognee_add_developer_rules",
"list_data",
"delete",
}
@ -247,106 +245,6 @@ DEBUG = True
}
print(f"{test_name} test failed: {e}")
async def test_codify(self):
"""Test the codify functionality using MCP client."""
print("\n🧪 Testing codify functionality...")
try:
async with self.mcp_server_session() as session:
codify_result = await session.call_tool(
"codify", arguments={"repo_path": self.test_repo_dir}
)
start = time.time() # mark the start
while True:
try:
# Wait a moment
await asyncio.sleep(5)
# Check if codify processing is finished
status_result = await session.call_tool("codify_status", arguments={})
if hasattr(status_result, "content") and status_result.content:
status_text = (
status_result.content[0].text
if status_result.content
else str(status_result)
)
else:
status_text = str(status_result)
if str(PipelineRunStatus.DATASET_PROCESSING_COMPLETED) in status_text:
break
elif time.time() - start > TIMEOUT:
raise TimeoutError("Codify did not complete in 5min")
except DatabaseNotCreatedError:
if time.time() - start > TIMEOUT:
raise TimeoutError("Database was not created in 5min")
self.test_results["codify"] = {
"status": "PASS",
"result": codify_result,
"message": "Codify executed successfully",
}
print("✅ Codify test passed")
except Exception as e:
self.test_results["codify"] = {
"status": "FAIL",
"error": str(e),
"message": "Codify test failed",
}
print(f"❌ Codify test failed: {e}")
async def test_cognee_add_developer_rules(self):
"""Test the cognee_add_developer_rules functionality using MCP client."""
print("\n🧪 Testing cognee_add_developer_rules functionality...")
try:
async with self.mcp_server_session() as session:
result = await session.call_tool(
"cognee_add_developer_rules", arguments={"base_path": self.test_data_dir}
)
start = time.time() # mark the start
while True:
try:
# Wait a moment
await asyncio.sleep(5)
# Check if developer rule cognify processing is finished
status_result = await session.call_tool("cognify_status", arguments={})
if hasattr(status_result, "content") and status_result.content:
status_text = (
status_result.content[0].text
if status_result.content
else str(status_result)
)
else:
status_text = str(status_result)
if str(PipelineRunStatus.DATASET_PROCESSING_COMPLETED) in status_text:
break
elif time.time() - start > TIMEOUT:
raise TimeoutError(
"Cognify of developer rules did not complete in 5min"
)
except DatabaseNotCreatedError:
if time.time() - start > TIMEOUT:
raise TimeoutError("Database was not created in 5min")
self.test_results["cognee_add_developer_rules"] = {
"status": "PASS",
"result": result,
"message": "Developer rules addition executed successfully",
}
print("✅ Developer rules test passed")
except Exception as e:
self.test_results["cognee_add_developer_rules"] = {
"status": "FAIL",
"error": str(e),
"message": "Developer rules test failed",
}
print(f"❌ Developer rules test failed: {e}")
async def test_search_functionality(self):
"""Test the search functionality with different search types using MCP client."""
print("\n🧪 Testing search functionality...")
@ -359,7 +257,11 @@ DEBUG = True
# Go through all Cognee search types
for search_type in SearchType:
# Don't test these search types
if search_type in [SearchType.NATURAL_LANGUAGE, SearchType.CYPHER]:
if search_type in [
SearchType.NATURAL_LANGUAGE,
SearchType.CYPHER,
SearchType.TRIPLET_COMPLETION,
]:
break
try:
async with self.mcp_server_session() as session:
@ -681,9 +583,6 @@ class TestModel:
test_name="Cognify2",
)
await self.test_codify()
await self.test_cognee_add_developer_rules()
# Test list_data and delete functionality
await self.test_list_data()
await self.test_delete()
@ -739,7 +638,5 @@ async def main():
if __name__ == "__main__":
from logging import ERROR
logger = setup_logging(log_level=ERROR)
asyncio.run(main())

7633
cognee-mcp/uv.lock generated

File diff suppressed because it is too large Load diff

View file

@ -155,7 +155,7 @@ async def add(
- LLM_API_KEY: API key for your LLM provider (OpenAI, Anthropic, etc.)
Optional:
- LLM_PROVIDER: "openai" (default), "anthropic", "gemini", "ollama", "mistral"
- LLM_PROVIDER: "openai" (default), "anthropic", "gemini", "ollama", "mistral", "bedrock"
- LLM_MODEL: Model name (default: "gpt-5-mini")
- DEFAULT_USER_EMAIL: Custom default user email
- DEFAULT_USER_PASSWORD: Custom default user password

View file

@ -53,6 +53,7 @@ async def cognify(
custom_prompt: Optional[str] = None,
temporal_cognify: bool = False,
data_per_batch: int = 20,
**kwargs,
):
"""
Transform ingested data into a structured knowledge graph.
@ -223,6 +224,7 @@ async def cognify(
config=config,
custom_prompt=custom_prompt,
chunks_per_batch=chunks_per_batch,
**kwargs,
)
# By calling get pipeline executor we get a function that will have the run_pipeline run in the background or a function that we will need to wait for
@ -251,6 +253,7 @@ async def get_default_tasks( # TODO: Find out a better way to do this (Boris's
config: Config = None,
custom_prompt: Optional[str] = None,
chunks_per_batch: int = 100,
**kwargs,
) -> list[Task]:
if config is None:
ontology_config = get_ontology_env_config()
@ -288,6 +291,7 @@ async def get_default_tasks( # TODO: Find out a better way to do this (Boris's
config=config,
custom_prompt=custom_prompt,
task_config={"batch_size": chunks_per_batch},
**kwargs,
), # Generate knowledge graphs from the document chunks.
Task(
summarize_text,

View file

@ -42,7 +42,9 @@ class CognifyPayloadDTO(InDTO):
default="", description="Custom prompt for entity extraction and graph generation"
)
ontology_key: Optional[List[str]] = Field(
default=None, description="Reference to one or more previously uploaded ontologies"
default=None,
examples=[[]],
description="Reference to one or more previously uploaded ontologies",
)

View file

@ -208,14 +208,14 @@ def get_datasets_router() -> APIRouter:
},
)
from cognee.modules.data.methods import get_dataset, delete_dataset
from cognee.modules.data.methods import delete_dataset
dataset = await get_dataset(user.id, dataset_id)
dataset = await get_authorized_existing_datasets([dataset_id], "delete", user)
if dataset is None:
raise DatasetNotFoundError(message=f"Dataset ({str(dataset_id)}) not found.")
await delete_dataset(dataset)
await delete_dataset(dataset[0])
@router.delete(
"/{dataset_id}/data/{data_id}",

View file

@ -1,4 +1,4 @@
from fastapi import APIRouter, File, Form, UploadFile, Depends, HTTPException
from fastapi import APIRouter, File, Form, UploadFile, Depends, Request
from fastapi.responses import JSONResponse
from typing import Optional, List
@ -15,28 +15,25 @@ def get_ontology_router() -> APIRouter:
@router.post("", response_model=dict)
async def upload_ontology(
request: Request,
ontology_key: str = Form(...),
ontology_file: List[UploadFile] = File(...),
descriptions: Optional[str] = Form(None),
ontology_file: UploadFile = File(...),
description: Optional[str] = Form(None),
user: User = Depends(get_authenticated_user),
):
"""
Upload ontology files with their respective keys for later use in cognify operations.
Supports both single and multiple file uploads:
- Single file: ontology_key=["key"], ontology_file=[file]
- Multiple files: ontology_key=["key1", "key2"], ontology_file=[file1, file2]
Upload a single ontology file for later use in cognify operations.
## Request Parameters
- **ontology_key** (str): JSON array string of user-defined identifiers for the ontologies
- **ontology_file** (List[UploadFile]): OWL format ontology files
- **descriptions** (Optional[str]): JSON array string of optional descriptions
- **ontology_key** (str): User-defined identifier for the ontology.
- **ontology_file** (UploadFile): Single OWL format ontology file
- **description** (Optional[str]): Optional description for the ontology.
## Response
Returns metadata about uploaded ontologies including keys, filenames, sizes, and upload timestamps.
Returns metadata about the uploaded ontology including key, filename, size, and upload timestamp.
## Error Codes
- **400 Bad Request**: Invalid file format, duplicate keys, array length mismatches, file size exceeded
- **400 Bad Request**: Invalid file format, duplicate key, multiple files uploaded
- **500 Internal Server Error**: File system or processing errors
"""
send_telemetry(
@ -49,16 +46,22 @@ def get_ontology_router() -> APIRouter:
)
try:
import json
# Enforce: exactly one uploaded file for "ontology_file"
form = await request.form()
uploaded_files = form.getlist("ontology_file")
if len(uploaded_files) != 1:
raise ValueError("Only one ontology_file is allowed")
ontology_keys = json.loads(ontology_key)
description_list = json.loads(descriptions) if descriptions else None
if ontology_key.strip().startswith(("[", "{")):
raise ValueError("ontology_key must be a string")
if description is not None and description.strip().startswith(("[", "{")):
raise ValueError("description must be a string")
if not isinstance(ontology_keys, list):
raise ValueError("ontology_key must be a JSON array")
results = await ontology_service.upload_ontologies(
ontology_keys, ontology_file, user, description_list
result = await ontology_service.upload_ontology(
ontology_key=ontology_key,
file=ontology_file,
user=user,
description=description,
)
return {
@ -70,10 +73,9 @@ def get_ontology_router() -> APIRouter:
"uploaded_at": result.uploaded_at,
"description": result.description,
}
for result in results
]
}
except (json.JSONDecodeError, ValueError) as e:
except ValueError as e:
return JSONResponse(status_code=400, content={"error": str(e)})
except Exception as e:
return JSONResponse(status_code=500, content={"error": str(e)})

View file

@ -33,6 +33,7 @@ async def search(
session_id: Optional[str] = None,
wide_search_top_k: Optional[int] = 100,
triplet_distance_penalty: Optional[float] = 3.5,
verbose: bool = False,
) -> Union[List[SearchResult], CombinedSearchResult]:
"""
Search and query the knowledge graph for insights, information, and connections.
@ -123,6 +124,8 @@ async def search(
session_id: Optional session identifier for caching Q&A interactions. Defaults to 'default_session' if None.
verbose: If True, returns detailed result information including graph representation (when possible).
Returns:
list: Search results in format determined by query_type:
@ -204,6 +207,7 @@ async def search(
session_id=session_id,
wide_search_top_k=wide_search_top_k,
triplet_distance_penalty=triplet_distance_penalty,
verbose=verbose,
)
return filtered_search_results

View file

@ -1,2 +1,4 @@
from .get_or_create_dataset_database import get_or_create_dataset_database
from .resolve_dataset_database_connection_info import resolve_dataset_database_connection_info
from .get_graph_dataset_database_handler import get_graph_dataset_database_handler
from .get_vector_dataset_database_handler import get_vector_dataset_database_handler

View file

@ -0,0 +1,10 @@
from cognee.modules.users.models.DatasetDatabase import DatasetDatabase
def get_graph_dataset_database_handler(dataset_database: DatasetDatabase) -> dict:
from cognee.infrastructure.databases.dataset_database_handler.supported_dataset_database_handlers import (
supported_dataset_database_handlers,
)
handler = supported_dataset_database_handlers[dataset_database.graph_dataset_database_handler]
return handler

View file

@ -0,0 +1,10 @@
from cognee.modules.users.models.DatasetDatabase import DatasetDatabase
def get_vector_dataset_database_handler(dataset_database: DatasetDatabase) -> dict:
from cognee.infrastructure.databases.dataset_database_handler.supported_dataset_database_handlers import (
supported_dataset_database_handlers,
)
handler = supported_dataset_database_handlers[dataset_database.vector_dataset_database_handler]
return handler

View file

@ -1,24 +1,12 @@
from cognee.infrastructure.databases.utils.get_graph_dataset_database_handler import (
get_graph_dataset_database_handler,
)
from cognee.infrastructure.databases.utils.get_vector_dataset_database_handler import (
get_vector_dataset_database_handler,
)
from cognee.modules.users.models.DatasetDatabase import DatasetDatabase
async def _get_vector_db_connection_info(dataset_database: DatasetDatabase) -> DatasetDatabase:
from cognee.infrastructure.databases.dataset_database_handler.supported_dataset_database_handlers import (
supported_dataset_database_handlers,
)
handler = supported_dataset_database_handlers[dataset_database.vector_dataset_database_handler]
return await handler["handler_instance"].resolve_dataset_connection_info(dataset_database)
async def _get_graph_db_connection_info(dataset_database: DatasetDatabase) -> DatasetDatabase:
from cognee.infrastructure.databases.dataset_database_handler.supported_dataset_database_handlers import (
supported_dataset_database_handlers,
)
handler = supported_dataset_database_handlers[dataset_database.graph_dataset_database_handler]
return await handler["handler_instance"].resolve_dataset_connection_info(dataset_database)
async def resolve_dataset_database_connection_info(
dataset_database: DatasetDatabase,
) -> DatasetDatabase:
@ -31,6 +19,12 @@ async def resolve_dataset_database_connection_info(
Returns:
DatasetDatabase instance with resolved connection info
"""
dataset_database = await _get_vector_db_connection_info(dataset_database)
dataset_database = await _get_graph_db_connection_info(dataset_database)
vector_dataset_database_handler = get_vector_dataset_database_handler(dataset_database)
graph_dataset_database_handler = get_graph_dataset_database_handler(dataset_database)
dataset_database = await vector_dataset_database_handler[
"handler_instance"
].resolve_dataset_connection_info(dataset_database)
dataset_database = await graph_dataset_database_handler[
"handler_instance"
].resolve_dataset_connection_info(dataset_database)
return dataset_database

View file

@ -9,6 +9,8 @@ class S3Config(BaseSettings):
aws_access_key_id: Optional[str] = None
aws_secret_access_key: Optional[str] = None
aws_session_token: Optional[str] = None
aws_profile_name: Optional[str] = None
aws_bedrock_runtime_endpoint: Optional[str] = None
model_config = SettingsConfigDict(env_file=".env", extra="allow")

View file

@ -11,7 +11,7 @@ class LLMGateway:
@staticmethod
def acreate_structured_output(
text_input: str, system_prompt: str, response_model: Type[BaseModel]
text_input: str, system_prompt: str, response_model: Type[BaseModel], **kwargs
) -> Coroutine:
llm_config = get_llm_config()
if llm_config.structured_output_framework.upper() == "BAML":
@ -31,7 +31,10 @@ class LLMGateway:
llm_client = get_llm_client()
return llm_client.acreate_structured_output(
text_input=text_input, system_prompt=system_prompt, response_model=response_model
text_input=text_input,
system_prompt=system_prompt,
response_model=response_model,
**kwargs,
)
@staticmethod

View file

@ -10,7 +10,7 @@ from cognee.infrastructure.llm.config import (
async def extract_content_graph(
content: str, response_model: Type[BaseModel], custom_prompt: Optional[str] = None
content: str, response_model: Type[BaseModel], custom_prompt: Optional[str] = None, **kwargs
):
if custom_prompt:
system_prompt = custom_prompt
@ -30,7 +30,7 @@ async def extract_content_graph(
system_prompt = render_prompt(prompt_path, {}, base_directory=base_directory)
content_graph = await LLMGateway.acreate_structured_output(
content, system_prompt, response_model
content, system_prompt, response_model, **kwargs
)
return content_graph

View file

@ -52,7 +52,7 @@ class AnthropicAdapter(LLMInterface):
reraise=True,
)
async def acreate_structured_output(
self, text_input: str, system_prompt: str, response_model: Type[BaseModel]
self, text_input: str, system_prompt: str, response_model: Type[BaseModel], **kwargs
) -> BaseModel:
"""
Generate a response from a user query.

View file

@ -0,0 +1,5 @@
"""Bedrock LLM adapter module."""
from .adapter import BedrockAdapter
__all__ = ["BedrockAdapter"]

View file

@ -0,0 +1,153 @@
import litellm
import instructor
from typing import Type
from pydantic import BaseModel
from litellm.exceptions import ContentPolicyViolationError
from instructor.exceptions import InstructorRetryException
from cognee.infrastructure.llm.LLMGateway import LLMGateway
from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.llm_interface import (
LLMInterface,
)
from cognee.infrastructure.llm.exceptions import (
ContentPolicyFilterError,
MissingSystemPromptPathError,
)
from cognee.infrastructure.files.storage.s3_config import get_s3_config
from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.rate_limiter import (
rate_limit_async,
rate_limit_sync,
sleep_and_retry_async,
sleep_and_retry_sync,
)
from cognee.modules.observability.get_observe import get_observe
observe = get_observe()
class BedrockAdapter(LLMInterface):
"""
Adapter for AWS Bedrock API with support for three authentication methods:
1. API Key (Bearer Token)
2. AWS Credentials (access key + secret key)
3. AWS Profile (boto3 credential chain)
"""
name = "Bedrock"
model: str
api_key: str
default_instructor_mode = "json_schema_mode"
MAX_RETRIES = 5
def __init__(
self,
model: str,
api_key: str = None,
max_completion_tokens: int = 16384,
streaming: bool = False,
instructor_mode: str = None,
):
self.instructor_mode = instructor_mode if instructor_mode else self.default_instructor_mode
self.aclient = instructor.from_litellm(
litellm.acompletion, mode=instructor.Mode(self.instructor_mode)
)
self.client = instructor.from_litellm(litellm.completion)
self.model = model
self.api_key = api_key
self.max_completion_tokens = max_completion_tokens
self.streaming = streaming
def _create_bedrock_request(
self, text_input: str, system_prompt: str, response_model: Type[BaseModel]
) -> dict:
"""Create Bedrock request with authentication."""
request_params = {
"model": self.model,
"custom_llm_provider": "bedrock",
"drop_params": True,
"messages": [
{"role": "user", "content": text_input},
{"role": "system", "content": system_prompt},
],
"response_model": response_model,
"max_retries": self.MAX_RETRIES,
"max_completion_tokens": self.max_completion_tokens,
"stream": self.streaming,
}
s3_config = get_s3_config()
# Add authentication parameters
if self.api_key:
request_params["api_key"] = self.api_key
elif s3_config.aws_access_key_id and s3_config.aws_secret_access_key:
request_params["aws_access_key_id"] = s3_config.aws_access_key_id
request_params["aws_secret_access_key"] = s3_config.aws_secret_access_key
if s3_config.aws_session_token:
request_params["aws_session_token"] = s3_config.aws_session_token
elif s3_config.aws_profile_name:
request_params["aws_profile_name"] = s3_config.aws_profile_name
if s3_config.aws_region:
request_params["aws_region_name"] = s3_config.aws_region
# Add optional parameters
if s3_config.aws_bedrock_runtime_endpoint:
request_params["aws_bedrock_runtime_endpoint"] = s3_config.aws_bedrock_runtime_endpoint
return request_params
@observe(as_type="generation")
@sleep_and_retry_async()
@rate_limit_async
async def acreate_structured_output(
self, text_input: str, system_prompt: str, response_model: Type[BaseModel]
) -> BaseModel:
"""Generate structured output from AWS Bedrock API."""
try:
request_params = self._create_bedrock_request(text_input, system_prompt, response_model)
return await self.aclient.chat.completions.create(**request_params)
except (
ContentPolicyViolationError,
InstructorRetryException,
) as error:
if (
isinstance(error, InstructorRetryException)
and "content management policy" not in str(error).lower()
):
raise error
raise ContentPolicyFilterError(
f"The provided input contains content that is not aligned with our content policy: {text_input}"
)
@observe
@sleep_and_retry_sync()
@rate_limit_sync
def create_structured_output(
self, text_input: str, system_prompt: str, response_model: Type[BaseModel]
) -> BaseModel:
"""Generate structured output from AWS Bedrock API (synchronous)."""
request_params = self._create_bedrock_request(text_input, system_prompt, response_model)
return self.client.chat.completions.create(**request_params)
def show_prompt(self, text_input: str, system_prompt: str) -> str:
"""Format and display the prompt for a user query."""
if not text_input:
text_input = "No user input provided."
if not system_prompt:
raise MissingSystemPromptPathError()
system_prompt = LLMGateway.read_query_prompt(system_prompt)
formatted_prompt = (
f"""System Prompt:\n{system_prompt}\n\nUser Input:\n{text_input}\n"""
if system_prompt
else None
)
return formatted_prompt

View file

@ -80,7 +80,7 @@ class GeminiAdapter(LLMInterface):
reraise=True,
)
async def acreate_structured_output(
self, text_input: str, system_prompt: str, response_model: Type[BaseModel]
self, text_input: str, system_prompt: str, response_model: Type[BaseModel], **kwargs
) -> BaseModel:
"""
Generate a response from a user query.

View file

@ -80,7 +80,7 @@ class GenericAPIAdapter(LLMInterface):
reraise=True,
)
async def acreate_structured_output(
self, text_input: str, system_prompt: str, response_model: Type[BaseModel]
self, text_input: str, system_prompt: str, response_model: Type[BaseModel], **kwargs
) -> BaseModel:
"""
Generate a response from a user query.

View file

@ -24,6 +24,7 @@ class LLMProvider(Enum):
- CUSTOM: Represents a custom provider option.
- GEMINI: Represents the Gemini provider.
- MISTRAL: Represents the Mistral AI provider.
- BEDROCK: Represents the AWS Bedrock provider.
"""
OPENAI = "openai"
@ -32,6 +33,7 @@ class LLMProvider(Enum):
CUSTOM = "custom"
GEMINI = "gemini"
MISTRAL = "mistral"
BEDROCK = "bedrock"
def get_llm_client(raise_api_key_error: bool = True):
@ -154,7 +156,7 @@ def get_llm_client(raise_api_key_error: bool = True):
)
elif provider == LLMProvider.MISTRAL:
if llm_config.llm_api_key is None:
if llm_config.llm_api_key is None and raise_api_key_error:
raise LLMAPIKeyNotSetError()
from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.mistral.adapter import (
@ -169,5 +171,21 @@ def get_llm_client(raise_api_key_error: bool = True):
instructor_mode=llm_config.llm_instructor_mode.lower(),
)
elif provider == LLMProvider.BEDROCK:
# if llm_config.llm_api_key is None and raise_api_key_error:
# raise LLMAPIKeyNotSetError()
from cognee.infrastructure.llm.structured_output_framework.litellm_instructor.llm.bedrock.adapter import (
BedrockAdapter,
)
return BedrockAdapter(
model=llm_config.llm_model,
api_key=llm_config.llm_api_key,
max_completion_tokens=max_completion_tokens,
streaming=llm_config.llm_streaming,
instructor_mode=llm_config.llm_instructor_mode.lower(),
)
else:
raise UnsupportedLLMProviderError(provider)

View file

@ -69,7 +69,7 @@ class MistralAdapter(LLMInterface):
reraise=True,
)
async def acreate_structured_output(
self, text_input: str, system_prompt: str, response_model: Type[BaseModel]
self, text_input: str, system_prompt: str, response_model: Type[BaseModel], **kwargs
) -> BaseModel:
"""
Generate a response from the user query.

View file

@ -76,7 +76,7 @@ class OllamaAPIAdapter(LLMInterface):
reraise=True,
)
async def acreate_structured_output(
self, text_input: str, system_prompt: str, response_model: Type[BaseModel]
self, text_input: str, system_prompt: str, response_model: Type[BaseModel], **kwargs
) -> BaseModel:
"""
Generate a structured output from the LLM using the provided text and system prompt.
@ -123,7 +123,7 @@ class OllamaAPIAdapter(LLMInterface):
before_sleep=before_sleep_log(logger, logging.DEBUG),
reraise=True,
)
async def create_transcript(self, input_file: str) -> str:
async def create_transcript(self, input_file: str, **kwargs) -> str:
"""
Generate an audio transcript from a user query.
@ -162,7 +162,7 @@ class OllamaAPIAdapter(LLMInterface):
before_sleep=before_sleep_log(logger, logging.DEBUG),
reraise=True,
)
async def transcribe_image(self, input_file: str) -> str:
async def transcribe_image(self, input_file: str, **kwargs) -> str:
"""
Transcribe content from an image using base64 encoding.

View file

@ -112,7 +112,7 @@ class OpenAIAdapter(LLMInterface):
reraise=True,
)
async def acreate_structured_output(
self, text_input: str, system_prompt: str, response_model: Type[BaseModel]
self, text_input: str, system_prompt: str, response_model: Type[BaseModel], **kwargs
) -> BaseModel:
"""
Generate a response from a user query.
@ -154,6 +154,7 @@ class OpenAIAdapter(LLMInterface):
api_version=self.api_version,
response_model=response_model,
max_retries=self.MAX_RETRIES,
**kwargs,
)
except (
ContentFilterFinishReasonError,
@ -180,6 +181,7 @@ class OpenAIAdapter(LLMInterface):
# api_base=self.fallback_endpoint,
response_model=response_model,
max_retries=self.MAX_RETRIES,
**kwargs,
)
except (
ContentFilterFinishReasonError,
@ -205,7 +207,7 @@ class OpenAIAdapter(LLMInterface):
reraise=True,
)
def create_structured_output(
self, text_input: str, system_prompt: str, response_model: Type[BaseModel]
self, text_input: str, system_prompt: str, response_model: Type[BaseModel], **kwargs
) -> BaseModel:
"""
Generate a response from a user query.
@ -245,6 +247,7 @@ class OpenAIAdapter(LLMInterface):
api_version=self.api_version,
response_model=response_model,
max_retries=self.MAX_RETRIES,
**kwargs,
)
@retry(
@ -254,7 +257,7 @@ class OpenAIAdapter(LLMInterface):
before_sleep=before_sleep_log(logger, logging.DEBUG),
reraise=True,
)
async def create_transcript(self, input):
async def create_transcript(self, input, **kwargs):
"""
Generate an audio transcript from a user query.
@ -281,6 +284,7 @@ class OpenAIAdapter(LLMInterface):
api_base=self.endpoint,
api_version=self.api_version,
max_retries=self.MAX_RETRIES,
**kwargs,
)
return transcription
@ -292,7 +296,7 @@ class OpenAIAdapter(LLMInterface):
before_sleep=before_sleep_log(logger, logging.DEBUG),
reraise=True,
)
async def transcribe_image(self, input) -> BaseModel:
async def transcribe_image(self, input, **kwargs) -> BaseModel:
"""
Generate a transcription of an image from a user query.
@ -337,4 +341,5 @@ class OpenAIAdapter(LLMInterface):
api_version=self.api_version,
max_completion_tokens=300,
max_retries=self.MAX_RETRIES,
**kwargs,
)

View file

@ -5,6 +5,10 @@ from cognee.context_global_variables import backend_access_control_enabled
from cognee.infrastructure.databases.vector import get_vector_engine
from cognee.infrastructure.databases.graph.get_graph_engine import get_graph_engine
from cognee.infrastructure.databases.relational import get_relational_engine
from cognee.infrastructure.databases.utils import (
get_graph_dataset_database_handler,
get_vector_dataset_database_handler,
)
from cognee.shared.cache import delete_cache
from cognee.modules.users.models import DatasetDatabase
from cognee.shared.logging_utils import get_logger
@ -13,22 +17,13 @@ logger = get_logger()
async def prune_graph_databases():
async def _prune_graph_db(dataset_database: DatasetDatabase) -> dict:
from cognee.infrastructure.databases.dataset_database_handler.supported_dataset_database_handlers import (
supported_dataset_database_handlers,
)
handler = supported_dataset_database_handlers[
dataset_database.graph_dataset_database_handler
]
return await handler["handler_instance"].delete_dataset(dataset_database)
db_engine = get_relational_engine()
try:
data = await db_engine.get_all_data_from_table("dataset_database")
dataset_databases = await db_engine.get_all_data_from_table("dataset_database")
# Go through each dataset database and delete the graph database
for data_item in data:
await _prune_graph_db(data_item)
for dataset_database in dataset_databases:
handler = get_graph_dataset_database_handler(dataset_database)
await handler["handler_instance"].delete_dataset(dataset_database)
except (OperationalError, EntityNotFoundError) as e:
logger.debug(
"Skipping pruning of graph DB. Error when accessing dataset_database table: %s",
@ -38,22 +33,13 @@ async def prune_graph_databases():
async def prune_vector_databases():
async def _prune_vector_db(dataset_database: DatasetDatabase) -> dict:
from cognee.infrastructure.databases.dataset_database_handler.supported_dataset_database_handlers import (
supported_dataset_database_handlers,
)
handler = supported_dataset_database_handlers[
dataset_database.vector_dataset_database_handler
]
return await handler["handler_instance"].delete_dataset(dataset_database)
db_engine = get_relational_engine()
try:
data = await db_engine.get_all_data_from_table("dataset_database")
dataset_databases = await db_engine.get_all_data_from_table("dataset_database")
# Go through each dataset database and delete the vector database
for data_item in data:
await _prune_vector_db(data_item)
for dataset_database in dataset_databases:
handler = get_vector_dataset_database_handler(dataset_database)
await handler["handler_instance"].delete_dataset(dataset_database)
except (OperationalError, EntityNotFoundError) as e:
logger.debug(
"Skipping pruning of vector DB. Error when accessing dataset_database table: %s",

View file

@ -1,8 +1,34 @@
from cognee.modules.users.models import DatasetDatabase
from sqlalchemy import select
from cognee.modules.data.models import Dataset
from cognee.infrastructure.databases.utils.get_vector_dataset_database_handler import (
get_vector_dataset_database_handler,
)
from cognee.infrastructure.databases.utils.get_graph_dataset_database_handler import (
get_graph_dataset_database_handler,
)
from cognee.infrastructure.databases.relational import get_relational_engine
async def delete_dataset(dataset: Dataset):
db_engine = get_relational_engine()
async with db_engine.get_async_session() as session:
stmt = select(DatasetDatabase).where(
DatasetDatabase.dataset_id == dataset.id,
)
dataset_database: DatasetDatabase = await session.scalar(stmt)
if dataset_database:
graph_dataset_database_handler = get_graph_dataset_database_handler(dataset_database)
vector_dataset_database_handler = get_vector_dataset_database_handler(dataset_database)
await graph_dataset_database_handler["handler_instance"].delete_dataset(
dataset_database
)
await vector_dataset_database_handler["handler_instance"].delete_dataset(
dataset_database
)
# TODO: Remove dataset from pipeline_run_status in Data objects related to dataset as well
# This blocks recreation of the dataset with the same name and data after deletion as
# it's marked as completed and will be just skipped even though it's empty.
return await db_engine.delete_entity_by_id(dataset.__tablename__, dataset.id)

View file

@ -15,3 +15,9 @@ async def setup():
"""
await create_relational_db_and_tables()
await create_pgvector_db_and_tables()
if __name__ == "__main__":
import asyncio
asyncio.run(setup())

View file

@ -49,6 +49,7 @@ async def search(
session_id: Optional[str] = None,
wide_search_top_k: Optional[int] = 100,
triplet_distance_penalty: Optional[float] = 3.5,
verbose: bool = False,
) -> Union[CombinedSearchResult, List[SearchResult]]:
"""
@ -140,6 +141,7 @@ async def search(
)
if use_combined_context:
# Note: combined context search must always be verbose and return a CombinedSearchResult with graphs info
prepared_search_results = await prepare_search_result(
search_results[0] if isinstance(search_results, list) else search_results
)
@ -173,25 +175,30 @@ async def search(
datasets = prepared_search_results["datasets"]
if only_context:
return_value.append(
{
"search_result": [context] if context else None,
"dataset_id": datasets[0].id,
"dataset_name": datasets[0].name,
"dataset_tenant_id": datasets[0].tenant_id,
"graphs": graphs,
}
)
search_result_dict = {
"search_result": [context] if context else None,
"dataset_id": datasets[0].id,
"dataset_name": datasets[0].name,
"dataset_tenant_id": datasets[0].tenant_id,
}
if verbose:
# Include graphs only in verbose mode
search_result_dict["graphs"] = graphs
return_value.append(search_result_dict)
else:
return_value.append(
{
"search_result": [result] if result else None,
"dataset_id": datasets[0].id,
"dataset_name": datasets[0].name,
"dataset_tenant_id": datasets[0].tenant_id,
"graphs": graphs,
}
)
search_result_dict = {
"search_result": [result] if result else None,
"dataset_id": datasets[0].id,
"dataset_name": datasets[0].name,
"dataset_tenant_id": datasets[0].tenant_id,
}
if verbose:
# Include graphs only in verbose mode
search_result_dict["graphs"] = graphs
return_value.append(search_result_dict)
return return_value
else:
return_value = []

View file

@ -16,6 +16,7 @@ class ModelName(Enum):
anthropic = "anthropic"
gemini = "gemini"
mistral = "mistral"
bedrock = "bedrock"
class LLMConfig(BaseModel):
@ -77,6 +78,10 @@ def get_settings() -> SettingsDict:
"value": "mistral",
"label": "Mistral",
},
{
"value": "bedrock",
"label": "Bedrock",
},
]
return SettingsDict.model_validate(
@ -157,6 +162,20 @@ def get_settings() -> SettingsDict:
"label": "Mistral Large 2.1",
},
],
"bedrock": [
{
"value": "eu.anthropic.claude-sonnet-4-5-20250929-v1:0",
"label": "Claude 4.5 Sonnet",
},
{
"value": "eu.anthropic.claude-haiku-4-5-20251001-v1:0",
"label": "Claude 4.5 Haiku",
},
{
"value": "eu.amazon.nova-lite-v1:0",
"label": "Amazon Nova Lite",
},
],
},
},
vector_db={

View file

@ -92,7 +92,7 @@ async def cognee_network_visualization(graph_data, destination_file_path: str =
}
links_list.append(link_data)
html_template = """
html_template = r"""
<!DOCTYPE html>
<html>
<head>

View file

@ -97,6 +97,7 @@ async def extract_graph_from_data(
graph_model: Type[BaseModel],
config: Config = None,
custom_prompt: Optional[str] = None,
**kwargs,
) -> List[DocumentChunk]:
"""
Extracts and integrates a knowledge graph from the text content of document chunks using a specified graph model.
@ -111,7 +112,7 @@ async def extract_graph_from_data(
chunk_graphs = await asyncio.gather(
*[
extract_content_graph(chunk.text, graph_model, custom_prompt=custom_prompt)
extract_content_graph(chunk.text, graph_model, custom_prompt=custom_prompt, **kwargs)
for chunk in data_chunks
]
)

View file

@ -1,5 +1,6 @@
from typing import AsyncGenerator, Dict, Any, List, Optional
from cognee.infrastructure.databases.graph.get_graph_engine import get_graph_engine
from cognee.modules.engine.utils import generate_node_id
from cognee.shared.logging_utils import get_logger
from cognee.modules.graph.utils.convert_node_to_data_point import get_all_subclasses
from cognee.infrastructure.engine import DataPoint
@ -155,7 +156,12 @@ def _process_single_triplet(
embeddable_text = f"{start_node_text}-{relationship_text}-{end_node_text}".strip()
triplet_obj = Triplet(from_node_id=start_node_id, to_node_id=end_node_id, text=embeddable_text)
relationship_name = relationship.get("relationship_name", "")
triplet_id = generate_node_id(str(start_node_id) + str(relationship_name) + str(end_node_id))
triplet_obj = Triplet(
id=triplet_id, from_node_id=start_node_id, to_node_id=end_node_id, text=embeddable_text
)
return triplet_obj, None

View file

@ -148,8 +148,8 @@ class TestCogneeServerStart(unittest.TestCase):
headers=headers,
files=[("ontology_file", ("test.owl", ontology_content, "application/xml"))],
data={
"ontology_key": json.dumps([ontology_key]),
"description": json.dumps(["Test ontology"]),
"ontology_key": ontology_key,
"description": "Test ontology",
},
)
self.assertEqual(ontology_response.status_code, 200)

View file

@ -0,0 +1,76 @@
import os
import asyncio
import pathlib
from uuid import UUID
import cognee
from cognee.shared.logging_utils import setup_logging, ERROR
from cognee.modules.data.methods.delete_dataset import delete_dataset
from cognee.modules.data.methods.get_dataset import get_dataset
from cognee.modules.users.methods import get_default_user
async def main():
# Set data and system directory paths
data_directory_path = str(
pathlib.Path(
os.path.join(pathlib.Path(__file__).parent, ".data_storage/test_dataset_delete")
).resolve()
)
cognee.config.data_root_directory(data_directory_path)
cognee_directory_path = str(
pathlib.Path(
os.path.join(pathlib.Path(__file__).parent, ".cognee_system/test_dataset_delete")
).resolve()
)
cognee.config.system_root_directory(cognee_directory_path)
# Create a clean slate for cognee -- reset data and system state
print("Resetting cognee data...")
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
print("Data reset complete.\n")
# cognee knowledge graph will be created based on this text
text = """
Natural language processing (NLP) is an interdisciplinary
subfield of computer science and information retrieval.
"""
# Add the text, and make it available for cognify
await cognee.add(text, "nlp_dataset")
await cognee.add("Quantum computing is the study of quantum computers.", "quantum_dataset")
# Use LLMs and cognee to create knowledge graph
ret_val = await cognee.cognify()
user = await get_default_user()
for val in ret_val:
dataset_id = str(val)
vector_db_path = os.path.join(
cognee_directory_path, "databases", str(user.id), dataset_id + ".lance.db"
)
graph_db_path = os.path.join(
cognee_directory_path, "databases", str(user.id), dataset_id + ".pkl"
)
# Check if databases are properly created and exist before deletion
assert os.path.exists(graph_db_path), "Graph database file not found."
assert os.path.exists(vector_db_path), "Vector database file not found."
dataset = await get_dataset(user_id=user.id, dataset_id=UUID(dataset_id))
await delete_dataset(dataset)
# Confirm databases have been deleted
assert not os.path.exists(graph_db_path), "Graph database file found."
assert not os.path.exists(vector_db_path), "Vector database file found."
if __name__ == "__main__":
logger = setup_logging(log_level=ERROR)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(main())
finally:
loop.run_until_complete(loop.shutdown_asyncgens())

View file

@ -1,17 +1,28 @@
import pytest
import uuid
from fastapi.testclient import TestClient
from unittest.mock import patch, Mock, AsyncMock
from unittest.mock import Mock
from types import SimpleNamespace
import importlib
from cognee.api.client import app
from cognee.modules.users.methods import get_authenticated_user
gau_mod = importlib.import_module("cognee.modules.users.methods.get_authenticated_user")
@pytest.fixture(scope="session")
def test_client():
# Keep a single TestClient (and event loop) for the whole module.
# Re-creating TestClient repeatedly can break async DB connections (asyncpg loop mismatch).
with TestClient(app) as c:
yield c
@pytest.fixture
def client():
return TestClient(app)
def client(test_client, mock_default_user):
async def override_get_authenticated_user():
return mock_default_user
app.dependency_overrides[get_authenticated_user] = override_get_authenticated_user
yield test_client
app.dependency_overrides.pop(get_authenticated_user, None)
@pytest.fixture
@ -32,12 +43,8 @@ def mock_default_user():
)
@patch.object(gau_mod, "get_default_user", new_callable=AsyncMock)
def test_upload_ontology_success(mock_get_default_user, client, mock_default_user):
def test_upload_ontology_success(client):
"""Test successful ontology upload"""
import json
mock_get_default_user.return_value = mock_default_user
ontology_content = (
b"<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'></rdf:RDF>"
)
@ -46,7 +53,7 @@ def test_upload_ontology_success(mock_get_default_user, client, mock_default_use
response = client.post(
"/api/v1/ontologies",
files=[("ontology_file", ("test.owl", ontology_content, "application/xml"))],
data={"ontology_key": json.dumps([unique_key]), "description": json.dumps(["Test"])},
data={"ontology_key": unique_key, "description": "Test"},
)
assert response.status_code == 200
@ -55,10 +62,8 @@ def test_upload_ontology_success(mock_get_default_user, client, mock_default_use
assert "uploaded_at" in data["uploaded_ontologies"][0]
@patch.object(gau_mod, "get_default_user", new_callable=AsyncMock)
def test_upload_ontology_invalid_file(mock_get_default_user, client, mock_default_user):
def test_upload_ontology_invalid_file(client):
"""Test 400 response for non-.owl files"""
mock_get_default_user.return_value = mock_default_user
unique_key = f"test_ontology_{uuid.uuid4().hex[:8]}"
response = client.post(
"/api/v1/ontologies",
@ -68,14 +73,10 @@ def test_upload_ontology_invalid_file(mock_get_default_user, client, mock_defaul
assert response.status_code == 400
@patch.object(gau_mod, "get_default_user", new_callable=AsyncMock)
def test_upload_ontology_missing_data(mock_get_default_user, client, mock_default_user):
def test_upload_ontology_missing_data(client):
"""Test 400 response for missing file or key"""
import json
mock_get_default_user.return_value = mock_default_user
# Missing file
response = client.post("/api/v1/ontologies", data={"ontology_key": json.dumps(["test"])})
response = client.post("/api/v1/ontologies", data={"ontology_key": "test"})
assert response.status_code == 400
# Missing key
@ -85,34 +86,25 @@ def test_upload_ontology_missing_data(mock_get_default_user, client, mock_defaul
assert response.status_code == 400
@patch.object(gau_mod, "get_default_user", new_callable=AsyncMock)
def test_upload_ontology_unauthorized(mock_get_default_user, client, mock_default_user):
"""Test behavior when default user is provided (no explicit authentication)"""
import json
def test_upload_ontology_without_auth_header(client):
"""Test behavior when no explicit authentication header is provided."""
unique_key = f"test_ontology_{uuid.uuid4().hex[:8]}"
mock_get_default_user.return_value = mock_default_user
response = client.post(
"/api/v1/ontologies",
files=[("ontology_file", ("test.owl", b"<rdf></rdf>", "application/xml"))],
data={"ontology_key": json.dumps([unique_key])},
data={"ontology_key": unique_key},
)
# The current system provides a default user when no explicit authentication is given
# This test verifies the system works with conditional authentication
assert response.status_code == 200
data = response.json()
assert data["uploaded_ontologies"][0]["ontology_key"] == unique_key
assert "uploaded_at" in data["uploaded_ontologies"][0]
@patch.object(gau_mod, "get_default_user", new_callable=AsyncMock)
def test_upload_multiple_ontologies(mock_get_default_user, client, mock_default_user):
"""Test uploading multiple ontology files in single request"""
def test_upload_multiple_ontologies_in_single_request_is_rejected(client):
"""Uploading multiple ontology files in a single request should fail."""
import io
mock_get_default_user.return_value = mock_default_user
# Create mock files
file1_content = b"<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'></rdf:RDF>"
file2_content = b"<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'></rdf:RDF>"
@ -120,45 +112,34 @@ def test_upload_multiple_ontologies(mock_get_default_user, client, mock_default_
("ontology_file", ("vehicles.owl", io.BytesIO(file1_content), "application/xml")),
("ontology_file", ("manufacturers.owl", io.BytesIO(file2_content), "application/xml")),
]
data = {
"ontology_key": '["vehicles", "manufacturers"]',
"descriptions": '["Base vehicles", "Car manufacturers"]',
}
data = {"ontology_key": "vehicles", "description": "Base vehicles"}
response = client.post("/api/v1/ontologies", files=files, data=data)
assert response.status_code == 200
result = response.json()
assert "uploaded_ontologies" in result
assert len(result["uploaded_ontologies"]) == 2
assert result["uploaded_ontologies"][0]["ontology_key"] == "vehicles"
assert result["uploaded_ontologies"][1]["ontology_key"] == "manufacturers"
assert response.status_code == 400
assert "Only one ontology_file is allowed" in response.json()["error"]
@patch.object(gau_mod, "get_default_user", new_callable=AsyncMock)
def test_upload_endpoint_accepts_arrays(mock_get_default_user, client, mock_default_user):
"""Test that upload endpoint accepts array parameters"""
def test_upload_endpoint_rejects_array_style_fields(client):
"""Array-style form values should be rejected (no backwards compatibility)."""
import io
import json
mock_get_default_user.return_value = mock_default_user
file_content = b"<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'></rdf:RDF>"
files = [("ontology_file", ("single.owl", io.BytesIO(file_content), "application/xml"))]
data = {
"ontology_key": json.dumps(["single_key"]),
"descriptions": json.dumps(["Single ontology"]),
"description": json.dumps(["Single ontology"]),
}
response = client.post("/api/v1/ontologies", files=files, data=data)
assert response.status_code == 200
result = response.json()
assert result["uploaded_ontologies"][0]["ontology_key"] == "single_key"
assert response.status_code == 400
assert "ontology_key must be a string" in response.json()["error"]
@patch.object(gau_mod, "get_default_user", new_callable=AsyncMock)
def test_cognify_with_multiple_ontologies(mock_get_default_user, client, mock_default_user):
def test_cognify_with_multiple_ontologies(client):
"""Test cognify endpoint accepts multiple ontology keys"""
payload = {
"datasets": ["test_dataset"],
@ -172,14 +153,11 @@ def test_cognify_with_multiple_ontologies(mock_get_default_user, client, mock_de
assert response.status_code in [200, 400, 409] # May fail for other reasons, not type
@patch.object(gau_mod, "get_default_user", new_callable=AsyncMock)
def test_complete_multifile_workflow(mock_get_default_user, client, mock_default_user):
"""Test complete workflow: upload multiple ontologies → cognify with multiple keys"""
def test_complete_multifile_workflow(client):
"""Test workflow: upload ontologies one-by-one → cognify with multiple keys"""
import io
import json
mock_get_default_user.return_value = mock_default_user
# Step 1: Upload multiple ontologies
# Step 1: Upload two ontologies (one-by-one)
file1_content = b"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#">
@ -192,17 +170,21 @@ def test_complete_multifile_workflow(mock_get_default_user, client, mock_default
<owl:Class rdf:ID="Manufacturer"/>
</rdf:RDF>"""
files = [
("ontology_file", ("vehicles.owl", io.BytesIO(file1_content), "application/xml")),
("ontology_file", ("manufacturers.owl", io.BytesIO(file2_content), "application/xml")),
]
data = {
"ontology_key": json.dumps(["vehicles", "manufacturers"]),
"descriptions": json.dumps(["Vehicle ontology", "Manufacturer ontology"]),
}
upload_response_1 = client.post(
"/api/v1/ontologies",
files=[("ontology_file", ("vehicles.owl", io.BytesIO(file1_content), "application/xml"))],
data={"ontology_key": "vehicles", "description": "Vehicle ontology"},
)
assert upload_response_1.status_code == 200
upload_response = client.post("/api/v1/ontologies", files=files, data=data)
assert upload_response.status_code == 200
upload_response_2 = client.post(
"/api/v1/ontologies",
files=[
("ontology_file", ("manufacturers.owl", io.BytesIO(file2_content), "application/xml"))
],
data={"ontology_key": "manufacturers", "description": "Manufacturer ontology"},
)
assert upload_response_2.status_code == 200
# Step 2: Verify ontologies are listed
list_response = client.get("/api/v1/ontologies")
@ -223,44 +205,42 @@ def test_complete_multifile_workflow(mock_get_default_user, client, mock_default
assert cognify_response.status_code != 400 # Not a validation error
@patch.object(gau_mod, "get_default_user", new_callable=AsyncMock)
def test_multifile_error_handling(mock_get_default_user, client, mock_default_user):
"""Test error handling for invalid multifile uploads"""
def test_upload_error_handling(client):
"""Test error handling for invalid uploads (single-file endpoint)."""
import io
import json
# Test mismatched array lengths
# Array-style key should be rejected
file_content = b"<rdf:RDF></rdf:RDF>"
files = [("ontology_file", ("test.owl", io.BytesIO(file_content), "application/xml"))]
data = {
"ontology_key": json.dumps(["key1", "key2"]), # 2 keys, 1 file
"descriptions": json.dumps(["desc1"]),
"ontology_key": json.dumps(["key1", "key2"]),
"description": "desc1",
}
response = client.post("/api/v1/ontologies", files=files, data=data)
assert response.status_code == 400
assert "Number of keys must match number of files" in response.json()["error"]
assert "ontology_key must be a string" in response.json()["error"]
# Test duplicate keys
files = [
("ontology_file", ("test1.owl", io.BytesIO(file_content), "application/xml")),
("ontology_file", ("test2.owl", io.BytesIO(file_content), "application/xml")),
]
data = {
"ontology_key": json.dumps(["duplicate", "duplicate"]),
"descriptions": json.dumps(["desc1", "desc2"]),
}
# Duplicate key should be rejected
response_1 = client.post(
"/api/v1/ontologies",
files=[("ontology_file", ("test1.owl", io.BytesIO(file_content), "application/xml"))],
data={"ontology_key": "duplicate", "description": "desc1"},
)
assert response_1.status_code == 200
response = client.post("/api/v1/ontologies", files=files, data=data)
assert response.status_code == 400
assert "Duplicate ontology keys not allowed" in response.json()["error"]
response_2 = client.post(
"/api/v1/ontologies",
files=[("ontology_file", ("test2.owl", io.BytesIO(file_content), "application/xml"))],
data={"ontology_key": "duplicate", "description": "desc2"},
)
assert response_2.status_code == 400
assert "already exists" in response_2.json()["error"]
@patch.object(gau_mod, "get_default_user", new_callable=AsyncMock)
def test_cognify_missing_ontology_key(mock_get_default_user, client, mock_default_user):
def test_cognify_missing_ontology_key(client):
"""Test cognify with non-existent ontology key"""
mock_get_default_user.return_value = mock_default_user
payload = {
"datasets": ["test_dataset"],
"ontology_key": ["nonexistent_key"],

View file

@ -0,0 +1,100 @@
import types
from uuid import uuid4
import pytest
from cognee.modules.search.types import SearchType
def _make_user(user_id: str = "u1", tenant_id=None):
return types.SimpleNamespace(id=user_id, tenant_id=tenant_id)
def _make_dataset(*, name="ds", tenant_id="t1", dataset_id=None, owner_id=None):
return types.SimpleNamespace(
id=dataset_id or uuid4(),
name=name,
tenant_id=tenant_id,
owner_id=owner_id or uuid4(),
)
@pytest.fixture
def search_mod():
import importlib
return importlib.import_module("cognee.modules.search.methods.search")
@pytest.fixture(autouse=True)
def _patch_side_effect_boundaries(monkeypatch, search_mod):
"""
Keep production logic; patch only unavoidable side-effect boundaries.
"""
async def dummy_log_query(_query_text, _query_type, _user_id):
return types.SimpleNamespace(id="qid-1")
async def dummy_log_result(*_args, **_kwargs):
return None
async def dummy_prepare_search_result(search_result):
if isinstance(search_result, tuple) and len(search_result) == 3:
result, context, datasets = search_result
return {"result": result, "context": context, "graphs": {}, "datasets": datasets}
return {"result": None, "context": None, "graphs": {}, "datasets": []}
monkeypatch.setattr(search_mod, "send_telemetry", lambda *a, **k: None)
monkeypatch.setattr(search_mod, "log_query", dummy_log_query)
monkeypatch.setattr(search_mod, "log_result", dummy_log_result)
monkeypatch.setattr(search_mod, "prepare_search_result", dummy_prepare_search_result)
yield
@pytest.mark.asyncio
async def test_search_access_control_returns_dataset_shaped_dicts(monkeypatch, search_mod):
user = _make_user()
ds = _make_dataset(name="ds1", tenant_id="t1")
async def dummy_authorized_search(**kwargs):
assert kwargs["dataset_ids"] == [ds.id]
return [("r", ["ctx"], [ds])]
monkeypatch.setattr(search_mod, "backend_access_control_enabled", lambda: True)
monkeypatch.setattr(search_mod, "authorized_search", dummy_authorized_search)
out_non_verbose = await search_mod.search(
query_text="q",
query_type=SearchType.CHUNKS,
dataset_ids=[ds.id],
user=user,
verbose=False,
)
assert out_non_verbose == [
{
"search_result": ["r"],
"dataset_id": ds.id,
"dataset_name": "ds1",
"dataset_tenant_id": "t1",
}
]
out_verbose = await search_mod.search(
query_text="q",
query_type=SearchType.CHUNKS,
dataset_ids=[ds.id],
user=user,
verbose=True,
)
assert out_verbose == [
{
"search_result": ["r"],
"dataset_id": ds.id,
"dataset_name": "ds1",
"dataset_tenant_id": "t1",
"graphs": {},
}
]

View file

@ -20,19 +20,29 @@ echo "HTTP port: $HTTP_PORT"
# smooth redeployments and container restarts while maintaining data integrity.
echo "Running database migrations..."
set +e # Disable exit on error to handle specific migration errors
MIGRATION_OUTPUT=$(alembic upgrade head)
MIGRATION_EXIT_CODE=$?
set -e
if [[ $MIGRATION_EXIT_CODE -ne 0 ]]; then
if [[ "$MIGRATION_OUTPUT" == *"UserAlreadyExists"* ]] || [[ "$MIGRATION_OUTPUT" == *"User default_user@example.com already exists"* ]]; then
echo "Warning: Default user already exists, continuing startup..."
else
echo "Migration failed with unexpected error."
exit 1
fi
fi
echo "Migration failed with unexpected error. Trying to run Cognee without migrations."
echo "Database migrations done."
echo "Initializing database tables..."
python /app/cognee/modules/engine/operations/setup.py
INIT_EXIT_CODE=$?
if [[ $INIT_EXIT_CODE -ne 0 ]]; then
echo "Database initialization failed!"
exit 1
fi
fi
else
echo "Database migrations done."
fi
echo "Starting server..."

View file

@ -1,8 +1,9 @@
import asyncio
import cognee
import os
from pprint import pprint
# By default cognee uses OpenAI's gpt-5-mini LLM model
# Provide your OpenAI LLM API KEY
os.environ["LLM_API_KEY"] = ""
@ -24,13 +25,13 @@ async def cognee_demo():
# Query Cognee for information from provided document
answer = await cognee.search("List me all the important characters in Alice in Wonderland.")
print(answer)
pprint(answer)
answer = await cognee.search("How did Alice end up in Wonderland?")
print(answer)
pprint(answer)
answer = await cognee.search("Tell me about Alice's personality.")
print(answer)
pprint(answer)
# Cognee is an async library, it has to be called in an async context

View file

@ -1,4 +1,5 @@
import asyncio
from pprint import pprint
import cognee
from cognee.api.v1.search import SearchType
@ -187,7 +188,7 @@ async def main(enable_steps):
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION, query_text="Who has experience in design tools?"
)
print(search_results)
pprint(search_results)
if __name__ == "__main__":

View file

@ -1,6 +1,8 @@
import os
import asyncio
import pathlib
from pprint import pprint
from cognee.shared.logging_utils import setup_logging, ERROR
import cognee
@ -42,7 +44,7 @@ async def main():
# Display search results
for result_text in search_results:
print(result_text)
pprint(result_text)
if __name__ == "__main__":

View file

@ -1,5 +1,6 @@
import asyncio
import os
from pprint import pprint
import cognee
from cognee.api.v1.search import SearchType
@ -77,7 +78,7 @@ async def main():
query_type=SearchType.GRAPH_COMPLETION,
query_text="What are the exact cars and their types produced by Audi?",
)
print(search_results)
pprint(search_results)
await visualize_graph()

View file

@ -1,6 +1,7 @@
import os
import cognee
import pathlib
from pprint import pprint
from cognee.modules.users.exceptions import PermissionDeniedError
from cognee.modules.users.tenants.methods import select_tenant
@ -86,7 +87,7 @@ async def main():
)
print("\nSearch results as user_1 on dataset owned by user_1:")
for result in search_results:
print(f"{result}\n")
pprint(result)
# But user_1 cant read the dataset owned by user_2 (QUANTUM dataset)
print("\nSearch result as user_1 on the dataset owned by user_2:")
@ -134,7 +135,7 @@ async def main():
dataset_ids=[quantum_dataset_id],
)
for result in search_results:
print(f"{result}\n")
pprint(result)
# If we'd like for user_1 to add new documents to the QUANTUM dataset owned by user_2, user_1 would have to get
# "write" access permission, which user_1 currently does not have
@ -217,7 +218,7 @@ async def main():
dataset_ids=[quantum_cognee_lab_dataset_id],
)
for result in search_results:
print(f"{result}\n")
pprint(result)
# Note: All of these function calls and permission system is available through our backend endpoints as well

View file

@ -1,4 +1,6 @@
import asyncio
from pprint import pprint
import cognee
from cognee.modules.engine.operations.setup import setup
from cognee.modules.users.methods import get_default_user
@ -71,7 +73,7 @@ async def main():
print("Search results:")
# Display results
for result_text in search_results:
print(result_text)
pprint(result_text)
if __name__ == "__main__":

View file

@ -1,4 +1,6 @@
import asyncio
from pprint import pprint
import cognee
from cognee.shared.logging_utils import setup_logging, ERROR
from cognee.api.v1.search import SearchType
@ -54,7 +56,7 @@ async def main():
print("Search results:")
# Display results
for result_text in search_results:
print(result_text)
pprint(result_text)
if __name__ == "__main__":

View file

@ -1,4 +1,5 @@
import asyncio
from pprint import pprint
import cognee
from cognee.shared.logging_utils import setup_logging, INFO
from cognee.api.v1.search import SearchType
@ -35,16 +36,16 @@ biography_1 = """
biography_2 = """
Arnulf Øverland Ole Peter Arnulf Øverland ( 27 April 1889 25 March 1968 ) was a Norwegian poet and artist . He is principally known for his poetry which served to inspire the Norwegian resistance movement during the German occupation of Norway during World War II .
Biography .
Øverland was born in Kristiansund and raised in Bergen . His parents were Peter Anton Øverland ( 18521906 ) and Hanna Hage ( 18541939 ) . The early death of his father , left the family economically stressed . He was able to attend Bergen Cathedral School and in 1904 Kristiania Cathedral School . He graduated in 1907 and for a time studied philology at University of Kristiania . Øverland published his first collection of poems ( 1911 ) .
Øverland became a communist sympathizer from the early 1920s and became a member of Mot Dag . He also served as chairman of the Norwegian Students Society 192328 . He changed his stand in 1937 , partly as an expression of dissent against the ongoing Moscow Trials . He was an avid opponent of Nazism and in 1936 he wrote the poem Du ikke sove which was printed in the journal Samtiden . It ends with . ( I thought: : Something is imminent . Our era is over Europes on fire! ) . Probably the most famous line of the poem is ( You mustnt endure so well the injustice that doesnt affect you yourself! )
During the German occupation of Norway from 1940 in World War II , he wrote to inspire the Norwegian resistance movement . He wrote a series of poems which were clandestinely distributed , leading to the arrest of both him and his future wife Margrete Aamot Øverland in 1941 . Arnulf Øverland was held first in the prison camp of Grini before being transferred to Sachsenhausen concentration camp in Germany . He spent a four-year imprisonment until the liberation of Norway in 1945 . His poems were later collected in Vi overlever alt and published in 1945 .
Øverland played an important role in the Norwegian language struggle in the post-war era . He became a noted supporter for the conservative written form of Norwegian called Riksmål , he was president of Riksmålsforbundet ( an organization in support of Riksmål ) from 1947 to 1956 . In addition , Øverland adhered to the traditionalist style of writing , criticising modernist poetry on several occasions . His speech Tungetale fra parnasset , published in Arbeiderbladet in 1954 , initiated the so-called Glossolalia debate .
Personal life .
In 1918 he had married the singer Hildur Arntzen ( 18881957 ) . Their marriage was dissolved in 1939 . In 1940 , he married Bartholine Eufemia Leganger ( 19031995 ) . They separated shortly after , and were officially divorced in 1945 . Øverland was married to journalist Margrete Aamot Øverland ( 19131978 ) during June 1945 . In 1946 , the Norwegian Parliament arranged for Arnulf and Margrete Aamot Øverland to reside at the Grotten . He lived there until his death in 1968 and she lived there for another ten years until her death in 1978 . Arnulf Øverland was buried at Vår Frelsers Gravlund in Oslo . Joseph Grimeland designed the bust of Arnulf Øverland ( bronze , 1970 ) at his grave site .
@ -56,7 +57,7 @@ biography_2 = """
- Vi overlever alt ( 1945 )
- Sverdet bak døren ( 1956 )
- Livets minutter ( 1965 )
Awards .
- Gyldendals Endowment ( 1935 )
- Dobloug Prize ( 1951 )
@ -87,7 +88,8 @@ async def main():
top_k=15,
)
print(f"Query: {query_text}")
print(f"Results: {search_results}\n")
print("Results:")
pprint(search_results)
if __name__ == "__main__":

View file

@ -1,4 +1,5 @@
import asyncio
from pprint import pprint
import cognee
from cognee.memify_pipelines.create_triplet_embeddings import create_triplet_embeddings
@ -65,7 +66,7 @@ async def main():
query_type=SearchType.TRIPLET_COMPLETION,
query_text="What are the models produced by Volkswagen based on the context?",
)
print(search_results)
pprint(search_results)
if __name__ == "__main__":

View file

@ -1,7 +1,7 @@
[project]
name = "cognee"
version = "0.5.0.dev0"
version = "0.5.1"
description = "Cognee - is a library for enriching LLM context with a semantic layer for better understanding and reasoning."
authors = [
{ name = "Vasilije Markovic" },

4
uv.lock generated
View file

@ -1,5 +1,5 @@
version = 1
revision = 2
revision = 3
requires-python = ">=3.10, <3.14"
resolution-markers = [
"python_full_version >= '3.13' and platform_python_implementation != 'PyPy' and sys_platform == 'darwin'",
@ -946,7 +946,7 @@ wheels = [
[[package]]
name = "cognee"
version = "0.5.0.dev0"
version = "0.5.1"
source = { editable = "." }
dependencies = [
{ name = "aiofiles" },