Merge branch 'dev' into fix/litellm-dimensions-handling

Commit 03b7923a3e by Stony, 2026-01-12 15:03:07 +08:00, committed via GitHub.
No known key found for this signature in database; GPG key ID: B5690EEEBB952194.
114 changed files with 1373 additions and 612 deletions.

@@ -3,7 +3,7 @@
language: en
early_access: false
enable_free_tier: true
reviews:
profile: chill
instructions: >-
# Code Review Instructions
@@ -118,10 +118,10 @@ reviews:
- E117
- D208
line_length: 100
dummy_variable_rgx: '^(_.*|junk|extra)$' # Variables starting with '_' or named 'junk' or 'extra' are considered dummy variables
markdownlint:
enabled: true
yamllint:
enabled: true
chat:
auto_reply: true

@@ -2,4 +2,8 @@
# Example:
# CORS_ALLOWED_ORIGINS="https://yourdomain.com,https://another.com"
# For local development, you might use:
# CORS_ALLOWED_ORIGINS="http://localhost:3000"
+LLM_API_KEY="your-openai-api-key"
+LLM_MODEL="openai/gpt-4o-mini"
+LLM_PROVIDER="openai"

@@ -28,4 +28,4 @@ secret-scan:
- path: 'docker-compose.yml'
comment: 'Development docker compose with test credentials (neo4j/pleaseletmein, postgres cognee/cognee)'
- path: 'deployment/helm/docker-compose-helm.yml'
comment: 'Helm deployment docker compose with test postgres credentials (cognee/cognee)'

@@ -8,7 +8,7 @@ body:
attributes:
value: |
Thanks for taking the time to fill out this bug report! Please provide a clear and detailed description.
- type: textarea
id: description
attributes:
@@ -17,7 +17,7 @@ body:
placeholder: Describe the bug in detail...
validations:
required: true
- type: textarea
id: reproduction
attributes:
@@ -29,7 +29,7 @@ body:
3. See error...
validations:
required: true
- type: textarea
id: expected
attributes:
@@ -38,7 +38,7 @@ body:
placeholder: Describe what you expected...
validations:
required: true
- type: textarea
id: actual
attributes:
@@ -47,7 +47,7 @@ body:
placeholder: Describe what actually happened...
validations:
required: true
- type: textarea
id: environment
attributes:
@@ -61,7 +61,7 @@ body:
- Database: [e.g. Neo4j]
validations:
required: true
- type: textarea
id: logs
attributes:
@@ -71,7 +71,7 @@ body:
render: shell
validations:
required: false
- type: textarea
id: additional
attributes:
@@ -80,7 +80,7 @@ body:
placeholder: Any additional information...
validations:
required: false
- type: checkboxes
id: checklist
attributes:

@@ -8,7 +8,7 @@ body:
attributes:
value: |
Thanks for helping improve our documentation! Please provide details about the documentation issue or improvement.
- type: dropdown
id: doc-type
attributes:
@@ -22,7 +22,7 @@ body:
- New documentation request
validations:
required: true
- type: textarea
id: location
attributes:
@@ -31,7 +31,7 @@ body:
placeholder: https://cognee.ai/docs/... or specific file/section
validations:
required: true
- type: textarea
id: issue
attributes:
@@ -40,7 +40,7 @@ body:
placeholder: The documentation is unclear about...
validations:
required: true
- type: textarea
id: suggestion
attributes:
@@ -49,7 +49,7 @@ body:
placeholder: I suggest changing this to...
validations:
required: false
- type: textarea
id: additional
attributes:
@@ -58,7 +58,7 @@ body:
placeholder: Additional context...
validations:
required: false
- type: checkboxes
id: checklist
attributes:
@@ -71,4 +71,3 @@ body:
required: true
- label: I have specified the location of the documentation issue
required: true

@@ -8,7 +8,7 @@ body:
attributes:
value: |
Thanks for suggesting a new feature! Please provide a clear and detailed description of your idea.
- type: textarea
id: problem
attributes:
@@ -17,7 +17,7 @@ body:
placeholder: I'm always frustrated when...
validations:
required: true
- type: textarea
id: solution
attributes:
@@ -26,7 +26,7 @@ body:
placeholder: I would like to see...
validations:
required: true
- type: textarea
id: alternatives
attributes:
@@ -35,7 +35,7 @@ body:
placeholder: I have also considered...
validations:
required: false
- type: textarea
id: use-case
attributes:
@@ -44,7 +44,7 @@ body:
placeholder: This feature would help me...
validations:
required: true
- type: textarea
id: implementation
attributes:
@@ -53,7 +53,7 @@ body:
placeholder: This could be implemented by...
validations:
required: false
- type: textarea
id: additional
attributes:
@@ -62,7 +62,7 @@ body:
placeholder: Additional context...
validations:
required: false
- type: checkboxes
id: checklist
attributes:
@@ -75,4 +75,3 @@ body:
required: true
- label: I have described my specific use case
required: true

@@ -34,14 +34,14 @@ runs:
-e NEO4J_apoc_export_file_enabled=true \
-e NEO4J_apoc_import_file_enabled=true \
neo4j:${{ inputs.neo4j-version }}
- name: Wait for Neo4j to be ready
shell: bash
run: |
echo "Waiting for Neo4j to start..."
timeout=60
counter=0
while [ $counter -lt $timeout ]; do
if docker exec neo4j-test cypher-shell -u neo4j -p "${{ inputs.neo4j-password }}" "RETURN 1" > /dev/null 2>&1; then
echo "Neo4j is ready!"
@@ -51,13 +51,13 @@ runs:
sleep 2
counter=$((counter + 2))
done
if [ $counter -ge $timeout ]; then
echo "Neo4j failed to start within $timeout seconds"
docker logs neo4j-test
exit 1
fi
- name: Verify GDS is available
shell: bash
run: |

@@ -8,5 +8,3 @@ lxobr
pazone
siillee
vasilije1990

@@ -10,26 +10,21 @@ DO NOT use AI-generated descriptions. We want to understand your thought process
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
-* Include instructions on how to verify the changes. Describe how to test it locally;
-* Proof that it's sufficiently tested.
-->
## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
-- [ ] Breaking change (fix or feature that would cause existing functionality to change)
-- [ ] Documentation update
- [ ] Code refactoring
-- [ ] Performance improvement
- [ ] Other (please specify):
-## Screenshots/Videos (if applicable)
+## Screenshots
-<!-- Add screenshots or videos to help explain your changes -->
+<!-- ADD SCREENSHOT OF LOCAL TESTS PASSING -->
## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
-- [ ] **I have tested my changes thoroughly before submitting this PR**
+- [ ] **I have tested my changes thoroughly before submitting this PR** (See `CONTRIBUTING.md`)
- [ ] **This PR contains minimal changes necessary to address the issue/feature**
- [ ] My code follows the project's coding standards and style guidelines
- [ ] I have added tests that prove my fix is effective or that my feature works

@@ -3,7 +3,7 @@ tag-template: 'v$NEXT_PATCH_VERSION'
categories:
- title: 'Features'
labels: ['feature', 'enhancement']
- title: 'Bug Fixes'
labels: ['bug', 'fix']
- title: 'Maintenance'

@@ -34,43 +34,6 @@ env:
ENV: 'dev'
jobs:
-lint:
-name: Run Linting
-runs-on: ubuntu-22.04
-steps:
-- name: Check out repository
-uses: actions/checkout@v4
-with:
-fetch-depth: 0
-- name: Cognee Setup
-uses: ./.github/actions/cognee_setup
-with:
-python-version: ${{ inputs.python-version }}
-- name: Run Linting
-uses: astral-sh/ruff-action@v2
-format-check:
-name: Run Formatting Check
-runs-on: ubuntu-22.04
-steps:
-- name: Check out repository
-uses: actions/checkout@v4
-with:
-fetch-depth: 0
-- name: Cognee Setup
-uses: ./.github/actions/cognee_setup
-with:
-python-version: ${{ inputs.python-version }}
-- name: Run Formatting Check
-uses: astral-sh/ruff-action@v2
-with:
-args: "format --check"
unit-tests:
name: Run Unit Tests
runs-on: ubuntu-22.04

@@ -31,54 +31,54 @@ WORKFLOWS=(
for workflow in "${WORKFLOWS[@]}"; do
if [ -f "$workflow" ]; then
echo "Processing $workflow..."
# Create a backup
cp "$workflow" "${workflow}.bak"
# Check if the file begins with a workflow_call trigger
if grep -q "workflow_call:" "$workflow"; then
echo "$workflow already has workflow_call trigger, skipping..."
continue
fi
# Get the content after the 'on:' section
on_line=$(grep -n "^on:" "$workflow" | cut -d ':' -f1)
if [ -z "$on_line" ]; then
echo "Warning: No 'on:' section found in $workflow, skipping..."
continue
fi
# Create a new file with the modified content
{
# Copy the part before 'on:'
head -n $((on_line-1)) "$workflow"
# Add the new on: section that only includes workflow_call
echo "on:"
echo " workflow_call:"
echo " secrets:"
echo " inherit: true"
# Find where to continue after the original 'on:' section
next_section=$(awk "NR > $on_line && /^[a-z]/ {print NR; exit}" "$workflow")
if [ -z "$next_section" ]; then
next_section=$(wc -l < "$workflow")
next_section=$((next_section+1))
fi
# Copy the rest of the file starting from the next section
tail -n +$next_section "$workflow"
} > "${workflow}.new"
# Replace the original with the new version
mv "${workflow}.new" "$workflow"
echo "Modified $workflow to only run when called from test-suites.yml"
else
echo "Warning: $workflow not found, skipping..."
fi
done
echo "Finished modifying workflows!"

@@ -45,4 +45,4 @@ jobs:
cache-to: type=registry,ref=cognee/cognee:buildcache,mode=max
- name: Image digest
run: echo ${{ steps.build.outputs.digest }}

@@ -72,5 +72,3 @@ jobs:
} catch (error) {
core.warning(`Failed to add label: ${error.message}`);
}

@@ -66,5 +66,3 @@ jobs:
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_S3_DEV_USER_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_S3_DEV_USER_SECRET_KEY }}
run: uv run python ./cognee/tests/test_load.py

@@ -5,7 +5,7 @@ permissions:
contents: read
jobs:
check-uv-lock:
-name: Validate uv lockfile and project metadata
+name: Lockfile and Pre-commit Hooks
runs-on: ubuntu-22.04
steps:
- name: Check out repository
@@ -17,6 +17,9 @@ jobs:
uses: astral-sh/setup-uv@v4
with:
enable-cache: true
- name: Validate uv lockfile and project metadata
run: uv lock --check || { echo "'uv lock --check' failed."; echo "Run 'uv lock' and push your changes."; exit 1; }
+- name: Run pre-commit hooks
+uses: pre-commit/action@v3.0.1

@@ -42,10 +42,10 @@ jobs:
echo "tag=${TAG}" >> "$GITHUB_OUTPUT"
echo "version=${VERSION}" >> "$GITHUB_OUTPUT"
git tag "${TAG}"
git push origin "${TAG}"
- name: Create GitHub Release
uses: softprops/action-gh-release@v2
@@ -54,8 +54,8 @@ jobs:
generate_release_notes: true
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
release-pypi-package:
needs: release-github
name: Release PyPI Package from ${{ inputs.flavour }}
permissions:
@@ -67,25 +67,25 @@ jobs:
uses: actions/checkout@v4
with:
ref: ${{ inputs.flavour }}
- name: Install uv
uses: astral-sh/setup-uv@v7
- name: Install Python
run: uv python install
- name: Install dependencies
run: uv sync --locked --all-extras
- name: Build distributions
run: uv build
- name: Publish ${{ inputs.flavour }} release to PyPI
env:
UV_PUBLISH_TOKEN: ${{ secrets.PYPI_TOKEN }}
run: uv publish
release-docker-image:
needs: release-github
name: Release Docker Image from ${{ inputs.flavour }}
permissions:
@@ -128,7 +128,7 @@ jobs:
context: .
platforms: linux/amd64,linux/arm64
push: true
tags: |
cognee/cognee:${{ needs.release-github.outputs.version }}
cognee/cognee:latest
labels: |
@@ -163,4 +163,4 @@ jobs:
-H "Authorization: Bearer ${{ secrets.REPO_DISPATCH_PAT_TOKEN }}" \
-H "X-GitHub-Api-Version: 2022-11-28" \
https://api.github.com/repos/topoteretes/cognee-community/dispatches \
-d '{"event_type":"new-main-release","client_payload":{"caller_repo":"'"${GITHUB_REPOSITORY}"'"}}'

@@ -15,4 +15,3 @@ jobs:
name: Load Tests
uses: ./.github/workflows/load_tests.yml
secrets: inherit

@@ -10,7 +10,7 @@ on:
required: false
type: string
default: '["3.10.x", "3.12.x", "3.13.x"]'
os:
required: false
type: string
default: '["ubuntu-22.04", "macos-15", "windows-latest"]'

@@ -173,4 +173,4 @@ jobs:
EMBEDDING_MODEL: "amazon.titan-embed-text-v2:0"
EMBEDDING_DIMENSIONS: "1024"
EMBEDDING_MAX_TOKENS: "8191"
run: uv run python ./examples/python/simple_example.py

@@ -18,11 +18,11 @@ env:
RUNTIME__LOG_LEVEL: ERROR
ENV: 'dev'
jobs:
pre-test:
name: basic checks
uses: ./.github/workflows/pre_test.yml
basic-tests:
name: Basic Tests
uses: ./.github/workflows/basic_tests.yml

.gitignore vendored

@@ -147,6 +147,8 @@ venv/
ENV/
env.bak/
venv.bak/
+mise.toml
+deployment/helm/values-local.yml
# Spyder project settings
.spyderproject

@@ -6,4 +6,4 @@ pull_request_rules:
actions:
backport:
branches:
- main

@@ -7,6 +7,7 @@ repos:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
+exclude: ^deployment/helm/templates/
- id: check-added-large-files
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.

@@ -128,5 +128,3 @@ MCP server and Frontend:
## CI Mirrors Local Commands
Our GitHub Actions run the same ruff checks and pytest suites shown above (`.github/workflows/basic_tests.yml` and related workflows). Use the commands in this document locally to minimize CI surprises.

CLAUDE.md Normal file

@@ -0,0 +1,588 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Cognee is an open-source AI memory platform that transforms raw data into persistent knowledge graphs for AI agents. It replaces traditional RAG (Retrieval-Augmented Generation) with an ECL (Extract, Cognify, Load) pipeline combining vector search, graph databases, and LLM-powered entity extraction.
**Requirements**: Python 3.9 - 3.12
## Development Commands
### Setup
```bash
# Create virtual environment (recommended: uv)
uv venv && source .venv/bin/activate
# Install with pip, poetry, or uv
uv pip install -e .
# Install with dev dependencies
uv pip install -e ".[dev]"
# Install with specific extras
uv pip install -e ".[postgres,neo4j,docs,chromadb]"
# Set up pre-commit hooks
pre-commit install
```
### Available Installation Extras
- **postgres** / **postgres-binary** - PostgreSQL + PGVector support
- **neo4j** - Neo4j graph database support
- **neptune** - AWS Neptune support
- **chromadb** - ChromaDB vector database
- **docs** - Document processing (unstructured library)
- **scraping** - Web scraping (Tavily, BeautifulSoup, Playwright)
- **langchain** - LangChain integration
- **llama-index** - LlamaIndex integration
- **anthropic** - Anthropic Claude models
- **gemini** - Google Gemini models
- **ollama** - Ollama local models
- **mistral** - Mistral AI models
- **groq** - Groq API support
- **llama-cpp** - Llama.cpp local inference
- **huggingface** - HuggingFace transformers
- **aws** - S3 storage backend
- **redis** - Redis caching
- **graphiti** - Graphiti-core integration
- **baml** - BAML structured output
- **dlt** - Data load tool (dlt) integration
- **docling** - Docling document processing
- **codegraph** - Code graph extraction
- **evals** - Evaluation tools
- **deepeval** - DeepEval testing framework
- **posthog** - PostHog analytics
- **monitoring** - Sentry + Langfuse observability
- **distributed** - Modal distributed execution
- **dev** - All development tools (pytest, mypy, ruff, etc.)
- **debug** - Debugpy for debugging
### Testing
```bash
# Run all tests
pytest
# Run with coverage
pytest --cov=cognee --cov-report=html
# Run specific test file
pytest cognee/tests/test_custom_model.py
# Run specific test function
pytest cognee/tests/test_custom_model.py::test_function_name
# Run async tests
pytest -v cognee/tests/integration/
# Run unit tests only
pytest cognee/tests/unit/
# Run integration tests only
pytest cognee/tests/integration/
```
### Code Quality
```bash
# Run ruff linter
ruff check .
# Run ruff formatter
ruff format .
# Run both linting and formatting (pre-commit)
pre-commit run --all-files
# Type checking with mypy
mypy cognee/
# Run pylint
pylint cognee/
```
### Running Cognee
```bash
# Using Python SDK
python examples/python/simple_example.py
# Using CLI
cognee-cli add "Your text here"
cognee-cli cognify
cognee-cli search "Your query"
cognee-cli delete --all
# Launch full stack with UI
cognee-cli -ui
```
## Architecture Overview
### Core Workflow: add → cognify → search/memify
1. **add()** - Ingest data (files, URLs, text) into datasets
2. **cognify()** - Extract entities/relationships and build knowledge graph
3. **search()** - Query knowledge using various retrieval strategies
4. **memify()** - Enrich graph with additional context and rules
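The four-verb workflow can be modeled with a toy in-memory class. This is purely illustrative: the real cognee API is async and backed by graph/vector databases, and `ToyMemory` and its "entity extraction" are invented for this sketch.

```python
# Illustrative only: a minimal in-memory model of add -> cognify -> search.
# Not the real cognee API; all names here are invented for the sketch.

class ToyMemory:
    def __init__(self) -> None:
        self.raw: list[str] = []              # raw ingested text (add)
        self.entities: dict[str, list[str]] = {}  # "graph": entity -> sources

    def add(self, text: str) -> None:
        """Ingest raw data into the dataset."""
        self.raw.append(text)

    def cognify(self) -> None:
        """Extract 'entities' (here: title-cased words) into the graph."""
        for text in self.raw:
            for word in text.split():
                if word.istitle():
                    self.entities.setdefault(word, []).append(text)

    def search(self, query: str) -> list[str]:
        """Retrieve source texts linked to a matching entity."""
        return self.entities.get(query, [])

memory = ToyMemory()
memory.add("Cognee builds knowledge graphs for Agents")
memory.cognify()
print(memory.search("Cognee"))
```

In the real system each stage is a pipeline of tasks and the "graph" lives in Kuzu/Neo4j, but the control flow mirrors this shape.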
### Key Architectural Patterns
#### 1. Pipeline-Based Processing
All data flows through task-based pipelines (`cognee/modules/pipelines/`). Tasks are composable units that can run sequentially or in parallel. Example pipeline tasks: `classify_documents`, `extract_graph_from_data`, `add_data_points`.
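The composition idea can be sketched in a few lines; `run_pipeline` and the toy task functions below are invented for illustration and are not the actual `cognee.modules.pipelines` API.

```python
# Sketch of sequential task composition, in the spirit of
# classify_documents -> extract_chunks -> add_data_points.
from typing import Any, Callable, Iterable

Task = Callable[[Any], Any]

def run_pipeline(tasks: Iterable[Task], data: Any) -> Any:
    """Feed each task's output into the next one."""
    for task in tasks:
        data = task(data)
    return data

# Toy stand-ins for real pipeline tasks:
classify = lambda docs: [d.lower() for d in docs]
chunk = lambda docs: [d.split() for d in docs]
count_tokens = lambda chunks: sum(len(c) for c in chunks)

result = run_pipeline([classify, chunk, count_tokens], ["Hello World", "Graph RAG"])
print(result)  # 4
```

The real pipelines are async and support parallel execution, but the composability contract is the same: each task consumes the previous task's output.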
#### 2. Interface-Based Database Adapters
Multiple backends are supported through adapter interfaces:
- **Graph**: Kuzu (default), Neo4j, Neptune via `GraphDBInterface`
- **Vector**: LanceDB (default), ChromaDB, PGVector via `VectorDBInterface`
- **Relational**: SQLite (default), PostgreSQL
Key files:
- `cognee/infrastructure/databases/graph/graph_db_interface.py`
- `cognee/infrastructure/databases/vector/vector_db_interface.py`
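The adapter pattern can be illustrated with a `typing.Protocol`. The real interfaces in the files above are async and far richer; the two methods and the in-memory backend below are assumptions made for this sketch.

```python
from typing import Protocol

class VectorDBSketch(Protocol):
    """Hedged sketch of a vector adapter interface (not the real one)."""
    def upsert(self, key: str, vector: list[float]) -> None: ...
    def nearest(self, vector: list[float]) -> str: ...

class InMemoryVectorDB:
    """Toy backend satisfying the interface via squared Euclidean distance."""
    def __init__(self) -> None:
        self._store: dict[str, list[float]] = {}

    def upsert(self, key: str, vector: list[float]) -> None:
        self._store[key] = vector

    def nearest(self, vector: list[float]) -> str:
        # Return the key whose stored vector is closest to the query vector.
        dist = lambda v: sum((a - b) ** 2 for a, b in zip(vector, v))
        return min(self._store, key=lambda k: dist(self._store[k]))

db: VectorDBSketch = InMemoryVectorDB()
db.upsert("cat", [1.0, 0.0])
db.upsert("dog", [0.0, 1.0])
print(db.nearest([0.9, 0.1]))  # cat
```

Because callers depend only on the interface, swapping LanceDB for ChromaDB or PGVector is a configuration change rather than a code change.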
#### 3. Multi-Tenant Access Control
User → Dataset → Data hierarchy with permission-based filtering. Enable with `ENABLE_BACKEND_ACCESS_CONTROL=True`. Each user+dataset combination can have isolated graph/vector databases (when using supported backends: Kuzu, LanceDB, SQLite, Postgres).
### Layer Structure
```
API Layer (cognee/api/v1/)
Main Functions (add, cognify, search, memify)
Pipeline Orchestrator (cognee/modules/pipelines/)
Task Execution Layer (cognee/tasks/)
Domain Modules (graph, retrieval, ingestion, etc.)
Infrastructure Adapters (LLM, databases)
External Services (OpenAI, Kuzu, LanceDB, etc.)
```
### Critical Data Flow Paths
#### ADD: Data Ingestion
`add()` → `resolve_data_directories` → `ingest_data` → `save_data_item_to_storage` → Create Dataset + Data records in relational DB
Key files: `cognee/api/v1/add/add.py`, `cognee/tasks/ingestion/ingest_data.py`
#### COGNIFY: Knowledge Graph Construction
`cognify()` → `classify_documents` → `extract_chunks_from_documents` → `extract_graph_from_data` (LLM extracts entities/relationships using Instructor) → `summarize_text` → `add_data_points` (store in graph + vector DBs)
Key files:
- `cognee/api/v1/cognify/cognify.py`
- `cognee/tasks/graph/extract_graph_from_data.py`
- `cognee/tasks/storage/add_data_points.py`
#### SEARCH: Retrieval
`search(query_text, query_type)` → route to retriever type → filter by permissions → return results
Available search types (from `cognee/modules/search/types/SearchType.py`):
- **GRAPH_COMPLETION** (default) - Graph traversal + LLM completion
- **GRAPH_SUMMARY_COMPLETION** - Uses pre-computed summaries with graph context
- **GRAPH_COMPLETION_COT** - Chain-of-thought reasoning over graph
- **GRAPH_COMPLETION_CONTEXT_EXTENSION** - Extended context graph retrieval
- **TRIPLET_COMPLETION** - Triplet-based (subject-predicate-object) search
- **RAG_COMPLETION** - Traditional RAG with chunks
- **CHUNKS** - Vector similarity search over chunks
- **CHUNKS_LEXICAL** - Lexical (keyword) search over chunks
- **SUMMARIES** - Search pre-computed document summaries
- **CYPHER** - Direct Cypher query execution (requires `ALLOW_CYPHER_QUERY=True`)
- **NATURAL_LANGUAGE** - Natural language to structured query
- **TEMPORAL** - Time-aware graph search
- **FEELING_LUCKY** - Automatic search type selection
- **FEEDBACK** - User feedback-based refinement
- **CODING_RULES** - Code-specific search rules
Key files:
- `cognee/api/v1/search/search.py`
- `cognee/modules/retrieval/context_providers/TripletSearchContextProvider.py`
- `cognee/modules/search/types/SearchType.py`
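Routing by search type boils down to a dispatch table. A toy version (the enum is abbreviated and the retriever lambdas are invented; the real retrievers live under `cognee/modules/retrieval/`):

```python
from enum import Enum

class SearchType(Enum):
    # Abbreviated; the full enum is in cognee/modules/search/types/SearchType.py
    GRAPH_COMPLETION = "graph_completion"
    CHUNKS = "chunks"
    SUMMARIES = "summaries"

# Toy retrievers standing in for the real ones:
RETRIEVERS = {
    SearchType.GRAPH_COMPLETION: lambda q: f"graph answer for {q!r}",
    SearchType.CHUNKS: lambda q: f"chunks matching {q!r}",
    SearchType.SUMMARIES: lambda q: f"summaries about {q!r}",
}

def search(query: str, query_type: SearchType = SearchType.GRAPH_COMPLETION) -> str:
    """Route the query to the retriever registered for its type."""
    return RETRIEVERS[query_type](query)

print(search("who founded cognee?", SearchType.CHUNKS))
```

In the real flow, permission filtering happens after routing and before results are returned.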
### Core Data Models
#### Engine Models (`cognee/infrastructure/engine/models/`)
- **DataPoint** - Base class for all graph nodes (versioned, with metadata)
- **Edge** - Graph relationships (source, target, relationship type)
- **Triplet** - (Subject, Predicate, Object) representation
#### Graph Models (`cognee/shared/data_models.py`)
- **KnowledgeGraph** - Container for nodes and edges
- **Node** - Entity (id, name, type, description)
- **Edge** - Relationship (source_node_id, target_node_id, relationship_name)
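The graph models above map naturally onto small dataclasses. This is a sketch mirroring the fields listed, not the actual model classes in `cognee/shared/data_models.py`:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    id: str
    name: str
    type: str
    description: str = ""

@dataclass
class Edge:
    source_node_id: str
    target_node_id: str
    relationship_name: str

@dataclass
class KnowledgeGraph:
    nodes: list[Node] = field(default_factory=list)
    edges: list[Edge] = field(default_factory=list)

kg = KnowledgeGraph()
kg.nodes += [Node("1", "Cognee", "Project"), Node("2", "Kuzu", "Database")]
kg.edges.append(Edge("1", "2", "uses"))
print(len(kg.nodes), kg.edges[0].relationship_name)  # 2 uses
```

During cognify, the LLM's structured output is parsed into containers of this shape before being persisted to the graph and vector stores.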
### Key Infrastructure Components
#### LLM Gateway (`cognee/infrastructure/llm/LLMGateway.py`)
Unified interface for multiple LLM providers: OpenAI, Anthropic, Gemini, Ollama, Mistral, Bedrock. Uses Instructor for structured output extraction.
#### Embedding Engines
Factory pattern for embeddings: `cognee/infrastructure/databases/vector/embeddings/get_embedding_engine.py`
#### Document Loaders
Support for PDF, DOCX, CSV, images, audio, code files in `cognee/infrastructure/files/`
## Important Configuration
### Environment Setup
Copy `.env.template` to `.env` and configure:
```bash
# Minimal setup (defaults to OpenAI + local file-based databases)
LLM_API_KEY="your_openai_api_key"
LLM_MODEL="openai/gpt-4o-mini" # Default model
```
**Important**: If you configure only LLM or only embeddings, the other defaults to OpenAI. Ensure you have a working OpenAI API key, or configure both to avoid unexpected defaults.
Default databases (no extra setup needed):
- **Relational**: SQLite (metadata and state storage)
- **Vector**: LanceDB (embeddings for semantic search)
- **Graph**: Kuzu (knowledge graph and relationships)
All are stored under `.venv` by default; override the locations with `DATA_ROOT_DIRECTORY` and `SYSTEM_ROOT_DIRECTORY`.
### Switching Databases
#### Relational Databases
```bash
# PostgreSQL (requires postgres extra: pip install cognee[postgres])
DB_PROVIDER=postgres
DB_HOST=localhost
DB_PORT=5432
DB_USERNAME=cognee
DB_PASSWORD=cognee
DB_NAME=cognee_db
```
#### Vector Databases
Supported: lancedb (default), pgvector, chromadb, qdrant, weaviate, milvus
```bash
# ChromaDB (requires chromadb extra)
VECTOR_DB_PROVIDER=chromadb
# PGVector (requires postgres extra)
VECTOR_DB_PROVIDER=pgvector
VECTOR_DB_URL=postgresql://cognee:cognee@localhost:5432/cognee_db
```
#### Graph Databases
Supported: kuzu (default), neo4j, neptune, kuzu-remote
```bash
# Neo4j (requires neo4j extra: pip install cognee[neo4j])
GRAPH_DATABASE_PROVIDER=neo4j
GRAPH_DATABASE_URL=bolt://localhost:7687
GRAPH_DATABASE_NAME=neo4j
GRAPH_DATABASE_USERNAME=neo4j
GRAPH_DATABASE_PASSWORD=yourpassword
# Remote Kuzu
GRAPH_DATABASE_PROVIDER=kuzu-remote
GRAPH_DATABASE_URL=http://localhost:8000
GRAPH_DATABASE_USERNAME=your_username
GRAPH_DATABASE_PASSWORD=your_password
```
### LLM Provider Configuration
Supported providers: OpenAI (default), Azure OpenAI, Google Gemini, Anthropic, AWS Bedrock, Ollama, LM Studio, Custom (OpenAI-compatible APIs)
#### OpenAI (Recommended - Minimal Setup)
```bash
LLM_API_KEY="your_openai_api_key"
LLM_MODEL="openai/gpt-4o-mini" # or gpt-4o, gpt-4-turbo, etc.
LLM_PROVIDER="openai"
```
#### Azure OpenAI
```bash
LLM_PROVIDER="azure"
LLM_MODEL="azure/gpt-4o-mini"
LLM_ENDPOINT="https://YOUR-RESOURCE.openai.azure.com/openai/deployments/gpt-4o-mini"
LLM_API_KEY="your_azure_api_key"
LLM_API_VERSION="2024-12-01-preview"
```
#### Google Gemini (requires gemini extra)
```bash
LLM_PROVIDER="gemini"
LLM_MODEL="gemini/gemini-2.0-flash-exp"
LLM_API_KEY="your_gemini_api_key"
```
#### Anthropic Claude (requires anthropic extra)
```bash
LLM_PROVIDER="anthropic"
LLM_MODEL="claude-3-5-sonnet-20241022"
LLM_API_KEY="your_anthropic_api_key"
```
#### Ollama (Local - requires ollama extra)
```bash
LLM_PROVIDER="ollama"
LLM_MODEL="llama3.1:8b"
LLM_ENDPOINT="http://localhost:11434/v1"
LLM_API_KEY="ollama"
EMBEDDING_PROVIDER="ollama"
EMBEDDING_MODEL="nomic-embed-text:latest"
EMBEDDING_ENDPOINT="http://localhost:11434/api/embed"
HUGGINGFACE_TOKENIZER="nomic-ai/nomic-embed-text-v1.5"
```
#### Custom / OpenRouter / vLLM
```bash
LLM_PROVIDER="custom"
LLM_MODEL="openrouter/google/gemini-2.0-flash-lite-preview-02-05:free"
LLM_ENDPOINT="https://openrouter.ai/api/v1"
LLM_API_KEY="your_api_key"
```
#### AWS Bedrock (requires aws extra)
```bash
LLM_PROVIDER="bedrock"
LLM_MODEL="anthropic.claude-3-sonnet-20240229-v1:0"
AWS_REGION="us-east-1"
AWS_ACCESS_KEY_ID="your_access_key"
AWS_SECRET_ACCESS_KEY="your_secret_key"
# Optional for temporary credentials:
# AWS_SESSION_TOKEN="your_session_token"
```
#### LLM Rate Limiting
```bash
LLM_RATE_LIMIT_ENABLED=true
LLM_RATE_LIMIT_REQUESTS=60 # Requests per interval
LLM_RATE_LIMIT_INTERVAL=60 # Interval in seconds
```
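Client-side rate limiting caps outbound LLM calls at `LLM_RATE_LIMIT_REQUESTS` per `LLM_RATE_LIMIT_INTERVAL` seconds. A minimal sliding-window sketch of the idea (illustrative only; cognee's internal limiter may be implemented differently):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` calls per `interval` seconds."""

    def __init__(self, max_requests=60, interval=60.0):
        self.max_requests = max_requests
        self.interval = interval
        self.calls = deque()  # timestamps of recent calls

    def try_acquire(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have fallen out of the window
        while self.calls and now - self.calls[0] >= self.interval:
            self.calls.popleft()
        if len(self.calls) < self.max_requests:
            self.calls.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(max_requests=2, interval=60.0)
print(limiter.try_acquire(now=0.0))   # True
print(limiter.try_acquire(now=1.0))   # True
print(limiter.try_acquire(now=2.0))   # False: window is full
print(limiter.try_acquire(now=61.0))  # True: first call expired
```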
#### Instructor Mode (Structured Output)
```bash
# LLM_INSTRUCTOR_MODE controls how structured data is extracted
# Each LLM has its own default (e.g., gpt-4o models use "json_schema_mode")
# Override if needed:
LLM_INSTRUCTOR_MODE="json_schema_mode" # or "tool_call", "md_json", etc.
```
### Structured Output Framework
```bash
# Use Instructor (default, via litellm)
STRUCTURED_OUTPUT_FRAMEWORK="instructor"
# Or use BAML (requires baml extra: pip install cognee[baml])
STRUCTURED_OUTPUT_FRAMEWORK="baml"
BAML_LLM_PROVIDER=openai
BAML_LLM_MODEL="gpt-4o-mini"
BAML_LLM_API_KEY="your_api_key"
```
### Storage Backend
```bash
# Local filesystem (default)
STORAGE_BACKEND="local"
# S3 (requires aws extra: pip install cognee[aws])
STORAGE_BACKEND="s3"
STORAGE_BUCKET_NAME="your-bucket-name"
AWS_REGION="us-east-1"
AWS_ACCESS_KEY_ID="your_access_key"
AWS_SECRET_ACCESS_KEY="your_secret_key"
DATA_ROOT_DIRECTORY="s3://your-bucket/cognee/data"
SYSTEM_ROOT_DIRECTORY="s3://your-bucket/cognee/system"
```
## Extension Points
### Adding New Functionality
1. **New Task Type**: Create task function in `cognee/tasks/`, return Task object, register in pipeline
2. **New Database Backend**: Implement `GraphDBInterface` or `VectorDBInterface` in `cognee/infrastructure/databases/`
3. **New LLM Provider**: Add configuration in LLM config (uses litellm)
4. **New Document Processor**: Extend loaders in `cognee/modules/data/processing/`
5. **New Search Type**: Add to `SearchType` enum and implement retriever in `cognee/modules/retrieval/`
6. **Custom Graph Models**: Define Pydantic models extending `DataPoint` in your code
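For item 6, a custom node type is essentially a typed model with a stable id and indexable fields. The sketch below uses a plain dataclass to illustrate the shape only; the real base class is cognee's Pydantic `DataPoint`, so the field names here (`id`, `metadata`, `index_fields`) are assumptions for illustration, not its exact API:

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class EntityNode:
    """Shape sketch of a custom graph node: a stable id plus typed fields."""
    name: str
    entity_type: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # index_fields marks which attributes should be embedded for vector search
    metadata: dict = field(default_factory=lambda: {"index_fields": ["name"]})

node = EntityNode(name="Ada Lovelace", entity_type="Person")
print(node.entity_type)               # Person
print(node.metadata["index_fields"])  # ['name']
```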
### Working with Ontologies
Cognee supports ontology-based entity extraction to ground knowledge graphs in standardized semantic frameworks (e.g., OWL ontologies).
Configuration:
```bash
ONTOLOGY_RESOLVER=rdflib # Default: uses rdflib and OWL files
MATCHING_STRATEGY=fuzzy # Default: fuzzy matching with 80% similarity
ONTOLOGY_FILE_PATH=/path/to/your/ontology.owl # Full path to ontology file
```
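The fuzzy strategy matches extracted entity names against ontology terms by string similarity. An illustrative sketch of an 80%-threshold match using the standard library (not cognee's actual matcher, which lives in `cognee/modules/ontology/`):

```python
import difflib

def fuzzy_match(candidate, ontology_terms, threshold=0.8):
    """Return the best ontology term whose similarity ratio meets the threshold."""
    best_term, best_score = None, 0.0
    for term in ontology_terms:
        score = difflib.SequenceMatcher(None, candidate.lower(), term.lower()).ratio()
        if score >= threshold and score > best_score:
            best_term, best_score = term, score
    return best_term

terms = ["Person", "Organization", "Location"]
print(fuzzy_match("person", terms))        # Person
print(fuzzy_match("organisation", terms))  # Organization (UK spelling still matches)
print(fuzzy_match("xyz", terms))           # None
```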
Implementation: `cognee/modules/ontology/`
## Branching Strategy
**IMPORTANT**: Always branch from `dev`, not `main`. The `dev` branch is the active development branch.
```bash
git checkout dev
git pull origin dev
git checkout -b feature/your-feature-name
```
## Code Style
- Ruff for linting and formatting (configured in `pyproject.toml`)
- Line length: 100 characters
- Pre-commit hooks run ruff automatically
- Type hints encouraged (mypy checks enabled)
## Testing Strategy
Tests are organized in `cognee/tests/`:
- `unit/` - Unit tests for individual modules
- `integration/` - Full pipeline integration tests
- `cli_tests/` - CLI command tests
- `tasks/` - Task-specific tests
When adding features, add corresponding tests. Integration tests should cover the full add → cognify → search flow.
## API Structure
FastAPI application with versioned routes under `cognee/api/v1/`:
- `/add` - Data ingestion
- `/cognify` - Knowledge graph processing
- `/search` - Query interface
- `/memify` - Graph enrichment
- `/datasets` - Dataset management
- `/users` - Authentication (if `REQUIRE_AUTHENTICATION=True`)
- `/visualize` - Graph visualization server
## Python SDK Entry Points
Main functions exported from `cognee/__init__.py`:
- `add(data, dataset_name)` - Ingest data
- `cognify(datasets)` - Build knowledge graph
- `search(query_text, query_type)` - Query knowledge
- `memify(extraction_tasks, enrichment_tasks)` - Enrich graph
- `delete(data_id)` - Remove data
- `config()` - Configuration management
- `datasets()` - Dataset operations
All functions are async - use `await` or `asyncio.run()`.
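Because these are coroutines, a plain script needs an event loop to call them. A self-contained sketch of the call pattern (the `add` stub below stands in for `cognee.add` so the sketch runs without cognee installed; the real function has the same async shape):

```python
import asyncio

async def add(data, dataset_name="main_dataset"):
    # Stub standing in for cognee.add; replace with `from cognee import add`
    return f"queued {len(data)} item(s) for dataset '{dataset_name}'"

async def main():
    result = await add(["notes.txt"], dataset_name="my_project")
    print(result)  # queued 1 item(s) for dataset 'my_project'

asyncio.run(main())
```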
## Security Considerations
Several security environment variables in `.env`:
- `ACCEPT_LOCAL_FILE_PATH` - Allow local file paths (default: True)
- `ALLOW_HTTP_REQUESTS` - Allow HTTP requests from Cognee (default: True)
- `ALLOW_CYPHER_QUERY` - Allow raw Cypher queries (default: True)
- `REQUIRE_AUTHENTICATION` - Enable API authentication (default: False)
- `ENABLE_BACKEND_ACCESS_CONTROL` - Multi-tenant isolation (default: True)
For production deployments, review and tighten these settings.
## Common Patterns
### Creating a Custom Pipeline Task
```python
from cognee.modules.pipelines.tasks.Task import Task

async def my_custom_task(data):
    # Your logic here; process() is a placeholder for your own function
    processed_data = process(data)
    return processed_data

# Use in pipeline
task = Task(my_custom_task)
```
### Accessing Databases Directly
```python
from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.infrastructure.databases.vector import get_vector_engine

graph_engine = await get_graph_engine()
vector_engine = await get_vector_engine()
```
### Using LLM Gateway
```python
from cognee.infrastructure.llm.get_llm_client import get_llm_client

llm_client = get_llm_client()
response = await llm_client.acreate_structured_output(
    text_input="Your prompt",
    system_prompt="System instructions",
    response_model=YourPydanticModel,
)
```
## Key Concepts
### Datasets
Datasets are project-level containers that support organization, permissions, and isolated processing workflows. Each user can have multiple datasets with different access permissions.
```python
# Create/use a dataset
await cognee.add(data, dataset_name="my_project")
await cognee.cognify(datasets=["my_project"])
```
### DataPoints
Atomic knowledge units that form the foundation of graph structures. All graph nodes extend the `DataPoint` base class with versioning and metadata support.
### Permissions System
Multi-tenant architecture with users, roles, and Access Control Lists (ACLs):
- Read, write, delete, and share permissions per dataset
- Enable with `ENABLE_BACKEND_ACCESS_CONTROL=True`
- Supports isolated databases per user+dataset (Kuzu, LanceDB, SQLite, Postgres)
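An illustrative sketch of how per-dataset ACL checks behave: a denied search returns an empty result rather than raising, which avoids leaking whether the dataset exists. This is a toy model, not cognee's implementation:

```python
# Toy ACL table: (user, dataset) -> set of granted permissions
PERMISSIONS = {
    ("alice", "my_project"): {"read", "write"},
    ("bob", "my_project"): {"read"},
}

def search(user, dataset, query):
    # Denied access yields an empty list instead of an error,
    # so callers cannot probe for datasets they cannot see.
    if "read" not in PERMISSIONS.get((user, dataset), set()):
        return []
    return [f"results for {query!r} in {dataset}"]

print(search("bob", "my_project", "NLP"))  # has read access
print(search("eve", "my_project", "NLP")) # no access: []
```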
### Graph Visualization
Launch visualization server:
```bash
# Via CLI
cognee-cli -ui # Launches full stack with UI at http://localhost:3000
# Via Python
from cognee.api.v1.visualize import start_visualization_server
await start_visualization_server(port=8080)
```
## Debugging & Troubleshooting
### Debug Configuration
- Set `LITELLM_LOG="DEBUG"` for verbose LLM logs (default: "ERROR")
- Enable debug mode: `ENV="development"` or `ENV="debug"`
- Disable telemetry: `TELEMETRY_DISABLED=1`
- Check logs in structured format (uses structlog)
- Use `debugpy` optional dependency for debugging: `pip install cognee[debug]`
### Common Issues
**Ollama + OpenAI Embeddings NoDataError**
- Issue: Mixing Ollama with OpenAI embeddings can cause errors
- Solution: Configure both LLM and embeddings to use the same provider, or ensure `HUGGINGFACE_TOKENIZER` is set when using Ollama
**LM Studio Structured Output**
- Issue: LM Studio requires explicit instructor mode
- Solution: Set `LLM_INSTRUCTOR_MODE="json_schema_mode"` (or appropriate mode)
**Default Provider Fallback**
- Issue: Configuring only LLM or only embeddings defaults the other to OpenAI
- Solution: Always configure both LLM and embedding providers, or ensure valid OpenAI API key
**Permission Denied on Search**
- Behavior: Returns empty list rather than error (prevents information leakage)
- Solution: Check dataset permissions and user access rights
**Database Connection Issues**
- Check: Verify database URLs, credentials, and that services are running
- Docker users: Use `DB_HOST=host.docker.internal` for local databases
**Rate Limiting Errors**
- Enable client-side rate limiting: `LLM_RATE_LIMIT_ENABLED=true`
- Adjust limits: `LLM_RATE_LIMIT_REQUESTS` and `LLM_RATE_LIMIT_INTERVAL`
## Resources
- [Documentation](https://docs.cognee.ai/)
- [Discord Community](https://discord.gg/NQPKmU5CCg)
- [GitHub Issues](https://github.com/topoteretes/cognee/issues)
- [Example Notebooks](examples/python/)
- [Research Paper](https://arxiv.org/abs/2505.24478) - Optimizing knowledge graphs for LLM reasoning

@@ -1,16 +1,16 @@
> [!IMPORTANT]
> **Note for contributors:** When branching out, create a new branch from the `dev` branch.
# 🎉 Welcome to **cognee**!
We're excited that you're interested in contributing to our project!
We want to ensure that every user and contributor feels welcome, included and supported to participate in cognee community.
This guide will help you get started and ensure your contributions can be efficiently integrated into the project.
## 🌟 Quick Links
- [Code of Conduct](CODE_OF_CONDUCT.md)
- [Discord Community](https://discord.gg/bcy8xFAtfd)
- [Issue Tracker](https://github.com/topoteretes/cognee/issues)
- [Cognee Docs](https://docs.cognee.ai)
@@ -62,6 +62,11 @@ Looking for a place to start? Try filtering for [good first issues](https://gith
## 2. 🛠️ Development Setup
+### Required tools
+* [Python](https://www.python.org/downloads/)
+* [uv](https://docs.astral.sh/uv/getting-started/installation/)
+* pre-commit: `uv run pip install pre-commit && pre-commit install`
### Fork and Clone
1. Fork the [**cognee**](https://github.com/topoteretes/cognee) repository
@@ -93,29 +98,31 @@ git checkout -b feature/your-feature-name
4. **Commits**: Write clear commit messages
### Running Tests
+Rename `.env.example` into `.env` and provide your OPENAI_API_KEY as LLM_API_KEY
```shell
-python cognee/cognee/tests/test_library.py
+uv run python cognee/tests/test_library.py
```
### Running Simple Example
-Change .env.example into .env and provide your OPENAI_API_KEY as LLM_API_KEY
+Rename `.env.example` into `.env` and provide your OPENAI_API_KEY as LLM_API_KEY
Make sure to run ```shell uv sync ``` in the root cloned folder or set up a virtual environment to run cognee
```shell
-python cognee/cognee/examples/python/simple_example.py
+python examples/python/simple_example.py
```
or
```shell
-uv run python cognee/cognee/examples/python/simple_example.py
+uv run python examples/python/simple_example.py
```
## 4. 📤 Submitting Changes
-1. Install ruff on your system
-2. Run ```ruff format .``` and ``` ruff check ``` and fix the issues
+1. Make sure that `pre-commit` and hooks are installed. See `Required tools` section for more information. Try executing `pre-commit run` if you are not sure.
3. Push your changes:
```shell
git add .

@@ -16,9 +16,6 @@ ARG DEBUG
# Set environment variable based on the build argument
ENV DEBUG=${DEBUG}
-# if you located in China, you can use aliyun mirror to speed up
-#RUN sed -i 's@deb.debian.org@mirrors.ustc.edu.cn@g' /etc/apt/sources.list.d/debian.sources
# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \

@@ -65,12 +65,12 @@ Use your data to build personalized and dynamic memory for AI Agents. Cognee let
## About Cognee
Cognee is an open-source tool and platform that transforms your raw data into persistent and dynamic AI memory for Agents. It combines vector search with graph databases to make your documents both searchable by meaning and connected by relationships.
You can use Cognee in two ways:
1. [Self-host Cognee Open Source](https://docs.cognee.ai/getting-started/installation), which stores all data locally by default.
2. [Connect to Cognee Cloud](https://platform.cognee.ai/), and get the same OSS stack on managed infrastructure for easier development and productionization.
### Cognee Open Source (self-hosted):
@@ -81,8 +81,8 @@ You can use Cognee in two ways:
- Offers high customizability through user-defined tasks, modular pipelines, and built-in search endpoints
### Cognee Cloud (managed):
- Hosted web UI dashboard
- Automatic version updates
- Resource usage analytics
- GDPR compliant, enterprise-grade security
@@ -119,7 +119,7 @@ To integrate other LLM providers, see our [LLM Provider Documentation](https://d
### Step 3: Run the Pipeline
Cognee will take your documents, generate a knowledge graph from them and then query the graph based on combined relationships.
Now, run a minimal pipeline:
@@ -157,7 +157,7 @@ As you can see, the output is generated from the document we previously stored i
Cognee turns documents into AI memory.
```
### Use the Cognee CLI
As an alternative, you can get started with these essential commands:

@@ -1 +1 @@
Generic single-database configuration with an async dbapi.

@@ -14,7 +14,7 @@ import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "1a58b986e6e1"
-down_revision: Union[str, None] = "46a6ce2bd2b2"
+down_revision: Union[str, None] = "e1ec1dcb50b6"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None

@@ -43,10 +43,10 @@ Saiba mais sobre os [casos de uso](https://docs.cognee.ai/use-cases) e [avaliaç
## Funcionalidades
- Conecte e recupere suas conversas passadas, documentos, imagens e transcrições de áudio
- Reduza alucinações, esforço de desenvolvimento e custos
- Carregue dados em bancos de dados de grafos e vetores usando apenas Pydantic
- Transforme e organize seus dados enquanto os coleta de mais de 30 fontes diferentes
## Primeiros Passos
@@ -108,7 +108,7 @@ if __name__ == '__main__':
Exemplo do output:
```
O Processamento de Linguagem Natural (NLP) é um campo interdisciplinar e transdisciplinar que envolve ciência da computação e recuperação de informações. Ele se concentra na interação entre computadores e a linguagem humana, permitindo que as máquinas compreendam e processem a linguagem natural.
```
Visualização do grafo:

@@ -141,7 +141,7 @@ if __name__ == '__main__':
2. Простая демонстрация GraphRAG
[Видео](https://github.com/user-attachments/assets/d80b0776-4eb9-4b8e-aa22-3691e2d44b8f)
3. Cognee с Ollama
[Видео](https://github.com/user-attachments/assets/8621d3e8-ecb8-4860-afb2-5594f2ee17db)
## Правила поведения

@@ -114,7 +114,7 @@ if __name__ == '__main__':
示例输出:
```
自然语言处理NLP是计算机科学和信息检索的跨学科领域。它关注计算机和人类语言之间的交互使机器能够理解和处理自然语言。
```
图形可视化:
<a href="https://rawcdn.githack.com/topoteretes/cognee/refs/heads/main/assets/graph_visualization.html"><img src="https://rawcdn.githack.com/topoteretes/cognee/refs/heads/main/assets/graph_visualization.png" width="100%" alt="图形可视化"></a>

@@ -13,7 +13,7 @@
"classnames": "^2.5.1",
"culori": "^4.0.1",
"d3-force-3d": "^3.0.6",
-"next": "^16.1.0",
+"next": "^16.1.7",
"react": "^19.2.3",
"react-dom": "^19.2.3",
"react-force-graph-2d": "^1.27.1",
@@ -34,4 +34,4 @@
"tailwindcss": "^4.1.7",
"typescript": "^5"
}
}

@@ -55,7 +55,7 @@ export default function CogneeAddWidget({ onData, useCloud = false }: CogneeAddW
setTrue: setProcessingFilesInProgress,
setFalse: setProcessingFilesDone,
} = useBoolean(false);
const handleAddFiles = (dataset: Dataset, event: ChangeEvent<HTMLInputElement>) => {
event.stopPropagation();
@@ -111,7 +111,7 @@ export default function GraphControls({ data, isAddNodeFormOpen, onGraphShapeCha
const [isAuthShapeChangeEnabled, setIsAuthShapeChangeEnabled] = useState(true);
const shapeChangeTimeout = useRef<number | null>(null);
useEffect(() => {
onGraphShapeChange(DEFAULT_GRAPH_SHAPE);
@@ -57,7 +57,7 @@ export default function GraphVisualization({ ref, data, graphControls, className
// Initial size calculation
handleResize();
// ResizeObserver
const resizeObserver = new ResizeObserver(() => {
handleResize();
});
@@ -216,7 +216,7 @@
}, [data, graphRef]);
const [graphShape, setGraphShape] = useState<string>();
const zoomToFit: ForceGraphMethods["zoomToFit"] = (
durationMs?: number,
padding?: number,
@@ -227,15 +227,15 @@
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return undefined as any;
}
return graphRef.current.zoomToFit?.(durationMs, padding, nodeFilter);
};
useImperativeHandle(ref, () => ({
zoomToFit,
setGraphShape,
}));
return (
<div ref={containerRef} className={classNames("w-full h-full", className)} id="graph-container">
@@ -1373,4 +1373,4 @@
"padding": 20
}
}
}
@@ -134,7 +134,7 @@ export default function DatasetsAccordion({
} = useBoolean(false);
const [datasetToRemove, setDatasetToRemove] = useState<Dataset | null>(null);
const handleDatasetRemove = (dataset: Dataset) => {
setDatasetToRemove(dataset);
openRemoveDatasetModal();
@@ -45,7 +45,7 @@ export default function Plan() {
<div className="bg-white rounded-xl px-5 py-5 mb-2">
Affordable and transparent pricing
</div>
<div className="grid grid-cols-3 gap-x-2.5">
<div className="pt-13 py-4 px-5 mb-2.5 rounded-tl-xl rounded-tr-xl bg-white h-full">
<div>Basic</div>
@@ -40,7 +40,7 @@ export default function useChat(dataset: Dataset) {
setTrue: disableSearchRun,
setFalse: enableSearchRun,
} = useBoolean(false);
const refreshChat = useCallback(async () => {
const data = await fetchMessages();
return setMessages(data);
@@ -46,7 +46,7 @@ function useDatasets(useCloud = false) {
// checkDatasetStatuses(datasets);
// }, 50000);
// }, [fetchDatasetStatuses]);
// useEffect(() => {
// return () => {
// if (statusTimeout.current !== null) {
@@ -7,7 +7,7 @@ export default function createNotebook(notebookName: string, instance: CogneeIns
headers: {
"Content-Type": "application/json",
},
}).then((response: Response) =>
response.ok ? response.json() : Promise.reject(response)
);
}
@@ -6,7 +6,7 @@ export default function getNotebooks(instance: CogneeInstance) {
headers: {
"Content-Type": "application/json",
},
}).then((response: Response) =>
response.ok ? response.json() : Promise.reject(response)
);
}
@@ -7,7 +7,7 @@ export default function saveNotebook(notebookId: string, notebookData: object, i
headers: {
"Content-Type": "application/json",
},
}).then((response: Response) =>
response.ok ? response.json() : Promise.reject(response)
);
}
@@ -7,4 +7,4 @@ export default function GitHubIcon({ width = 24, height = 24, color = 'currentCo
</g>
</svg>
);
}
@@ -46,7 +46,7 @@ export default function Header({ user }: HeaderProps) {
checkMCPConnection();
const interval = setInterval(checkMCPConnection, 30000);
return () => clearInterval(interval);
}, [setMCPConnected, setMCPDisconnected]);
@@ -90,7 +90,7 @@ export default function SearchView() {
scrollToBottom();
setSearchInputValue("");
// Pass topK to sendMessage
sendMessage(chatInput, searchType, topK)
.then(scrollToBottom)
@@ -171,4 +171,4 @@
</form>
</div>
);
}
@@ -1,3 +1,2 @@
export { default as Modal } from "./Modal";
export { default as useModal } from "./useModal";
@@ -74,4 +74,3 @@ function MarkdownPreview({ content, className = "" }: MarkdownPreviewProps) {
}
export default memo(MarkdownPreview);
@@ -534,7 +534,7 @@ function transformInsightsGraphData(triplets: Triplet[]) {
target: string,
label: string,
}
} = {};
for (const triplet of triplets) {
nodes[triplet[0].id] = {
@@ -34,8 +34,8 @@ export default function TextArea({
// Cache maxHeight on first calculation
if (maxHeightRef.current === null) {
const computedStyle = getComputedStyle(textarea);
maxHeightRef.current = computedStyle.maxHeight === "none"
? Infinity
: parseInt(computedStyle.maxHeight) || Infinity;
}
@@ -10,4 +10,4 @@ export { default as NeutralButton } from "./NeutralButton";
export { default as StatusIndicator } from "./StatusIndicator";
export { default as StatusDot } from "./StatusDot";
export { default as Accordion } from "./Accordion";
export { default as Notebook } from "./Notebook";
@@ -57,7 +57,7 @@ export default async function fetch(url: string, options: RequestInit = {}, useC
new Error("Backend server is not responding. Please check if the server is running.")
);
}
if (error.detail === undefined) {
return Promise.reject(
new Error("No connection to the server.")
@@ -74,7 +74,7 @@
fetch.checkHealth = async () => {
const maxRetries = 5;
const retryDelay = 1000; // 1 second
for (let i = 0; i < maxRetries; i++) {
try {
const response = await global.fetch(`${backendApiUrl.replace("/api", "")}/health`);
@@ -90,7 +90,7 @@
await new Promise(resolve => setTimeout(resolve, retryDelay));
}
}
throw new Error("Backend server is not responding after multiple attempts");
};

@@ -105,14 +105,14 @@ If you'd rather run cognee-mcp in a container, you have two options:
```bash
# For HTTP transport (recommended for web deployments)
docker run -e TRANSPORT_MODE=http --env-file ./.env -p 8000:8000 --rm -it cognee/cognee-mcp:main
# For SSE transport
docker run -e TRANSPORT_MODE=sse --env-file ./.env -p 8000:8000 --rm -it cognee/cognee-mcp:main
# For stdio transport (default)
docker run -e TRANSPORT_MODE=stdio --env-file ./.env --rm -it cognee/cognee-mcp:main
```
**Installing optional dependencies at runtime:**
You can install optional dependencies when running the container by setting the `EXTRAS` environment variable:
```bash
# Install a single optional dependency group at runtime
@@ -122,7 +122,7 @@
--env-file ./.env \
-p 8000:8000 \
--rm -it cognee/cognee-mcp:main
# Install multiple optional dependency groups at runtime (comma-separated)
docker run \
-e TRANSPORT_MODE=sse \
@@ -131,7 +131,7 @@
-p 8000:8000 \
--rm -it cognee/cognee-mcp:main
```
**Available optional dependency groups:**
- `aws` - S3 storage support
- `postgres` / `postgres-binary` - PostgreSQL database support
@@ -160,7 +160,7 @@
# With stdio transport (default)
docker run -e TRANSPORT_MODE=stdio --env-file ./.env --rm -it cognee/cognee-mcp:main
```
**With runtime installation of optional dependencies:**
```bash
# Install optional dependencies from Docker Hub image
@@ -357,7 +357,7 @@ You can configure both transports simultaneously for testing:
"url": "http://localhost:8000/sse"
},
"cognee-http": {
"type": "http",
"url": "http://localhost:8000/mcp"
}
}

View file

@@ -7,11 +7,11 @@ echo "Environment: $ENVIRONMENT"
# Install optional dependencies if EXTRAS is set
if [ -n "$EXTRAS" ]; then
echo "Installing optional dependencies: $EXTRAS"
# Get the cognee version that's currently installed
COGNEE_VERSION=$(uv pip show cognee | grep "Version:" | awk '{print $2}')
echo "Current cognee version: $COGNEE_VERSION"
# Build the extras list for cognee
IFS=',' read -ra EXTRA_ARRAY <<< "$EXTRAS"
# Combine base extras from pyproject.toml with requested extras
@@ -28,11 +28,11 @@ if [ -n "$EXTRAS" ]; then
fi
fi
done
echo "Installing cognee with extras: $ALL_EXTRAS"
echo "Running: uv pip install 'cognee[$ALL_EXTRAS]==$COGNEE_VERSION'"
uv pip install "cognee[$ALL_EXTRAS]==$COGNEE_VERSION"
# Verify installation
echo ""
echo "✓ Optional dependencies installation completed"
@@ -93,19 +93,19 @@ if [ -n "$API_URL" ]; then
if echo "$API_URL" | grep -q "localhost" || echo "$API_URL" | grep -q "127.0.0.1"; then
echo "⚠️ Warning: API_URL contains localhost/127.0.0.1"
echo " Original: $API_URL"
# Try to use host.docker.internal (works on Mac/Windows and recent Linux with Docker Desktop)
FIXED_API_URL=$(echo "$API_URL" | sed 's/localhost/host.docker.internal/g' | sed 's/127\.0\.0\.1/host.docker.internal/g')
echo " Converted to: $FIXED_API_URL"
echo " This will work on Mac/Windows/Docker Desktop."
echo " On Linux without Docker Desktop, you may need to:"
echo " - Use --network host, OR"
echo " - Set API_URL=http://172.17.0.1:8000 (Docker bridge IP)"
API_URL="$FIXED_API_URL"
fi
API_ARGS="--api-url $API_URL"
if [ -n "$API_TOKEN" ]; then
API_ARGS="$API_ARGS --api-token $API_TOKEN"
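Two shell idioms from the entrypoint above can be exercised in isolation. This is a bash sketch with illustrative values, not part of the script itself:

```shell
#!/usr/bin/env bash
# 1) Comma-separated EXTRAS parsed into an array, as the entrypoint does:
EXTRAS="aws,postgres"
IFS=',' read -ra EXTRA_ARRAY <<< "$EXTRAS"
echo "${EXTRA_ARRAY[@]}"

# 2) Rewriting localhost-style API URLs for use inside a container.
# host.docker.internal resolves on Docker Desktop; plain Linux may need
# --network host or the docker bridge IP instead:
API_URL="http://127.0.0.1:8000"
FIXED_API_URL=$(echo "$API_URL" | sed 's/localhost/host.docker.internal/g' | sed 's/127\.0\.0\.1/host.docker.internal/g')
echo "$FIXED_API_URL"
```

Note that `IFS=',' read` scopes the changed field separator to the `read` call only, so the rest of the script keeps normal word splitting.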

View file

@@ -16,4 +16,4 @@ EMBEDDING_API_VERSION=""
GRAPHISTRY_USERNAME=""
GRAPHISTRY_PASSWORD=""

View file

@@ -14,7 +14,7 @@ This starter kit is deprecated. Its examples have been integrated into the `/new
# Cognee Starter Kit
Welcome to the <a href="https://github.com/topoteretes/cognee">cognee</a> Starter Repo! This repository is designed to help you get started quickly by providing a structured dataset and pre-built data pipelines using cognee to build powerful knowledge graphs.
You can use this repo to ingest, process, and visualize data in minutes.
By following this guide, you will:
@@ -80,7 +80,7 @@ Custom model uses custom pydantic model for graph extraction. This script catego
python src/pipelines/custom-model.py
```
## Graph preview
cognee provides a visualize_graph function that will render the graph for you.

View file

@@ -8,12 +8,14 @@ from fastapi.encoders import jsonable_encoder
from cognee.modules.search.types import SearchType, SearchResult, CombinedSearchResult
from cognee.api.DTO import InDTO, OutDTO
-from cognee.modules.users.exceptions.exceptions import PermissionDeniedError
+from cognee.modules.users.exceptions.exceptions import PermissionDeniedError, UserNotFoundError
from cognee.modules.users.models import User
from cognee.modules.search.operations import get_history
from cognee.modules.users.methods import get_authenticated_user
from cognee.shared.utils import send_telemetry
from cognee import __version__ as cognee_version
+from cognee.infrastructure.databases.exceptions import DatabaseNotCreatedError
+from cognee.exceptions import CogneeValidationError
# Note: Datasets sent by name will only map to datasets owned by the request sender
@@ -138,6 +140,17 @@ def get_search_router() -> APIRouter:
)
return jsonable_encoder(results)
+except (DatabaseNotCreatedError, UserNotFoundError, CogneeValidationError) as e:
+    # Return a clear 422 with actionable guidance instead of leaking a stacktrace
+    status_code = getattr(e, "status_code", 422)
+    return JSONResponse(
+        status_code=status_code,
+        content={
+            "error": "Search prerequisites not met",
+            "detail": str(e),
+            "hint": "Run `await cognee.add(...)` then `await cognee.cognify()` before searching.",
+        },
+    )
except PermissionDeniedError:
return []
except Exception as error:
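The new branch returns a structured body rather than a stack trace. A minimal sketch of that payload shape (the helper name and standalone form are illustrative, not the cognee API):

```python
# Hypothetical helper mirroring the except-branch above: build a
# (status_code, body) pair for a failed-precondition search request.
def precondition_error_payload(exc: Exception, default_status: int = 422):
    # Honor an exception-provided status code, fall back to 422
    status_code = getattr(exc, "status_code", default_status)
    return status_code, {
        "error": "Search prerequisites not met",
        "detail": str(exc),
        "hint": "Run `await cognee.add(...)` then `await cognee.cognify()` before searching.",
    }
```

Keeping the `hint` field separate from `detail` lets clients surface the remediation step without parsing the error message.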

View file

@@ -11,6 +11,9 @@ from cognee.modules.data.methods import get_authorized_existing_datasets
from cognee.modules.data.exceptions import DatasetNotFoundError
from cognee.context_global_variables import set_session_user_context_variable
from cognee.shared.logging_utils import get_logger
+from cognee.infrastructure.databases.exceptions import DatabaseNotCreatedError
+from cognee.exceptions import CogneeValidationError
+from cognee.modules.users.exceptions.exceptions import UserNotFoundError
logger = get_logger()
@@ -176,7 +179,18 @@ async def search(
datasets = [datasets]
if user is None:
-    user = await get_default_user()
+    try:
+        user = await get_default_user()
+    except (DatabaseNotCreatedError, UserNotFoundError) as error:
+        # Provide a clear, actionable message instead of surfacing low-level stacktraces
+        raise CogneeValidationError(
+            message=(
+                "Search prerequisites not met: no database/default user found. "
+                "Initialize Cognee before searching by:\n"
+                "• running `await cognee.add(...)` followed by `await cognee.cognify()`."
+            ),
+            name="SearchPreconditionError",
+        ) from error
await set_session_user_context_variable(user)
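The guard above translates low-level lookup failures into one actionable validation error. A self-contained sketch of the pattern, with a stand-in exception class since cognee's own types are not imported here:

```python
class CogneeValidationError(Exception):
    """Stand-in for cognee's exception type (illustrative only)."""

    def __init__(self, message: str, name: str):
        super().__init__(message)
        self.name = name


def get_default_user_or_raise(lookup):
    # Mirror the guard above: catch the low-level errors a user lookup can
    # raise and re-raise a single, actionable precondition error.
    try:
        return lookup()
    except (RuntimeError, KeyError) as error:
        raise CogneeValidationError(
            message="Search prerequisites not met: no database/default user found.",
            name="SearchPreconditionError",
        ) from error
```

The `from error` chaining keeps the original cause visible in logs while callers only need to handle one exception type.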

View file

@@ -71,7 +71,7 @@ def get_sync_router() -> APIRouter:
-H "Content-Type: application/json" \\
-H "Cookie: auth_token=your-token" \\
-d '{"dataset_ids": ["123e4567-e89b-12d3-a456-426614174000", "456e7890-e12b-34c5-d678-901234567000"]}'
# Sync all user datasets (empty request body or null dataset_ids)
curl -X POST "http://localhost:8000/api/v1/sync" \\
-H "Content-Type: application/json" \\
@@ -88,7 +88,7 @@ def get_sync_router() -> APIRouter:
- **413 Payload Too Large**: Dataset too large for current cloud plan
- **429 Too Many Requests**: Rate limit exceeded
## Notes
- Sync operations run in the background - you get an immediate response
- Use the returned run_id to track progress (status API coming soon)
- Large datasets are automatically chunked for efficient transfer
@@ -179,7 +179,7 @@ def get_sync_router() -> APIRouter:
```
## Example Responses
**No running syncs:**
```json
{

View file

@@ -21,7 +21,7 @@ binary streams, then stores them in a specified dataset for further processing.
Supported Input Types:
- **Text strings**: Direct text content
- **File paths**: Local file paths (absolute paths starting with "/")
- **File URLs**: "file:///absolute/path" or "file://relative/path"
- **S3 paths**: "s3://bucket-name/path/to/file"
- **Lists**: Multiple files or text strings in a single call

View file

@@ -17,7 +17,7 @@ The `cognee config` command allows you to view and modify configuration settings
You can:
- View all current configuration settings
- Get specific configuration values
- Set configuration values
- Unset (reset to default) specific configuration values
- Reset all configuration to defaults

View file

@@ -290,7 +290,7 @@ class NeptuneAnalyticsAdapter(NeptuneGraphDB, VectorDBInterface):
query_string = f"""
CALL neptune.algo.vectors.topKByEmbeddingWithFiltering({{
topK: {limit},
embedding: {embedding},
nodeFilter: {{ equals: {{property: '{self._COLLECTION_PREFIX}', value: '{collection_name}'}} }}
}}
)
@@ -299,7 +299,7 @@ class NeptuneAnalyticsAdapter(NeptuneGraphDB, VectorDBInterface):
if with_vector:
query_string += """
WITH node, score, id(node) as node_id
MATCH (n)
WHERE id(n) = id(node)
CALL neptune.algo.vectors.get(n)

View file

@@ -10,4 +10,4 @@ Extraction rules:
5. Current-time references ("now", "current", "today"): If the query explicitly refers to the present, set both starts_at and ends_at to now (the ingestion timestamp).
6. "Who is" and "Who was" questions: These imply a general identity or biographical inquiry without a specific temporal scope. Set both starts_at and ends_at to None.
7. Ordering rule: Always ensure the earlier date is assigned to starts_at and the later date to ends_at.
8. No temporal information: If no valid or inferable time reference is found, set both starts_at and ends_at to None.

View file

@@ -22,4 +22,4 @@ The `attributes` should be a list of dictionaries, each containing:
- Relationships should be technical with one or at most two words. If two words, use underscore camelcase style
- Relationships could imply general meaning like: subject, object, participant, recipient, agent, instrument, tool, source, cause, effect, purpose, manner, resource, etc.
- You can combine two words to form a relationship name: subject_role, previous_owner, etc.
- Focus on how the entity specifically relates to the event

View file

@@ -27,4 +27,4 @@ class Event(BaseModel):
time_from: Optional[Timestamp] = None
time_to: Optional[Timestamp] = None
location: Optional[str] = None
```

View file

@@ -19,8 +19,8 @@ The aim is to achieve simplicity and clarity in the knowledge graph.
- **Naming Convention**: Use snake_case for relationship names, e.g., `acted_in`.
# 3. Coreference Resolution
- **Maintain Entity Consistency**: When extracting entities, it's vital to ensure consistency.
-If an entity, such as "John Doe", is mentioned multiple times in the text but is referred to by different names or pronouns (e.g., "Joe", "he"),
-always use the most complete identifier for that entity throughout the knowledge graph. In this example, use "John Doe" as the Persons ID.
+If an entity is mentioned multiple times in the text but is referred to by different names or pronouns,
+always use the most complete identifier for that entity throughout the knowledge graph.
Remember, the knowledge graph should be coherent and easily understandable, so maintaining consistency in entity references is crucial.
# 4. Strict Compliance
Adhere to the rules strictly. Non-compliance will result in termination

View file

@@ -22,7 +22,7 @@ You are an advanced algorithm designed to extract structured information to buil
3. **Coreference Resolution**:
- Maintain one consistent node ID for each real-world entity.
- Resolve aliases, acronyms, and pronouns to the most complete form.
-- *Example*: Always use "John Doe" even if later referred to as "Doe" or "he".
+- *Example*: Always use the full identifier even if the entity is later referred to in a slightly different way.
**Property & Data Guidelines**:

View file

@@ -42,10 +42,10 @@ You are an advanced algorithm designed to extract structured information from un
- **Rule**: Resolve all aliases, acronyms, and pronouns to one canonical identifier.
> **One-Shot Example**:
-> **Input**: "John Doe is an author. Later, Doe published a book. He is well-known."
+> **Input**: "X is an author. Later, X published a book. He is well-known."
> **Output Node**:
> ```
-> John Doe (Person)
+> X (Person)
> ```
---

View file

@@ -15,7 +15,7 @@ You are an advanced algorithm that extracts structured data into a knowledge gra
- Properties are key-value pairs; do not use escaped quotes.
3. **Coreference Resolution**
-- Use a single, complete identifier for each entity (e.g., always "John Doe" not "Joe" or "he").
+- Use a single, complete identifier for each entity.
4. **Relationship Labels**:
- Use descriptive, lowercase, snake_case names for edges.

View file

@@ -26,7 +26,7 @@ Use **basic atomic types** for node labels. Always prefer general types over spe
- Good: "Alan Turing", "Google Inc.", "World War II"
- Bad: "Entity_001", "1234", "he", "they"
- Never use numeric or autogenerated IDs.
-- Prioritize **most complete form** of entity names for consistency (e.g., always use "John Doe" instead of "John" or "he").
+- Prioritize the **most complete form** of entity names for consistency.
2. Dates, Numbers, and Properties
---------------------------------

View file

@@ -2,12 +2,12 @@ You are an expert query analyzer for a **GraphRAG system**. Your primary goal is
Here are the available `SearchType` tools and their specific functions:
- **`SUMMARIES`**: The `SUMMARIES` search type retrieves summarized information from the knowledge graph.
**Best for:**
- Getting concise overviews of topics
- Summarizing large amounts of information
- Quick understanding of complex subjects
**Best for:**
@@ -16,7 +16,7 @@ Here are the available `SearchType` tools and their specific functions:
- Understanding relationships between concepts
- Exploring the structure of your knowledge graph
* **`CHUNKS`**: The `CHUNKS` search type retrieves specific facts and information chunks from the knowledge graph.
**Best for:**
@@ -122,4 +122,4 @@ Response: `NATURAL_LANGUAGE`
Your response MUST be a single word, consisting of only the chosen `SearchType` name. Do not provide any explanation.

View file

@@ -1 +1 @@
Respond with: test

View file

@@ -194,6 +194,7 @@ def get_llm_client(raise_api_key_error: bool = True):
)
# Get optional local mode parameters (will be None if not set)
+# TODO: refactor llm_config to include these parameters, currently they cannot be defined and defaults are used
model_path = getattr(llm_config, "llama_cpp_model_path", None)
n_ctx = getattr(llm_config, "llama_cpp_n_ctx", 2048)
n_gpu_layers = getattr(llm_config, "llama_cpp_n_gpu_layers", 0)

View file

@@ -973,4 +973,4 @@
"python_version": null,
"pep_status": null
}
]

View file

@@ -76,4 +76,4 @@ Section: Open Questions or TODOs
Create a checklist of unresolved decisions, logic that needs clarification, or tasks that are still pending.
Section: Last Updated
Include the most recent update date and who made the update.

View file

@@ -72,4 +72,3 @@ profile = "black"
- E501: line too long -> break with parentheses
- E225: missing whitespace around operator
- E402: module import not at top of file

View file

@@ -72,4 +72,3 @@ Use modules/packages to separate concerns; avoid wildcard imports.
- Is this the simplest working solution?
- Are errors explicit and logged?
- Are modules/namespaces used appropriately?

View file

@@ -1 +0,0 @@

View file

@@ -16,11 +16,8 @@ def get_api_auth_backend():
def get_jwt_strategy() -> JWTStrategy[models.UP, models.ID]:
secret = os.getenv("FASTAPI_USERS_JWT_SECRET", "super_secret")
-try:
-    lifetime_seconds = int(os.getenv("JWT_LIFETIME_SECONDS", "3600"))
-except ValueError:
-    lifetime_seconds = 3600
+lifetime_seconds = int(os.getenv("JWT_LIFETIME_SECONDS", "3600"))
return APIJWTStrategy(secret, lifetime_seconds=lifetime_seconds)
auth_backend = AuthenticationBackend(

View file

@@ -18,10 +18,7 @@ def get_client_auth_backend():
from .default.default_jwt_strategy import DefaultJWTStrategy
secret = os.getenv("FASTAPI_USERS_JWT_SECRET", "super_secret")
-try:
-    lifetime_seconds = int(os.getenv("JWT_LIFETIME_SECONDS", "3600"))
-except ValueError:
-    lifetime_seconds = 3600
+lifetime_seconds = int(os.getenv("JWT_LIFETIME_SECONDS", "3600"))
return DefaultJWTStrategy(secret, lifetime_seconds=lifetime_seconds)
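Both auth backends now parse the lifetime the same way. A small sketch of the fail-fast behavior this introduces (a malformed value now raises at startup instead of being silently replaced with the default):

```python
import os


def jwt_lifetime_seconds() -> int:
    # Matches the parsing above: a non-numeric JWT_LIFETIME_SECONDS raises
    # ValueError immediately rather than falling back to 3600.
    return int(os.getenv("JWT_LIFETIME_SECONDS", "3600"))
```

Failing fast here surfaces a misconfigured deployment at boot, where the previous silent fallback could hide a typo for the lifetime of the service.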

View file

@@ -8,7 +8,8 @@ import http.server
import socketserver
from threading import Thread
import pathlib
-from uuid import uuid4, uuid5, NAMESPACE_OID
+from typing import Union, Any, Dict, List
+from uuid import uuid4, uuid5, NAMESPACE_OID, UUID
from cognee.base_config import get_base_config
from cognee.shared.logging_utils import get_logger
@@ -58,7 +59,7 @@ def get_anonymous_id():
return anonymous_id
-def _sanitize_nested_properties(obj, property_names: list[str]):
+def _sanitize_nested_properties(obj: Any, property_names: list[str]) -> Any:
"""
Recursively replaces any property whose key matches one of `property_names`
(e.g., ['url', 'path']) in a nested dict or list with a uuid5 hash
@@ -78,7 +79,9 @@ def _sanitize_nested_properties(obj, property_names: list[str]):
return obj
-def send_telemetry(event_name: str, user_id, additional_properties: dict = {}):
+def send_telemetry(event_name: str, user_id: Union[str, UUID], additional_properties: dict = {}):
+    if additional_properties is None:
+        additional_properties = {}
if os.getenv("TELEMETRY_DISABLED"):
return
@@ -108,7 +111,7 @@ def send_telemetry(event_name: str, user_id, additional_properties: dict = {}):
print(f"Error sending telemetry through proxy: {response.status_code}")
-def embed_logo(p, layout_scale, logo_alpha, position):
+def embed_logo(p: Any, layout_scale: float, logo_alpha: float, position: str):
"""
Embed a logo into the graph visualization as a watermark.
"""
@@ -138,7 +141,7 @@ def embed_logo(p, layout_scale, logo_alpha, position):
def start_visualization_server(
-    host="0.0.0.0", port=8001, handler_class=http.server.SimpleHTTPRequestHandler
+    host: str = "0.0.0.0",
+    port: int = 8001,
+    handler_class: type[
+        http.server.SimpleHTTPRequestHandler
+    ] = http.server.SimpleHTTPRequestHandler,
):
"""
Spin up a simple HTTP server in a background thread to serve files.
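The sanitizer above replaces URL- and path-like values with deterministic `uuid5` hashes before telemetry is sent. A simplified, self-contained re-implementation of the idea (a sketch, not the cognee source):

```python
from uuid import uuid5, NAMESPACE_OID


def sanitize_nested_properties(obj, property_names):
    # Values stored under a matching key become stable uuid5 hashes, so the
    # same input always maps to the same anonymized token; everything else is
    # recursed into (dicts, lists) or returned unchanged.
    if isinstance(obj, dict):
        return {
            key: str(uuid5(NAMESPACE_OID, str(value)))
            if key in property_names
            else sanitize_nested_properties(value, property_names)
            for key, value in obj.items()
        }
    if isinstance(obj, list):
        return [sanitize_nested_properties(item, property_names) for item in obj]
    return obj
```

Because `uuid5` is a deterministic hash (unlike `uuid4`), repeated events about the same file still correlate in telemetry without revealing the actual path.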

View file

@@ -1 +0,0 @@

View file

@@ -46,10 +46,10 @@ async def test_textdocument_cleanup_with_sql():
# Step 1: Add and cognify a test document
dataset_name = "test_cleanup_dataset"
test_text = """
Machine learning is a subset of artificial intelligence that enables systems to learn
and improve from experience without being explicitly programmed. Deep learning uses
neural networks with multiple layers to process data.
"""
await setup()

View file

@@ -47,20 +47,20 @@ async def main():
# Test data
text_1 = """
Apple Inc. is an American multinational technology company that specializes in consumer electronics,
software, and online services. Apple is the world's largest technology company by revenue and,
since January 2021, the world's most valuable company.
"""
text_2 = """
Microsoft Corporation is an American multinational technology corporation which produces computer software,
consumer electronics, personal computers, and related services. Its best known software products are the
Microsoft Windows line of operating systems and the Microsoft Office suite.
"""
text_3 = """
Google LLC is an American multinational technology company that specializes in Internet-related services and products,
which include online advertising technologies, search engine, cloud computing, software, and hardware. Google has been
referred to as the most powerful company in the world and one of the world's most valuable brands.
"""

View file

@@ -1,6 +1,7 @@
-# cognee-infra-helm
+# Example Helm chart
-General infrastructure setup for Cognee on Kubernetes using a Helm chart.
+Example Helm chart for Cognee with PostgreSQL and the pgvector extension.
+It is not ready for production use.
## Prerequisites
Before deploying the Helm chart, ensure the following prerequisites are met:
@@ -13,13 +14,22 @@ Before deploying the Helm chart, ensure the following prerequisites are met:
Clone the Repository: Clone this repository to your local machine and navigate to the directory.
-## Deploy Helm Chart:
+## Example Helm chart deployment:
```bash
-helm install cognee ./cognee-chart
+helm upgrade --install cognee deployment/helm \
+  --namespace cognee --create-namespace \
+  --set cognee.env.LLM_API_KEY="$YOUR_KEY"
```
**Uninstall Helm Release**:
```bash
helm uninstall cognee
```
## Port forwarding
To access cognee, run
```
kubectl port-forward svc/cognee-service -n cognee 8000
```
It will be available at localhost:8000.

View file

@@ -43,4 +43,3 @@ networks:
volumes:
postgres_data:

View file

@@ -20,12 +20,35 @@ spec:
ports:
- containerPort: {{ .Values.cognee.port }}
env:
+- name: ENABLE_BACKEND_ACCESS_CONTROL
+  value: "false"
- name: HOST
value: {{ .Values.cognee.env.HOST }}
- name: ENVIRONMENT
value: {{ .Values.cognee.env.ENVIRONMENT }}
- name: PYTHONPATH
value: {{ .Values.cognee.env.PYTHONPATH }}
+- name: VECTOR_DB_PROVIDER
+  value: pgvector
+- name: DB_HOST
+  value: {{ .Release.Name }}-postgres
+- name: DB_PORT
+  value: "{{ .Values.postgres.port }}"
+- name: DB_NAME
+  value: {{ .Values.postgres.env.POSTGRES_DB }}
+- name: DB_USERNAME
+  value: {{ .Values.postgres.env.POSTGRES_USER }}
+- name: DB_PASSWORD
+  value: {{ .Values.postgres.env.POSTGRES_PASSWORD }}
+- name: LLM_API_KEY
+  valueFrom:
+    secretKeyRef:
+      name: {{ .Release.Name }}-llm-api-key
+      key: LLM_API_KEY
+- name: LLM_MODEL
+  value: {{ .Values.cognee.env.LLM_MODEL }}
+- name: LLM_PROVIDER
+  value: {{ .Values.cognee.env.LLM_PROVIDER }}
resources:
limits:
cpu: {{ .Values.cognee.resources.cpu }}

View file

@@ -5,7 +5,7 @@ metadata:
labels:
app: {{ .Release.Name }}-cognee
spec:
-type: NodePort
+type: ClusterIP
ports:
- port: {{ .Values.cognee.port }}
targetPort: {{ .Values.cognee.port }}
@@ -11,4 +11,3 @@ spec:
targetPort: {{ .Values.postgres.port }}
selector:
app: {{ .Release.Name }}-postgres
@@ -0,0 +1,7 @@
apiVersion: v1
kind: Secret
metadata:
name: {{ .Release.Name }}-llm-api-key
type: Opaque
data:
LLM_API_KEY: {{ .Values.cognee.env.LLM_API_KEY | b64enc | quote }}
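The `b64enc` in the Secret template above is plain base64: Kubernetes requires `data` fields in a Secret to be base64-encoded. A quick sketch of the same transformation, to show what value ends up in the manifest (the key string is the placeholder from `.env`, not a real credential):

```python
import base64

def b64enc(value: str) -> str:
    # Equivalent of Helm's b64enc template function: base64-encode a
    # UTF-8 string, as required for Kubernetes Secret 'data' fields.
    return base64.b64encode(value.encode("utf-8")).decode("ascii")

encoded = b64enc("your-openai-api-key")
print(encoded)
# Decoding restores the original value, which is what the kubelet does
# when it exposes the Secret to the container:
assert base64.b64decode(encoded).decode("utf-8") == "your-openai-api-key"
```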
@@ -7,9 +7,11 @@ cognee:
HOST: "0.0.0.0"
ENVIRONMENT: "local"
PYTHONPATH: "."
LLM_MODEL: "openai/gpt-4o-mini"
LLM_PROVIDER: "openai"
resources:
cpu: "4.0"
memory: "2Gi"
# Configuration for the 'postgres' database service
postgres:
@@ -19,4 +21,4 @@ postgres:
POSTGRES_USER: "cognee"
POSTGRES_PASSWORD: "cognee"
POSTGRES_DB: "cognee_db"
storage: "2Gi"
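The `2Gi` values above are Kubernetes binary-suffix quantities (powers of 1024, not 1000). A minimal sketch of how such a quantity maps to bytes; this covers only the binary suffixes used in this chart, not the full Kubernetes quantity grammar (no decimal `M`/`G` or milli `m` forms):

```python
# Binary-suffix units used in Kubernetes resource quantities.
_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}

def parse_quantity(q: str) -> int:
    # Parse a binary-suffixed quantity like "2Gi" into a byte count.
    for suffix, factor in _UNITS.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * factor
    return int(q)  # bare integers are already bytes

print(parse_quantity("2Gi"))  # 2147483648
```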
@@ -3,4 +3,4 @@ numpy==1.26.4
matplotlib==3.10.0
seaborn==0.13.2
scipy==1.11.4
pathlib
@@ -34,4 +34,4 @@ What began as an online bookstore has grown into one of the largest e-commerce p
Meta, originally known as Facebook, revolutionized social media by connecting billions of people worldwide. Beyond its core social networking service, Meta is investing in the next generation of digital experiences through virtual and augmented reality technologies, with projects like Oculus. The company's efforts signal a commitment to evolving digital interaction and building the metaverse—a shared virtual space where users can connect and collaborate.
Each of these companies has significantly impacted the technology landscape, driving innovation and transforming everyday life through their groundbreaking products and services.
"""
@@ -63,10 +63,10 @@ async def main():
traversals.
"""
sample_text_2 = """Neptune Analytics is an ideal choice for investigatory, exploratory, or data-science workloads
that require fast iteration for data, analytical and algorithmic processing, or vector search on graph data. It
complements Amazon Neptune Database, a popular managed graph database. To perform intensive analysis, you can load
the data from a Neptune Database graph or snapshot into Neptune Analytics. You can also load graph data that's
stored in Amazon S3.
"""
@@ -165,8 +165,8 @@ async def main():
// If a stored preference exists and it does not match the new value,
// raise an error using APOC's utility procedure.
CALL apoc.util.validate(
preference IS NOT NULL AND preference.value <> new_size,
"Conflicting shoe size preference: existing size is " + preference.value + " and new size is " + new_size,
[]
)
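The `apoc.util.validate` guard in the Cypher above raises when a stored preference exists and disagrees with the incoming value. The same rule can be mirrored in plain Python, which is handy for unit-testing the conflict logic without a Neo4j instance; the function name here is illustrative, not part of the example script:

```python
def validate_preference(stored, new_size):
    # Mirror of the apoc.util.validate guard: raise when a stored
    # shoe-size preference exists and conflicts with the new one.
    if stored is not None and stored != new_size:
        raise ValueError(
            f"Conflicting shoe size preference: existing size is "
            f"{stored} and new size is {new_size}"
        )

validate_preference(None, 42)  # no stored value: passes
validate_preference(42, 42)    # matching value: passes
try:
    validate_preference(41, 42)  # conflict: raises
except ValueError as err:
    print(err)
```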
@@ -35,16 +35,16 @@ biography_1 = """
biography_2 = """
Arnulf Øverland Ole Peter Arnulf Øverland ( 27 April 1889 – 25 March 1968 ) was a Norwegian poet and artist . He is principally known for his poetry which served to inspire the Norwegian resistance movement during the German occupation of Norway during World War II .
Biography .
Øverland was born in Kristiansund and raised in Bergen . His parents were Peter Anton Øverland ( 1852–1906 ) and Hanna Hage ( 1854–1939 ) . The early death of his father left the family economically stressed . He was able to attend Bergen Cathedral School and in 1904 Kristiania Cathedral School . He graduated in 1907 and for a time studied philology at University of Kristiania . Øverland published his first collection of poems ( 1911 ) .
Øverland became a communist sympathizer from the early 1920s and became a member of Mot Dag . He also served as chairman of the Norwegian Students Society 1923–28 . He changed his stand in 1937 , partly as an expression of dissent against the ongoing Moscow Trials . He was an avid opponent of Nazism and in 1936 he wrote the poem Du må ikke sove which was printed in the journal Samtiden . It ends with ( I thought : Something is imminent . Our era is over – Europe's on fire! ) . Probably the most famous line of the poem is ( You mustn't endure so well the injustice that doesn't affect you yourself! )
During the German occupation of Norway from 1940 in World War II , he wrote to inspire the Norwegian resistance movement . He wrote a series of poems which were clandestinely distributed , leading to the arrest of both him and his future wife Margrete Aamot Øverland in 1941 . Arnulf Øverland was held first in the prison camp of Grini before being transferred to Sachsenhausen concentration camp in Germany . He spent a four-year imprisonment until the liberation of Norway in 1945 . His poems were later collected in Vi overlever alt and published in 1945 .
Øverland played an important role in the Norwegian language struggle in the post-war era . He became a noted supporter for the conservative written form of Norwegian called Riksmål ; he was president of Riksmålsforbundet ( an organization in support of Riksmål ) from 1947 to 1956 . In addition , Øverland adhered to the traditionalist style of writing , criticising modernist poetry on several occasions . His speech Tungetale fra parnasset , published in Arbeiderbladet in 1954 , initiated the so-called Glossolalia debate .
Personal life .
In 1918 he had married the singer Hildur Arntzen ( 1888–1957 ) . Their marriage was dissolved in 1939 . In 1940 , he married Bartholine Eufemia Leganger ( 1903–1995 ) . They separated shortly after , and were officially divorced in 1945 . Øverland was married to journalist Margrete Aamot Øverland ( 1913–1978 ) during June 1945 . In 1946 , the Norwegian Parliament arranged for Arnulf and Margrete Aamot Øverland to reside at the Grotten . He lived there until his death in 1968 and she lived there for another ten years until her death in 1978 . Arnulf Øverland was buried at Vår Frelsers Gravlund in Oslo . Joseph Grimeland designed the bust of Arnulf Øverland ( bronze , 1970 ) at his grave site .
@@ -56,7 +56,7 @@ biography_2 = """
- Vi overlever alt ( 1945 )
- Sverdet bak døren ( 1956 )
- Livets minutter ( 1965 )
Awards .
- Gyldendals Endowment ( 1935 )
- Dobloug Prize ( 1951 )