## Description

# Add Support for ChromaDB

## Summary

This PR adds support for ChromaDB as a vector database option in the Cognee application. ChromaDB is a modern, open-source embedding database designed for AI applications.

## Changes

- Created a new ChromaDBAdapter implementation for vector database operations
- Added comprehensive test suite for ChromaDB functionality
- Updated docker-compose.yml to include ChromaDB service
- Modified environment configuration to support ChromaDB settings
- Updated vector engine creation logic to support ChromaDB as an option

## Technical Details

- Implemented `ChromaDBAdapter.py` (347 lines) with full CRUD operations for vector data
- Created test suite (`test_chromadb.py`, 171 lines)
- Updated vector engine creation process to dynamically select ChromaDB when configured
- Modified settings router to accommodate new database option
- Updated environment template with ChromaDB configuration options

## Docker Changes

- Added ChromaDB service to docker-compose.yml with appropriate configuration

This PR enhances Cognee's flexibility by providing an alternative vector database option, allowing users to choose the most appropriate database for their specific use case.

## DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

Tested with UI + tests.

## Summary by CodeRabbit

- **New Features**
  - Expanded vector database integration by adding support for ChromaDB, enabling enhanced data management and search functionalities.
- **Tests**
  - Added automated tests to validate the ChromaDB integration and related operations.
- **Chores**
  - Updated configuration guidance and dependency management to include ChromaDB.
  - Provided an optional container deployment template for ChromaDB.
  - Added a new entry to ignore the `.chromadb_data/` directory in version control.
  - Introduced a new GitHub Actions workflow for testing ChromaDB integration.

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
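
The description says the vector engine creation logic selects ChromaDB "when configured" but does not show how. The following is a minimal sketch of one way such provider dispatch could look, not the actual Cognee code: `VectorConfig`, `ChromaDBAdapterStub`, `_FACTORIES`, and this `create_vector_engine` are all hypothetical names introduced for illustration, and the stub only stores its settings instead of talking to a database.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class VectorConfig:
    """Hypothetical settings holder; fields mirror the VECTOR_DB_* variables."""
    provider: str
    url: str = ""
    key: str = ""


class ChromaDBAdapterStub:
    """Stand-in for the real ChromaDBAdapter (assumption: it takes a URL and a key)."""

    def __init__(self, url: str, api_key: str) -> None:
        self.url = url
        self.api_key = api_key


def _make_chromadb(cfg: VectorConfig) -> ChromaDBAdapterStub:
    return ChromaDBAdapterStub(url=cfg.url, api_key=cfg.key)


# Registry mapping a provider name to a factory; adding a provider means
# adding one entry here rather than extending an if/elif chain.
_FACTORIES: Dict[str, Callable[[VectorConfig], object]] = {
    "chromadb": _make_chromadb,
}


def create_vector_engine(cfg: VectorConfig) -> object:
    """Return an adapter for cfg.provider, or raise ValueError if unknown."""
    try:
        factory = _FACTORIES[cfg.provider.lower()]
    except KeyError:
        raise ValueError(f"Unsupported vector DB provider: {cfg.provider!r}") from None
    return factory(cfg)
```

Under these assumptions, `create_vector_engine(VectorConfig("chromadb", "http://localhost:3002", "test-token"))` returns a stub holding the same URL and token that the workflow passes through `VECTOR_DB_URL` and `VECTOR_DB_KEY`.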
63 lines
1.7 KiB
YAML
name: test | chromadb

on:
  workflow_dispatch:
  pull_request:
    types: [labeled, synchronize]

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

env:
  RUNTIME__LOG_LEVEL: ERROR

jobs:
  run_chromadb_integration_test:
    name: chromadb test
    runs-on: ubuntu-22.04
    defaults:
      run:
        shell: bash
    services:
      chromadb:
        image: chromadb/chroma:0.6.3
        volumes:
          - chroma-data:/chroma/chroma
        ports:
          - 3002:8000

    steps:
      - name: Check out
        uses: actions/checkout@master

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11.x'

      - name: Install Poetry
        uses: snok/install-poetry@v1.4.1
        with:
          virtualenvs-create: true
          virtualenvs-in-project: true
          installer-parallel: true

      - name: Install dependencies
        run: poetry install --no-interaction

      - name: Run chromadb test
        env:
          ENV: 'dev'
          VECTOR_DB_PROVIDER: chromadb
          VECTOR_DB_URL: http://localhost:3002
          VECTOR_DB_KEY: test-token
          LLM_MODEL: ${{ secrets.LLM_MODEL }}
          LLM_ENDPOINT: ${{ secrets.LLM_ENDPOINT }}
          LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
          LLM_API_VERSION: ${{ secrets.LLM_API_VERSION }}
          EMBEDDING_MODEL: ${{ secrets.EMBEDDING_MODEL }}
          EMBEDDING_ENDPOINT: ${{ secrets.EMBEDDING_ENDPOINT }}
          EMBEDDING_API_KEY: ${{ secrets.EMBEDDING_API_KEY }}
          EMBEDDING_API_VERSION: ${{ secrets.EMBEDDING_API_VERSION }}
        run: poetry run python ./cognee/tests/test_chromadb.py
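
The workflow defines no health check for the `chromadb` service, so the container may still be starting when the test step runs. Below is a minimal sketch of a readiness wait that a test script could call first. It is not part of this commit: `wait_for_service` is a hypothetical helper, and it only checks that the TCP port from `VECTOR_DB_URL` accepts connections, not that ChromaDB is fully initialized.

```python
import socket
import time
from urllib.parse import urlparse


def wait_for_service(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until the host:port in `url` accepts a TCP connection.

    Returns True once a connection succeeds, False if `timeout` seconds pass first.
    """
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    # Fall back to the scheme's default port when the URL omits one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)

    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the settings from this workflow, the call would be `wait_for_service("http://localhost:3002")`, made before any adapter is constructed. A port-level check avoids depending on a specific ChromaDB HTTP endpoint, at the cost of not proving that the server is ready to answer queries.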