<!-- .github/pull_request_template.md -->

## Description
This PR fixes the distributed pipeline and updates the core distributed logic.

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Documentation update
- [x] Code refactoring
- [x] Performance improvement
- [ ] Other (please specify):

## Changes Made
Fixes the distributed pipeline:
- Changed the spawning logic and added incremental loading to `run_tasks_distributed`
- Added batching to consumer nodes
- Fixed the consumer stopping criteria by adding a stop signal and its handling
- Changed the edge embedding solution to avoid heavy network load in a multi-container environment

## Testing
Tested by running 1 GB of data on Modal, plus manual testing.

## Screenshots/Videos (if applicable)
None

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the issue/feature**
- [ ] My code follows the project's coding standards and style guidelines
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## Related Issues
None

## Additional Notes
None

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: Boris <boris@topoteretes.com>
Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
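The consumer-side changes listed under Changes Made (batching plus a stop signal) can be illustrated with a minimal sketch. This is not the code from this PR: `STOP_SIGNAL`, `consume_in_batches`, and `batch_size` are hypothetical names, and a standard-library `queue.Queue` stands in for whatever queue the distributed pipeline actually uses.

```python
import queue
from typing import Any, Callable, List

# Hypothetical sentinel; the real pipeline's stop signal may look different.
STOP_SIGNAL = object()


def consume_in_batches(
    source: "queue.Queue[Any]",
    handle_batch: Callable[[List[Any]], None],
    batch_size: int = 3,
) -> int:
    """Drain `source` in batches until STOP_SIGNAL arrives; return items handled."""
    batch: List[Any] = []
    handled = 0
    while True:
        item = source.get()  # blocks until the producer sends something
        if item is STOP_SIGNAL:
            break  # explicit stopping criterion instead of guessing from an empty queue
        batch.append(item)
        if len(batch) >= batch_size:
            handle_batch(batch)
            handled += len(batch)
            batch = []
    # Flush the final partial batch so no items are lost on shutdown.
    if batch:
        handle_batch(batch)
        handled += len(batch)
    return handled


# Usage: seven items followed by the stop signal, processed three at a time.
work: "queue.Queue[Any]" = queue.Queue()
for n in range(7):
    work.put(n)
work.put(STOP_SIGNAL)

batches: List[List[Any]] = []
total = consume_in_batches(work, lambda b: batches.append(list(b)), batch_size=3)
```

Using an explicit sentinel rather than a timeout or an empty-queue check means a consumer cannot exit early just because the producer is temporarily slow, which is one plausible reading of the "stopping criteria" fix described above.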
31 lines
665 B
Docker
FROM python:3.11-slim

# Set environment variables
ENV PIP_NO_CACHE_DIR=true
ENV PATH="${PATH}:/root/.poetry/bin"
ENV PYTHONPATH=/app
ENV RUN_MODE=modal
ENV SKIP_MIGRATIONS=true
ENV COGNEE_DISTRIBUTED=true

# System dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    libpq-dev \
    git \
    curl \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

COPY pyproject.toml poetry.lock README.md /app/

RUN pip install poetry

RUN poetry config virtualenvs.create false

RUN poetry install --extras neo4j --extras postgres --extras aws --extras distributed --no-root

COPY cognee/ /app/cognee
COPY distributed/ /app/distributed