Compare commits

1 commit

Author SHA1 Message Date
Cursor Agent
a48337415f Add comprehensive Docker Compose documentation for Cognee services
Co-authored-by: vasilije <vasilije@topoteretes.com>
2025-08-22 09:10:47 +00:00

@ -0,0 +1,610 @@
# Cognee Docker Compose Documentation
## Overview
Cognee uses Docker Compose to orchestrate multiple services that work together to provide AI memory capabilities. The setup is designed with a modular architecture using Docker Compose profiles, allowing you to run only the services you need for your specific use case.
## Architecture Overview
The Docker Compose setup consists of several key components:
- **Core Services**: Main backend API and optional MCP server
- **Database Services**: Multiple database options (PostgreSQL, Neo4j, FalkorDB, ChromaDB)
- **Frontend Service**: Next.js web interface (work in progress)
- **Network**: Shared network for inter-service communication
- **Profiles**: Optional service groups for different deployment scenarios
## Service Architecture
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Frontend     │    │   Cognee API    │    │   Cognee MCP    │
│    (Next.js)    │    │    (Backend)    │    │     Server      │
│   Port: 3000    │    │   Port: 8000    │    │   Port: 8000    │
└────────┬────────┘    └────────┬────────┘    └────────┬────────┘
         │                      │                      │
         └──────────────────────┼──────────────────────┘
                       ┌────────┴────────┐
                       │ cognee-network  │
                       └────────┬────────┘
      ┌────────────┬────────────┼────────────┬────────────┐
      │            │            │            │            │
┌─────┴────┐ ┌─────┴────┐ ┌─────┴────┐ ┌─────┴────┐ ┌─────┴────┐
│PostgreSQL│ │  Neo4j   │ │ FalkorDB │ │ ChromaDB │ │   ...    │
│Port: 5432│ │Port: 7474│ │Port: 6379│ │Port: 3002│ │          │
└──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
```
## Services Overview
### Core Services
#### 1. Cognee (Main Backend API)
- **Container Name**: `cognee`
- **Build Context**: Root directory
- **Ports**:
- `8000:8000` (HTTP API)
- `5678:5678` (Debug port)
- **Purpose**: Main Cognee backend API server
- **Resources**: 4 CPUs, 8GB RAM
#### 2. Cognee MCP Server
- **Container Name**: `cognee-mcp`
- **Profile**: `mcp`
- **Build Context**: Root directory (using `cognee-mcp/Dockerfile`)
- **Ports**:
- `8000:8000` (MCP HTTP/SSE)
- `5678:5678` (Debug port)
- **Purpose**: Model Context Protocol server for IDE integration (Cursor, Claude Desktop, VS Code)
- **Resources**: 2 CPUs, 4GB RAM
#### 3. Frontend
- **Container Name**: `frontend`
- **Profile**: `ui`
- **Build Context**: `./cognee-frontend`
- **Port**: `3000:3000`
- **Purpose**: Next.js web interface (work in progress)
- **Note**: Limited functionality - prefer MCP integration for full features
### Database Services (All Optional)
#### 1. PostgreSQL with pgvector
- **Container Name**: `postgres`
- **Profile**: `postgres`
- **Image**: `pgvector/pgvector:pg17`
- **Port**: `5432:5432`
- **Purpose**: Relational database with vector extensions
- **Credentials**: `cognee/cognee` (user/password)
- **Database**: `cognee_db`
#### 2. Neo4j
- **Container Name**: `neo4j`
- **Profile**: `neo4j`
- **Image**: `neo4j:latest`
- **Ports**:
- `7474:7474` (HTTP interface)
- `7687:7687` (Bolt protocol)
- **Purpose**: Graph database
- **Credentials**: `neo4j/pleaseletmein`
- **Plugins**: APOC, Graph Data Science
#### 3. FalkorDB
- **Container Name**: `falkordb`
- **Profile**: `falkordb`
- **Image**: `falkordb/falkordb:edge`
- **Ports**:
- `6379:6379` (Redis-compatible interface)
- `3001:3000` (Web interface)
- **Purpose**: Graph database with Redis interface
#### 4. ChromaDB
- **Container Name**: `chromadb`
- **Profile**: `chromadb`
- **Image**: `chromadb/chroma:0.6.3`
- **Port**: `3002:8000`
- **Purpose**: Vector database
- **Authentication**: Token-based (requires `VECTOR_DB_KEY`)
- **Persistence**: Enabled with local volume
## Docker Compose Profiles
Profiles allow you to selectively run services based on your needs:
### Available Profiles
| Profile | Services | Use Case |
|---------|----------|----------|
| **(default)** | `cognee` only | Basic API server with SQLite |
| `mcp` | `cognee-mcp` | IDE integration (Cursor/Claude Desktop) |
| `ui` | `frontend` | Web interface (limited functionality) |
| `postgres` | `postgres` | PostgreSQL database |
| `neo4j` | `neo4j` | Graph database |
| `falkordb` | `falkordb` | Alternative graph database |
| `chromadb` | `chromadb` | Vector database |
### Profile Usage Examples
```bash
# Basic API server only
docker compose up
# API server + PostgreSQL
docker compose --profile postgres up
# API server + Neo4j + ChromaDB
docker compose --profile neo4j --profile chromadb up
# MCP server + PostgreSQL (for IDE integration)
docker compose --profile mcp --profile postgres up
# Full stack with UI
docker compose --profile ui --profile postgres up
# All profiles except falkordb
docker compose --profile mcp --profile ui --profile postgres --profile neo4j --profile chromadb up
```
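Docker Compose also reads the active profiles from the `COMPOSE_PROFILES` environment variable as a comma-separated list, which can be easier to script than repeated `--profile` flags. The following is a minimal sketch; `join_profiles` is a hypothetical helper name and is not part of Cognee or Docker.

```bash
# Join profile names with commas, the format COMPOSE_PROFILES expects.
# join_profiles is a hypothetical helper, not part of Cognee or Docker.
join_profiles() (
  # Inside this subshell, "$*" joins the arguments with the first IFS character.
  IFS=,
  printf '%s\n' "$*"
)

# Usage (commented out so sourcing this snippet has no side effects):
# COMPOSE_PROFILES="$(join_profiles postgres chromadb)" docker compose up -d
```

With `COMPOSE_PROFILES=postgres,chromadb` set, `docker compose up -d` should behave like passing `--profile postgres --profile chromadb`.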
## Environment Configuration
### Required Environment File
Create a `.env` file in the root directory with your configuration:
```bash
# Core LLM Configuration
LLM_API_KEY=your_openai_api_key_here
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini
# Database Configuration (when using external databases)
DB_PROVIDER=postgres # or sqlite
DB_HOST=host.docker.internal # inside a container, localhost refers to the container itself
DB_PORT=5432
DB_NAME=cognee_db
DB_USERNAME=cognee
DB_PASSWORD=cognee
# Vector Database (when using ChromaDB)
VECTOR_DB_KEY=your_chroma_auth_token
# Optional: Embedding Configuration
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-large
```
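Before starting the stack, it can help to confirm that the `.env` file actually defines the keys you rely on. The sketch below is one possible check; `missing_env_keys` is a hypothetical helper name, and which keys count as required is an assumption you should adapt (the minimal setup in this document only needs `LLM_API_KEY`).

```bash
# Print every requested key that is missing or empty in an env file.
# Returns 0 when all keys have a value, 1 otherwise.
# missing_env_keys is a hypothetical helper, not part of Cognee.
missing_env_keys() {
  env_file="$1"
  shift
  rc=0
  for key in "$@"; do
    # Match lines such as KEY=value; an empty value (KEY=) counts as missing.
    if ! grep -Eq "^${key}=.+" "$env_file" 2>/dev/null; then
      printf '%s\n' "$key"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (commented out so sourcing this snippet has no side effects):
# missing_env_keys .env LLM_API_KEY || echo "fill in the keys listed above"
```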
### Environment Variables by Service
#### Cognee (Main API)
- `DEBUG`: Enable/disable debug mode
- `HOST`: Bind host (default: 0.0.0.0)
- `ENVIRONMENT`: Deployment environment (local/dev/prod)
- `LOG_LEVEL`: Logging level (ERROR/INFO/DEBUG)
#### Cognee MCP Server
- `TRANSPORT_MODE`: Communication protocol (stdio/sse/http)
- `MCP_LOG_LEVEL`: MCP-specific logging level
- Database configuration (inherits from main service)
#### Database Services
- **PostgreSQL**: `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`
- **Neo4j**: `NEO4J_AUTH`, `NEO4J_PLUGINS`
- **ChromaDB**: Authentication and persistence settings
## Container Build Process
### Multi-Stage Build Strategy
Both main services use multi-stage Docker builds so that the runtime image contains only the installed dependencies and application code, not the build tooling:
#### Stage 1: Dependency Installation (UV-based)
```dockerfile
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim AS uv
# Install system dependencies and Python packages
# Uses UV for fast dependency resolution
```
#### Stage 2: Runtime Environment
```dockerfile
FROM python:3.12-slim-bookworm
# Lightweight runtime with only necessary components
# Copy dependencies from build stage
```
### Build Context and Caching
- **Main Service**: Uses root directory context, installs with multiple extras
- **MCP Service**: Uses the root directory as build context with the Dockerfile at `cognee-mcp/Dockerfile`
- **Frontend**: Independent Node.js build in `cognee-frontend/`
## Networking
### Shared Network
All services communicate through the `cognee-network` Docker network:
```yaml
networks:
cognee-network:
name: cognee-network
```
### Inter-Service Communication
- Services can reach each other using container names as hostnames
- External access through mapped ports
- `host.docker.internal` for accessing host machine services
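To make the hostname rules concrete, the sketch below builds a PostgreSQL connection URL from the default credentials documented earlier, changing only the host. `pg_url` is a hypothetical helper name; whether a given client accepts a URL in this form is an assumption to verify for your tooling (`psql` does accept connection URIs).

```bash
# Build a PostgreSQL connection URL from the defaults in this document.
# pg_url is a hypothetical helper; only the host changes between callers.
pg_url() {
  # Default to the container name, which resolves on cognee-network.
  host="${1:-postgres}"
  printf 'postgresql://%s:%s@%s:%s/%s\n' cognee cognee "$host" 5432 cognee_db
}

# From another container on cognee-network:  pg_url
# From the host, through the mapped port:    pg_url localhost
```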
## Volume Management
### Application Code Volumes (Development)
```yaml
volumes:
- ./cognee:/app/cognee # Main API code
- ./cognee-frontend/src:/app/src # Frontend source
- .env:/app/.env # Environment configuration
```
### Persistent Data Volumes
```yaml
volumes:
- .chromadb_data/:/chroma/chroma/ # ChromaDB persistence
- postgres_data:/var/lib/postgresql/data # PostgreSQL data
```
## Startup Process
### 1. Database Migrations
Both main services run Alembic migrations on startup:
```bash
alembic upgrade head
```
**Error Handling**: A `UserAlreadyExists` error raised during default user creation is treated as non-fatal, so a container can restart safely against an already initialized database.
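The pattern of tolerating one specific error can be sketched as a small wrapper. This is an illustration of the idea, not the actual entrypoint script; `run_tolerating_existing_user` is a hypothetical name, and matching on the command's output text is an assumption.

```bash
# Run a command; treat a failure as success only when its output mentions
# UserAlreadyExists. Hypothetical helper, not the real Cognee entrypoint.
run_tolerating_existing_user() {
  if output="$("$@" 2>&1)"; then
    printf '%s\n' "$output"
    return 0
  fi
  printf '%s\n' "$output"
  case "$output" in
    *UserAlreadyExists*) return 0 ;;  # default user already exists: continue
    *) return 1 ;;                    # any other failure stays fatal
  esac
}

# Usage (commented out so sourcing this snippet has no side effects):
# run_tolerating_existing_user alembic upgrade head || exit 1
```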
### 2. Service Initialization
#### Main API Service
- Runs Gunicorn with Uvicorn workers
- Development mode: Hot reloading enabled
- Production mode: Optimized for performance
- Debug mode: Debugpy integration on port 5678
#### MCP Service
- Supports multiple transport modes (stdio/sse/http)
- Configurable via `TRANSPORT_MODE` environment variable
- Debug support with port 5678
## Resource Allocation
### CPU and Memory Limits
| Service | CPU Limit | Memory Limit | Rationale |
|---------|-----------|--------------|-----------|
| cognee | 4.0 cores | 8GB | Main processing service |
| cognee-mcp | 2.0 cores | 4GB | Lighter MCP operations |
| frontend | unlimited | unlimited | Development convenience |
| databases | unlimited | unlimited | Database-specific needs |
## Development Features
### Debug Support
- **Port 5678**: Debugpy integration for both main services
- **Environment Variable**: Set `DEBUG=true` to enable
- **Wait for Client**: The service pauses at startup until an IDE debugger attaches
### Hot Reloading
- **API**: Gunicorn reload mode in development
- **Frontend**: Next.js development server with file watching
- **Volume Mounts**: Live code synchronization
### Development vs Production
#### Development Mode (`ENVIRONMENT=dev/local`)
- Hot reloading enabled
- Debug logging
- Single worker processes
- Extended timeouts
#### Production Mode
- Multiple workers (configurable)
- Error-level logging only
- Optimized performance settings
- No hot reloading
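The mode selection described above can be summarized as a small decision function. This is a sketch under assumptions: the precedence between `ENVIRONMENT` and `DEBUG` shown here is a guess at the behavior, not the actual entrypoint logic, and `startup_mode` is a hypothetical name.

```bash
# Map ENVIRONMENT and DEBUG to one of three startup modes.
# The precedence is an assumption, not the actual entrypoint logic.
startup_mode() {
  environment="${1:-local}"
  debug="${2:-false}"
  case "$environment" in
    dev|local)
      if [ "$debug" = "true" ]; then
        echo "debug"        # debugger on port 5678, hot reloading on
      else
        echo "development"  # hot reloading, no debugger
      fi
      ;;
    *)
      echo "production"     # no hot reloading, no debugger
      ;;
  esac
}

# Usage: startup_mode "$ENVIRONMENT" "$DEBUG"
```

In this sketch any value other than `dev` or `local` falls through to production, and `DEBUG` is ignored there, which matches the advice to enable debug mode only in development.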
## Usage Patterns
### 1. Basic Development Setup
```bash
# Start with basic API + SQLite
docker compose up
# View logs
docker compose logs -f cognee
```
### 2. Full Development Environment
```bash
# Start with PostgreSQL database
docker compose --profile postgres up -d
# Add vector database for embeddings
docker compose --profile postgres --profile chromadb up -d
```
### 3. IDE Integration Development
```bash
# Start MCP server for Cursor/Claude Desktop integration
docker compose --profile mcp --profile postgres up -d
# Check MCP server status
docker compose logs cognee-mcp
```
### 4. UI Development
```bash
# Start with frontend for web interface testing
docker compose --profile ui --profile postgres up -d
# Access frontend at http://localhost:3000
```
### 5. Graph Analysis Setup
```bash
# Start with graph database for complex relationships
docker compose --profile neo4j --profile postgres up -d
# Access Neo4j browser at http://localhost:7474
```
## Port Mapping
| Service | Internal Port | External Port | Purpose |
|---------|---------------|---------------|---------|
| cognee | 8000 | 8000 | HTTP API |
| cognee | 5678 | 5678 | Debug |
| cognee-mcp | 8000 | 8000 | MCP HTTP/SSE |
| cognee-mcp | 5678 | 5678 | Debug |
| frontend | 3000 | 3000 | Web UI |
| postgres | 5432 | 5432 | Database |
| neo4j | 7474 | 7474 | Web interface |
| neo4j | 7687 | 7687 | Bolt protocol |
| falkordb | 6379 | 6379 | Redis interface |
| falkordb | 3000 | 3001 | Web interface |
| chromadb | 8000 | 3002 | Vector DB API |
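According to the table, `cognee` and `cognee-mcp` publish the same host ports (8000 and 5678), so running both on one host would require remapping one of them. The sketch below flags such overlaps from a list of `service:host_port` pairs; `duplicate_host_ports` is a hypothetical helper name, and the pairs are copied by hand from the table rather than read from `docker-compose.yml`.

```bash
# Print host ports that appear more than once in "service:host_port" pairs.
# duplicate_host_ports is a hypothetical helper, not part of Cognee.
duplicate_host_ports() {
  printf '%s\n' "$@" | awk -F: '
    { count[$2]++; owners[$2] = owners[$2] " " $1 }
    END { for (port in count) if (count[port] > 1) print port ":" owners[port] }
  ' | sort
}

# Usage with the external ports from the table above:
# duplicate_host_ports cognee:8000 cognee:5678 cognee-mcp:8000 cognee-mcp:5678 \
#   frontend:3000 postgres:5432 neo4j:7474 neo4j:7687 falkordb:6379 \
#   falkordb:3001 chromadb:3002
# prints:
# 5678: cognee cognee-mcp
# 8000: cognee cognee-mcp
```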
## Security Considerations
### Database Authentication
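- **PostgreSQL**: Default credentials (`cognee/cognee`); change them for any deployment reachable from outside your machine
- **Neo4j**: Default credentials (`neo4j/pleaseletmein`); change them for any deployment reachable from outside your machine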
- **ChromaDB**: Token-based authentication via `VECTOR_DB_KEY`
### Network Isolation
- Services communicate over the dedicated `cognee-network`, separate from other Docker networks on the host
- External access is possible only through the explicitly mapped ports
- `host.docker.internal` lets containers reach services running on the host machine
## Troubleshooting
### Common Issues
#### Port Conflicts
```bash
# Check for port conflicts
docker compose ps
netstat -tulpn | grep :8000
```
#### Database Connection Issues
```bash
# Check database container status
docker compose --profile postgres ps
# View database logs
docker compose --profile postgres logs postgres
```
#### Service Dependencies
```bash
# Ensure services start in correct order
docker compose up -d postgres # Start database first
docker compose up cognee # Then start main service
```
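Starting the database first does not guarantee it is accepting connections by the time the main service starts. One way to bridge the gap is a generic retry loop around a readiness check. The sketch below assumes `pg_isready` is available inside the `postgres` container; `retry` is a hypothetical helper name.

```bash
# Retry a command up to max_attempts times, sleeping delay seconds between tries.
# retry is a hypothetical helper, not part of Cognee or Docker.
retry() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1  # give up after the final failed attempt
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}

# Usage (commented out so sourcing this snippet has no side effects):
# docker compose up -d postgres
# retry 30 1 docker compose exec -T postgres pg_isready -U cognee -d cognee_db \
#   && docker compose up cognee
```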
### Debug Mode
#### Enable Debug Mode
1. Set `DEBUG=true` in your `.env` file or as an environment variable
2. Restart the service:
```bash
docker compose down
docker compose up
```
#### Attach Debugger
1. Start service in debug mode
2. Connect your IDE debugger to `localhost:5678`
3. Set breakpoints and debug as needed
### Log Analysis
```bash
# View all service logs
docker compose logs
# Follow logs in real-time
docker compose logs -f
# Service-specific logs
docker compose logs cognee
docker compose logs postgres
```
## Maintenance
### Container Management
```bash
# Stop all services
docker compose down
# Stop and remove volumes
docker compose down -v
# Rebuild containers after code changes
docker compose build
docker compose up --force-recreate
```
### Data Persistence
- **ChromaDB**: Data persisted in `.chromadb_data/` directory
- **PostgreSQL**: Data persisted in named volume `postgres_data`
- **Neo4j**: No explicit persistence configured (data is lost when the container is removed, for example by `docker compose down`)
### Updates and Rebuilds
```bash
# Pull latest images
docker compose pull
# Rebuild custom images
docker compose build --no-cache
# Update specific service
docker compose up -d --no-deps --build cognee
```
## Performance Optimization
### Resource Tuning
Adjust resource limits in `docker-compose.yml`:
```yaml
deploy:
resources:
limits:
cpus: "4.0" # Adjust based on available CPU
memory: 8GB # Adjust based on available RAM
```
### Database Optimization
- **PostgreSQL**: Consider tuning `shared_buffers` and `work_mem`
- **Neo4j**: Configure the maximum heap size via `NEO4J_server_memory_heap_max__size` (Neo4j 5 and later; Neo4j 4.x uses `NEO4J_dbms_memory_heap_max__size`)
- **ChromaDB**: Increase memory allocation for large datasets
## Integration Examples
### Local Development
```bash
# Start minimal setup for local development
docker compose up
# Add database when needed
docker compose --profile postgres up -d
```
### IDE Integration (Recommended)
```bash
# Start MCP server for Cursor/Claude Desktop
docker compose --profile mcp --profile postgres up -d
# Configure your IDE to connect to localhost:8000
```
### Production Deployment
```bash
# Production-ready setup
docker compose --profile postgres --profile chromadb up -d
# With environment overrides
ENVIRONMENT=prod LOG_LEVEL=ERROR docker compose up -d
```
## Alternative Configurations
### Helm Deployment
For Kubernetes deployment, use the Helm configuration:
- Location: `deployment/helm/`
- Simplified setup with just Cognee + PostgreSQL
- Kubernetes-native resource management
### Distributed Mode
For distributed processing:
- Uses separate `distributed/Dockerfile`
- Configured for Modal.com integration
- Environment: `COGNEE_DISTRIBUTED=true`
## Environment Templates
### Minimal Configuration
```bash
# .env (minimal setup)
LLM_API_KEY=your_openai_api_key_here
```
### Full Configuration
```bash
# .env (full setup)
# LLM Configuration
LLM_API_KEY=your_openai_api_key_here
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini
# Database Configuration
DB_PROVIDER=postgres
DB_HOST=host.docker.internal
DB_PORT=5432
DB_NAME=cognee_db
DB_USERNAME=cognee
DB_PASSWORD=cognee
# Vector Database
VECTOR_DB_KEY=your_chroma_auth_token
# Embedding Configuration
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-large
# Debug Configuration
DEBUG=false
LOG_LEVEL=INFO
```
## Best Practices
1. **Start Simple**: Begin with the basic setup and add profiles as needed
2. **Use Profiles**: Leverage profiles to avoid running unnecessary services
3. **Environment Files**: Always use `.env` files for configuration
4. **Resource Management**: Monitor resource usage and adjust limits accordingly
5. **Data Persistence**: Store important data on a bind mount or named volume
6. **Network Security**: Keep services on the isolated network
7. **Debug Safely**: Only enable debug mode in development environments
## Quick Start Commands
```bash
# Clone and setup
git clone <repository>
cd cognee
# Create environment file
cp .env.template .env # Edit with your configuration
# Basic start
docker compose up
# Full development environment
docker compose --profile postgres --profile chromadb up -d
# IDE integration
docker compose --profile mcp --profile postgres up -d
# Check status
docker compose ps
# View logs
docker compose logs -f
# Stop everything
docker compose down
```
This Docker Compose setup covers configurations ranging from a single API container backed by SQLite to multi-database deployments with PostgreSQL, Neo4j, FalkorDB, and ChromaDB.