Compare commits
1 commit
main
...
cursor/doc
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
a48337415f |
1 changed files with 610 additions and 0 deletions
610
docker-compose-documentation.md
Normal file
610
docker-compose-documentation.md
Normal file
|
|
@ -0,0 +1,610 @@
|
|||
# Cognee Docker Compose Documentation
|
||||
|
||||
## Overview
|
||||
|
||||
Cognee uses Docker Compose to orchestrate multiple services that work together to provide AI memory capabilities. The setup is designed with a modular architecture using Docker Compose profiles, allowing you to run only the services you need for your specific use case.
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
The Docker Compose setup consists of several key components:
|
||||
|
||||
- **Core Services**: Main backend API and optional MCP server
|
||||
- **Database Services**: Multiple database options (PostgreSQL, Neo4j, FalkorDB, ChromaDB)
|
||||
- **Frontend Service**: Next.js web interface (work in progress)
|
||||
- **Network**: Shared network for inter-service communication
|
||||
- **Profiles**: Optional service groups for different deployment scenarios
|
||||
|
||||
## Service Architecture
|
||||
|
||||
```
|
||||
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
|
||||
│ Frontend │ │ Cognee API │ │ Cognee MCP │
|
||||
│ (Next.js) │ │ (Backend) │ │ Server │
|
||||
│ Port: 3000 │ │ Port: 8000 │ │ Port: 8000 │
|
||||
└─────────────────┘ └─────────────────┘ └─────────────────┘
|
||||
│ │ │
|
||||
└───────────────────────┼───────────────────────┘
|
||||
│
|
||||
┌─────────────────┐
|
||||
│ cognee-network │
|
||||
└─────────────────┘
|
||||
│
|
||||
┌───────────┬─────────────┬─────────────┬─────────────┐
|
||||
│ │ │ │ │
|
||||
┌─────────┐ ┌─────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────┐
|
||||
│PostgreSQL│ │ Neo4j │ │ FalkorDB │ │ ChromaDB │ │ ... │
|
||||
│Port: 5432│ │Port: 7474│ │Port: 6379 │ │Port: 3002 │ │ │
|
||||
└─────────┘ └─────────┘ └─────────────┘ └─────────────┘ └─────────┘
|
||||
```
|
||||
|
||||
## Services Overview
|
||||
|
||||
### Core Services
|
||||
|
||||
#### 1. Cognee (Main Backend API)
|
||||
- **Container Name**: `cognee`
|
||||
- **Build Context**: Root directory
|
||||
- **Ports**:
|
||||
- `8000:8000` (HTTP API)
|
||||
- `5678:5678` (Debug port)
|
||||
- **Purpose**: Main Cognee backend API server
|
||||
- **Resources**: 4 CPUs, 8GB RAM
|
||||
|
||||
#### 2. Cognee MCP Server
|
||||
- **Container Name**: `cognee-mcp`
|
||||
- **Profile**: `mcp`
|
||||
- **Build Context**: Root directory (using `cognee-mcp/Dockerfile`)
|
||||
- **Ports**:
|
||||
- `8000:8000` (MCP HTTP/SSE)
|
||||
- `5678:5678` (Debug port)
|
||||
- **Purpose**: Model Context Protocol server for IDE integration (Cursor, Claude Desktop, VS Code)
|
||||
- **Resources**: 2 CPUs, 4GB RAM
|
||||
|
||||
#### 3. Frontend
|
||||
- **Container Name**: `frontend`
|
||||
- **Profile**: `ui`
|
||||
- **Build Context**: `./cognee-frontend`
|
||||
- **Port**: `3000:3000`
|
||||
- **Purpose**: Next.js web interface (work in progress)
|
||||
- **Note**: Limited functionality - prefer MCP integration for full features
|
||||
|
||||
### Database Services (All Optional)
|
||||
|
||||
#### 1. PostgreSQL with pgvector
|
||||
- **Container Name**: `postgres`
|
||||
- **Profile**: `postgres`
|
||||
- **Image**: `pgvector/pgvector:pg17`
|
||||
- **Port**: `5432:5432`
|
||||
- **Purpose**: Relational database with vector extensions
|
||||
- **Credentials**: `cognee/cognee` (user/password)
|
||||
- **Database**: `cognee_db`
|
||||
|
||||
#### 2. Neo4j
|
||||
- **Container Name**: `neo4j`
|
||||
- **Profile**: `neo4j`
|
||||
- **Image**: `neo4j:latest`
|
||||
- **Ports**:
|
||||
- `7474:7474` (HTTP interface)
|
||||
- `7687:7687` (Bolt protocol)
|
||||
- **Purpose**: Graph database
|
||||
- **Credentials**: `neo4j/pleaseletmein`
|
||||
- **Plugins**: APOC, Graph Data Science
|
||||
|
||||
#### 3. FalkorDB
|
||||
- **Container Name**: `falkordb`
|
||||
- **Profile**: `falkordb`
|
||||
- **Image**: `falkordb/falkordb:edge`
|
||||
- **Ports**:
|
||||
- `6379:6379` (Redis-compatible interface)
|
||||
- `3001:3000` (Web interface)
|
||||
- **Purpose**: Graph database with Redis interface
|
||||
|
||||
#### 4. ChromaDB
|
||||
- **Container Name**: `chromadb`
|
||||
- **Profile**: `chromadb`
|
||||
- **Image**: `chromadb/chroma:0.6.3`
|
||||
- **Port**: `3002:8000`
|
||||
- **Purpose**: Vector database
|
||||
- **Authentication**: Token-based (requires `VECTOR_DB_KEY`)
|
||||
- **Persistence**: Enabled with local volume
|
||||
|
||||
## Docker Compose Profiles
|
||||
|
||||
Profiles allow you to selectively run services based on your needs:
|
||||
|
||||
### Available Profiles
|
||||
|
||||
| Profile | Services | Use Case |
|
||||
|---------|----------|----------|
|
||||
| **(default)** | `cognee` only | Basic API server with SQLite |
|
||||
| `mcp` | `cognee-mcp` | IDE integration (Cursor/Claude Desktop) |
|
||||
| `ui` | `frontend` | Web interface (limited functionality) |
|
||||
| `postgres` | `postgres` | PostgreSQL database |
|
||||
| `neo4j` | `neo4j` | Graph database |
|
||||
| `falkordb` | `falkordb` | Alternative graph database |
|
||||
| `chromadb` | `chromadb` | Vector database |
|
||||
|
||||
### Profile Usage Examples
|
||||
|
||||
```bash
|
||||
# Basic API server only
|
||||
docker compose up
|
||||
|
||||
# API server + PostgreSQL
|
||||
docker compose --profile postgres up
|
||||
|
||||
# API server + Neo4j + ChromaDB
|
||||
docker compose --profile neo4j --profile chromadb up
|
||||
|
||||
# MCP server + PostgreSQL (for IDE integration)
|
||||
docker compose --profile mcp --profile postgres up
|
||||
|
||||
# Full stack with UI
|
||||
docker compose --profile ui --profile postgres up
|
||||
|
||||
# All services
|
||||
docker compose --profile mcp --profile ui --profile postgres --profile neo4j --profile chromadb up
|
||||
```
|
||||
|
||||
## Environment Configuration
|
||||
|
||||
### Required Environment File
|
||||
|
||||
Create a `.env` file in the root directory with your configuration:
|
||||
|
||||
```bash
|
||||
# Core LLM Configuration
|
||||
LLM_API_KEY=your_openai_api_key_here
|
||||
LLM_PROVIDER=openai
|
||||
LLM_MODEL=gpt-4o-mini
|
||||
|
||||
# Database Configuration (when using external databases)
|
||||
DB_PROVIDER=postgres # or sqlite, neo4j, etc.
|
||||
DB_HOST=localhost
|
||||
DB_PORT=5432
|
||||
DB_NAME=cognee_db
|
||||
DB_USERNAME=cognee
|
||||
DB_PASSWORD=cognee
|
||||
|
||||
# Vector Database (when using ChromaDB)
|
||||
VECTOR_DB_KEY=your_chroma_auth_token
|
||||
|
||||
# Optional: Embedding Configuration
|
||||
EMBEDDING_PROVIDER=openai
|
||||
EMBEDDING_MODEL=text-embedding-3-large
|
||||
```
|
||||
|
||||
### Environment Variables by Service
|
||||
|
||||
#### Cognee (Main API)
|
||||
- `DEBUG`: Enable/disable debug mode
|
||||
- `HOST`: Bind host (default: 0.0.0.0)
|
||||
- `ENVIRONMENT`: Deployment environment (local/dev/prod)
|
||||
- `LOG_LEVEL`: Logging level (ERROR/INFO/DEBUG)
|
||||
|
||||
#### Cognee MCP Server
|
||||
- `TRANSPORT_MODE`: Communication protocol (stdio/sse/http)
|
||||
- `MCP_LOG_LEVEL`: MCP-specific logging level
|
||||
- Database configuration (inherits from main service)
|
||||
|
||||
#### Database Services
|
||||
- **PostgreSQL**: `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`
|
||||
- **Neo4j**: `NEO4J_AUTH`, `NEO4J_PLUGINS`
|
||||
- **ChromaDB**: Authentication and persistence settings
|
||||
|
||||
## Container Build Process
|
||||
|
||||
### Multi-Stage Build Strategy
|
||||
|
||||
Both main services use multi-stage Docker builds for optimization:
|
||||
|
||||
#### Stage 1: Dependency Installation (UV-based)
|
||||
```dockerfile
|
||||
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim AS uv
|
||||
# Install system dependencies and Python packages
|
||||
# Uses UV for fast dependency resolution
|
||||
```
|
||||
|
||||
#### Stage 2: Runtime Environment
|
||||
```dockerfile
|
||||
FROM python:3.12-slim-bookworm
|
||||
# Lightweight runtime with only necessary components
|
||||
# Copy dependencies from build stage
|
||||
```
|
||||
|
||||
### Build Context and Caching
|
||||
|
||||
- **Main Service**: Uses root directory context, installs with multiple extras
|
||||
- **MCP Service**: Uses root context but builds from `cognee-mcp/` subdirectory
|
||||
- **Frontend**: Independent Node.js build in `cognee-frontend/`
|
||||
|
||||
## Networking
|
||||
|
||||
### Shared Network
|
||||
All services communicate through the `cognee-network` Docker network:
|
||||
|
||||
```yaml
|
||||
networks:
|
||||
cognee-network:
|
||||
name: cognee-network
|
||||
```
|
||||
|
||||
### Inter-Service Communication
|
||||
- Services can reach each other using container names as hostnames
|
||||
- External access through mapped ports
|
||||
- `host.docker.internal` for accessing host machine services
|
||||
|
||||
## Volume Management
|
||||
|
||||
### Application Code Volumes (Development)
|
||||
```yaml
|
||||
volumes:
|
||||
- ./cognee:/app/cognee # Main API code
|
||||
- ./cognee-frontend/src:/app/src # Frontend source
|
||||
- .env:/app/.env # Environment configuration
|
||||
```
|
||||
|
||||
### Persistent Data Volumes
|
||||
```yaml
|
||||
volumes:
|
||||
- .chromadb_data/:/chroma/chroma/ # ChromaDB persistence
|
||||
- postgres_data:/var/lib/postgresql/data # PostgreSQL data
|
||||
```
|
||||
|
||||
## Startup Process
|
||||
|
||||
### 1. Database Migrations
|
||||
Both main services run Alembic migrations on startup:
|
||||
|
||||
```bash
|
||||
alembic upgrade head
|
||||
```
|
||||
|
||||
**Error Handling**: Special handling for `UserAlreadyExists` errors during default user creation - allows safe container restarts.
|
||||
|
||||
### 2. Service Initialization
|
||||
|
||||
#### Main API Service
|
||||
- Runs Gunicorn with Uvicorn workers
|
||||
- Development mode: Hot reloading enabled
|
||||
- Production mode: Optimized for performance
|
||||
- Debug mode: Debugpy integration on port 5678
|
||||
|
||||
#### MCP Service
|
||||
- Supports multiple transport modes (stdio/sse/http)
|
||||
- Configurable via `TRANSPORT_MODE` environment variable
|
||||
- Debug support with port 5678
|
||||
|
||||
## Resource Allocation
|
||||
|
||||
### CPU and Memory Limits
|
||||
|
||||
| Service | CPU Limit | Memory Limit | Rationale |
|
||||
|---------|-----------|--------------|-----------|
|
||||
| cognee | 4.0 cores | 8GB | Main processing service |
|
||||
| cognee-mcp | 2.0 cores | 4GB | Lighter MCP operations |
|
||||
| frontend | unlimited | unlimited | Development convenience |
|
||||
| databases | unlimited | unlimited | Database-specific needs |
|
||||
|
||||
## Development Features
|
||||
|
||||
### Debug Support
|
||||
- **Port 5678**: Debugpy integration for both main services
|
||||
- **Environment Variable**: Set `DEBUG=true` to enable
|
||||
- **Wait for Client**: Debugger waits for IDE attachment
|
||||
|
||||
### Hot Reloading
|
||||
- **API**: Gunicorn reload mode in development
|
||||
- **Frontend**: Next.js development server with file watching
|
||||
- **Volume Mounts**: Live code synchronization
|
||||
|
||||
### Development vs Production
|
||||
|
||||
#### Development Mode (`ENVIRONMENT=dev/local`)
|
||||
- Hot reloading enabled
|
||||
- Debug logging
|
||||
- Single worker processes
|
||||
- Extended timeouts
|
||||
|
||||
#### Production Mode
|
||||
- Multiple workers (configurable)
|
||||
- Error-level logging only
|
||||
- Optimized performance settings
|
||||
- No hot reloading
|
||||
|
||||
## Usage Patterns
|
||||
|
||||
### 1. Basic Development Setup
|
||||
```bash
|
||||
# Start with basic API + SQLite
|
||||
docker compose up
|
||||
|
||||
# View logs
|
||||
docker compose logs -f cognee
|
||||
```
|
||||
|
||||
### 2. Full Development Environment
|
||||
```bash
|
||||
# Start with PostgreSQL database
|
||||
docker compose --profile postgres up -d
|
||||
|
||||
# Add vector database for embeddings
|
||||
docker compose --profile postgres --profile chromadb up -d
|
||||
```
|
||||
|
||||
### 3. IDE Integration Development
|
||||
```bash
|
||||
# Start MCP server for Cursor/Claude Desktop integration
|
||||
docker compose --profile mcp --profile postgres up -d
|
||||
|
||||
# Check MCP server status
|
||||
docker compose logs cognee-mcp
|
||||
```
|
||||
|
||||
### 4. UI Development
|
||||
```bash
|
||||
# Start with frontend for web interface testing
|
||||
docker compose --profile ui --profile postgres up -d
|
||||
|
||||
# Access frontend at http://localhost:3000
|
||||
```
|
||||
|
||||
### 5. Graph Analysis Setup
|
||||
```bash
|
||||
# Start with graph database for complex relationships
|
||||
docker compose --profile neo4j --profile postgres up -d
|
||||
|
||||
# Access Neo4j browser at http://localhost:7474
|
||||
```
|
||||
|
||||
## Port Mapping
|
||||
|
||||
| Service | Internal Port | External Port | Purpose |
|
||||
|---------|---------------|---------------|---------|
|
||||
| cognee | 8000 | 8000 | HTTP API |
|
||||
| cognee | 5678 | 5678 | Debug |
|
||||
| cognee-mcp | 8000 | 8000 | MCP HTTP/SSE |
|
||||
| cognee-mcp | 5678 | 5678 | Debug |
|
||||
| frontend | 3000 | 3000 | Web UI |
|
||||
| postgres | 5432 | 5432 | Database |
|
||||
| neo4j | 7474 | 7474 | Web interface |
|
||||
| neo4j | 7687 | 7687 | Bolt protocol |
|
||||
| falkordb | 6379 | 6379 | Redis interface |
|
||||
| falkordb | 3000 | 3001 | Web interface |
|
||||
| chromadb | 8000 | 3002 | Vector DB API |
|
||||
|
||||
## Security Considerations
|
||||
|
||||
### Database Authentication
|
||||
- **PostgreSQL**: Default credentials (`cognee/cognee`)
|
||||
- **Neo4j**: Default credentials (`neo4j/pleaseletmein`)
|
||||
- **ChromaDB**: Token-based authentication via `VECTOR_DB_KEY`
|
||||
|
||||
### Network Isolation
|
||||
- All services communicate through isolated `cognee-network`
|
||||
- External access only through explicitly mapped ports
|
||||
- `host.docker.internal` for secure host access
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
#### Port Conflicts
|
||||
```bash
|
||||
# Check for port conflicts
|
||||
docker compose ps
|
||||
netstat -tulpn | grep :8000
|
||||
```
|
||||
|
||||
#### Database Connection Issues
|
||||
```bash
|
||||
# Check database container status
|
||||
docker compose --profile postgres ps
|
||||
|
||||
# View database logs
|
||||
docker compose --profile postgres logs postgres
|
||||
```
|
||||
|
||||
#### Service Dependencies
|
||||
```bash
|
||||
# Ensure services start in correct order
|
||||
docker compose up -d postgres # Start database first
|
||||
docker compose up cognee # Then start main service
|
||||
```
|
||||
|
||||
### Debug Mode
|
||||
|
||||
#### Enable Debug Mode
|
||||
1. Set `DEBUG=true` in your `.env` file or as environment variable
|
||||
2. Restart the service:
|
||||
```bash
|
||||
docker compose down
|
||||
docker compose up
|
||||
```
|
||||
|
||||
#### Attach Debugger
|
||||
1. Start service in debug mode
|
||||
2. Connect your IDE debugger to `localhost:5678`
|
||||
3. Set breakpoints and debug as needed
|
||||
|
||||
### Log Analysis
|
||||
```bash
|
||||
# View all service logs
|
||||
docker compose logs
|
||||
|
||||
# Follow logs in real-time
|
||||
docker compose logs -f
|
||||
|
||||
# Service-specific logs
|
||||
docker compose logs cognee
|
||||
docker compose logs postgres
|
||||
```
|
||||
|
||||
## Maintenance
|
||||
|
||||
### Container Management
|
||||
```bash
|
||||
# Stop all services
|
||||
docker compose down
|
||||
|
||||
# Stop and remove volumes
|
||||
docker compose down -v
|
||||
|
||||
# Rebuild containers after code changes
|
||||
docker compose build
|
||||
docker compose up --force-recreate
|
||||
```
|
||||
|
||||
### Data Persistence
|
||||
- **ChromaDB**: Data persisted in `.chromadb_data/` directory
|
||||
- **PostgreSQL**: Data persisted in named volume `postgres_data`
|
||||
- **Neo4j**: No explicit persistence configured (data lost on container restart)
|
||||
|
||||
### Updates and Rebuilds
|
||||
```bash
|
||||
# Pull latest images
|
||||
docker compose pull
|
||||
|
||||
# Rebuild custom images
|
||||
docker compose build --no-cache
|
||||
|
||||
# Update specific service
|
||||
docker compose up -d --no-deps --build cognee
|
||||
```
|
||||
|
||||
## Performance Optimization
|
||||
|
||||
### Resource Tuning
|
||||
Adjust resource limits in `docker-compose.yml`:
|
||||
|
||||
```yaml
|
||||
deploy:
|
||||
resources:
|
||||
limits:
|
||||
cpus: "4.0" # Adjust based on available CPU
|
||||
memory: 8GB # Adjust based on available RAM
|
||||
```
|
||||
|
||||
### Database Optimization
|
||||
- **PostgreSQL**: Consider shared_buffers and work_mem tuning
|
||||
- **Neo4j**: Configure heap size via NEO4J_dbms_memory_heap_max_size
|
||||
- **ChromaDB**: Increase memory allocation for large datasets
|
||||
|
||||
## Integration Examples
|
||||
|
||||
### Local Development
|
||||
```bash
|
||||
# Start minimal setup for local development
|
||||
docker compose up
|
||||
|
||||
# Add database when needed
|
||||
docker compose --profile postgres up -d
|
||||
```
|
||||
|
||||
### IDE Integration (Recommended)
|
||||
```bash
|
||||
# Start MCP server for Cursor/Claude Desktop
|
||||
docker compose --profile mcp --profile postgres up -d
|
||||
|
||||
# Configure your IDE to connect to localhost:8000
|
||||
```
|
||||
|
||||
### Production Deployment
|
||||
```bash
|
||||
# Production-ready setup
|
||||
docker compose --profile postgres --profile chromadb up -d
|
||||
|
||||
# With environment overrides
|
||||
ENVIRONMENT=production LOG_LEVEL=ERROR docker compose up -d
|
||||
```
|
||||
|
||||
## Alternative Configurations
|
||||
|
||||
### Helm Deployment
|
||||
For Kubernetes deployment, use the Helm configuration:
|
||||
- Location: `deployment/helm/`
|
||||
- Simplified setup with just Cognee + PostgreSQL
|
||||
- Kubernetes-native resource management
|
||||
|
||||
### Distributed Mode
|
||||
For distributed processing:
|
||||
- Uses separate `distributed/Dockerfile`
|
||||
- Configured for Modal.com integration
|
||||
- Environment: `COGNEE_DISTRIBUTED=true`
|
||||
|
||||
## Environment Templates
|
||||
|
||||
### Minimal Configuration
|
||||
```bash
|
||||
# .env (minimal setup)
|
||||
LLM_API_KEY=your_openai_api_key_here
|
||||
```
|
||||
|
||||
### Full Configuration
|
||||
```bash
|
||||
# .env (full setup)
|
||||
# LLM Configuration
|
||||
LLM_API_KEY=your_openai_api_key_here
|
||||
LLM_PROVIDER=openai
|
||||
LLM_MODEL=gpt-4o-mini
|
||||
|
||||
# Database Configuration
|
||||
DB_PROVIDER=postgres
|
||||
DB_HOST=host.docker.internal
|
||||
DB_PORT=5432
|
||||
DB_NAME=cognee_db
|
||||
DB_USERNAME=cognee
|
||||
DB_PASSWORD=cognee
|
||||
|
||||
# Vector Database
|
||||
VECTOR_DB_KEY=your_chroma_auth_token
|
||||
|
||||
# Embedding Configuration
|
||||
EMBEDDING_PROVIDER=openai
|
||||
EMBEDDING_MODEL=text-embedding-3-large
|
||||
|
||||
# Debug Configuration
|
||||
DEBUG=false
|
||||
LOG_LEVEL=INFO
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Start Simple**: Begin with the basic setup and add profiles as needed
|
||||
2. **Use Profiles**: Leverage profiles to avoid running unnecessary services
|
||||
3. **Environment Files**: Always use `.env` files for configuration
|
||||
4. **Resource Management**: Monitor resource usage and adjust limits accordingly
|
||||
5. **Data Persistence**: Ensure important data is properly mounted or volumed
|
||||
6. **Network Security**: Keep services on the isolated network
|
||||
7. **Debug Safely**: Only enable debug mode in development environments
|
||||
|
||||
## Quick Start Commands
|
||||
|
||||
```bash
|
||||
# Clone and setup
|
||||
git clone <repository>
|
||||
cd cognee
|
||||
|
||||
# Create environment file
|
||||
cp .env.template .env # Edit with your configuration
|
||||
|
||||
# Basic start
|
||||
docker compose up
|
||||
|
||||
# Full development environment
|
||||
docker compose --profile postgres --profile chromadb up -d
|
||||
|
||||
# IDE integration
|
||||
docker compose --profile mcp --profile postgres up -d
|
||||
|
||||
# Check status
|
||||
docker compose ps
|
||||
|
||||
# View logs
|
||||
docker compose logs -f
|
||||
|
||||
# Stop everything
|
||||
docker compose down
|
||||
```
|
||||
|
||||
This Docker Compose setup provides a flexible, scalable foundation for running Cognee in various configurations, from simple development setups to complex multi-database deployments.
|
||||
Loading…
Add table
Reference in a new issue