9.3 KiB
OpenRAG Development Guide
A comprehensive guide for setting up and developing OpenRAG locally.
Table of Contents
- Architecture Overview
- Prerequisites
- Environment Setup
- Development Methods
- Local Development (Non-Docker)
- Docker Development
- API Documentation
- Troubleshooting
- Contributing
Architecture Overview
OpenRAG consists of four main services:
- Backend (
src/) - Python FastAPI/Starlette application with document processing, search, and chat - Frontend (
frontend/) - Next.js React application - OpenSearch - Document storage and vector search engine
- Langflow - AI workflow engine for chat functionality
Key Technologies
- Backend: Python 3.13+, Starlette, OpenAI, Docling, OpenSearch
- Frontend: Next.js 15, React 19, TypeScript, Tailwind CSS
- Dependencies: UV (Python), npm (Node.js)
- Containerization: Docker/Podman with Compose
Prerequisites
System Requirements
- Python: 3.13+ (for local development)
- Node.js: 18+ (for frontend development)
- Container Runtime: Docker or Podman with Compose
- Memory: 8GB+ RAM recommended (especially for GPU workloads)
Development Tools
# Python dependency manager
curl -LsSf https://astral.sh/uv/install.sh | sh
# Node.js (via nvm recommended)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
nvm install 18
nvm use 18
Environment Setup
1. Clone and Setup
git clone <repository-url>
cd openrag
2. Environment Variables
Create your environment configuration:
cp .env.example .env
Edit .env with your configuration:
# Required
OPENSEARCH_PASSWORD=your_secure_password
OPENAI_API_KEY=sk-your_openai_api_key
# Langflow Configuration
LANGFLOW_PUBLIC_URL=http://localhost:7860
LANGFLOW_SUPERUSER=admin
LANGFLOW_SUPERUSER_PASSWORD=your_langflow_password
LANGFLOW_SECRET_KEY=your_secret_key_min_32_chars
LANGFLOW_AUTO_LOGIN=true
LANGFLOW_NEW_USER_IS_ACTIVE=true
LANGFLOW_ENABLE_SUPERUSER_CLI=true
FLOW_ID=your_flow_id
# OAuth (Optional - for Google Drive/OneDrive connectors)
GOOGLE_OAUTH_CLIENT_ID=your_google_client_id
GOOGLE_OAUTH_CLIENT_SECRET=your_google_client_secret
MICROSOFT_GRAPH_OAUTH_CLIENT_ID=your_microsoft_client_id
MICROSOFT_GRAPH_OAUTH_CLIENT_SECRET=your_microsoft_client_secret
# Webhooks (Optional)
WEBHOOK_BASE_URL=https://your-domain.com
# AWS S3 (Optional)
AWS_ACCESS_KEY_ID=your_aws_key
AWS_SECRET_ACCESS_KEY=your_aws_secret
Development Methods
Choose your preferred development approach:
Local Development (Non-Docker)
Best for rapid development and debugging.
Backend Setup
# Install Python dependencies
uv sync
# Start OpenSearch (required dependency)
docker run -d \
--name opensearch-dev \
-p 9200:9200 \
-p 9600:9600 \
-e "discovery.type=single-node" \
-e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=admin123" \
opensearchproject/opensearch:3.0.0
# Start backend
cd src
uv run python main.py
Backend will be available at: http://localhost:8000
Frontend Setup
# Install Node.js dependencies
cd frontend
npm install
# Start development server
npm run dev
Frontend will be available at: http://localhost:3000
Langflow Setup (Optional)
# Install and run Langflow
pip install langflow
langflow run --host 0.0.0.0 --port 7860
Langflow will be available at: http://localhost:7860
Docker Development
Use this for a production-like environment or when you need all services.
Available Compose Files
docker-compose-dev.yml- Development (builds from source)docker-compose.yml- Production (pre-built images)docker-compose-cpu.yml- CPU-only version
Development with Docker
# Build and start all services
docker compose -f docker-compose-dev.yml up --build
# Or with Podman
podman compose -f docker-compose-dev.yml up --build
# Run in background
docker compose -f docker-compose-dev.yml up --build -d
# View logs
docker compose -f docker-compose-dev.yml logs -f
# Stop services
docker compose -f docker-compose-dev.yml down
Service Ports
- Frontend: http://localhost:3000
- Backend: http://localhost:8000 (internal)
- OpenSearch: http://localhost:9200
- OpenSearch Dashboards: http://localhost:5601
- Langflow: http://localhost:7860
Reset Development Environment
# Complete reset (removes volumes and rebuilds)
docker compose -f docker-compose-dev.yml down -v
docker compose -f docker-compose-dev.yml up --build --force-recreate --remove-orphans
API Documentation
Key Endpoints
| Endpoint | Method | Description |
|---|---|---|
/search |
POST | Search documents with filters |
/upload |
POST | Upload documents |
/upload_path |
POST | Upload from local path |
/tasks |
GET | List processing tasks |
/tasks/{id} |
GET | Get task status |
/connectors |
GET | List available connectors |
/auth/me |
GET | Get current user info |
/knowledge-filter |
POST/GET | Manage knowledge filters |
Example API Calls
# Search all documents
curl -X POST http://localhost:8000/search \
-H "Content-Type: application/json" \
-d '{"query": "*", "limit": 100}'
# Upload a document
curl -X POST http://localhost:8000/upload \
-F "file=@document.pdf"
# Get task status
curl http://localhost:8000/tasks/task_id_here
Frontend API Proxy
The Next.js frontend proxies API calls through /api/* to the backend at http://openrag-backend:8000 (in Docker) or http://localhost:8000 (local).
Troubleshooting
Common Issues
Docker/Podman Issues
Issue: docker: command not found
# Install Docker Desktop or use Podman
brew install podman podman-desktop
podman machine init --memory 8192
podman machine start
Issue: Out of memory during build
# For Podman on macOS
podman machine stop
podman machine rm
podman machine init --memory 8192
podman machine start
Backend Issues
Issue: ModuleNotFoundError or dependency issues
# Ensure you're using the right Python version
python --version # Should be 3.13+
uv sync --reinstall
Issue: OpenSearch connection failed
# Check if OpenSearch is running
curl -k -u admin:admin123 https://localhost:9200
# If using Docker, ensure the container is running
docker ps | grep opensearch
Issue: CUDA/GPU not detected
# Check GPU availability
python -c "import torch; print(torch.cuda.is_available())"
# For CPU-only development, use docker-compose-cpu.yml
Frontend Issues
Issue: Next.js build failures
# Clear cache and reinstall
cd frontend
rm -rf .next node_modules package-lock.json
npm install
npm run dev
Issue: API calls failing
- Check that backend is running on port 8000
- Verify environment variables are set correctly
- Check browser network tab for CORS or proxy issues
Document Processing Issues
Issue: Docling model download failures
# Pre-download models
uv run docling-tools models download
# Or clear cache and retry
rm -rf ~/.cache/docling
Issue: EasyOCR initialization errors
# Clear EasyOCR cache
rm -rf ~/.EasyOCR
# Restart the backend to reinitialize
Development Tips
-
Hot Reloading:
- Backend: Use
uvicorn src.main:app --reloadfor auto-restart - Frontend:
npm run devprovides hot reloading
- Backend: Use
-
Debugging:
- Add
print()statements or usepdb.set_trace()in Python - Use browser dev tools for frontend debugging
- Check Docker logs:
docker compose logs -f service_name
- Add
-
Database Inspection:
- Access OpenSearch Dashboards at http://localhost:5601
- Use curl to query OpenSearch directly
- Check the
documentsindex for uploaded content
-
Performance:
- GPU processing is much faster for document processing
- Use CPU-only mode if GPU issues occur
- Monitor memory usage with
docker statsorhtop
Log Locations
- Backend: Console output or container logs
- Frontend: Browser console and Next.js terminal
- OpenSearch: Container logs (
docker compose logs opensearch) - Langflow: Container logs (
docker compose logs langflow)
Contributing
Code Style
- Python: Follow PEP 8, use
blackfor formatting - TypeScript: Use ESLint configuration in
frontend/ - Commits: Use conventional commit messages
Development Workflow
- Create feature branch from
main - Make changes and test locally
- Run tests (if available)
- Create pull request with description
- Ensure all checks pass
Testing
# Backend tests (if available)
cd src
uv run pytest
# Frontend tests (if available)
cd frontend
npm test
# Integration tests with Docker
docker compose -f docker-compose-dev.yml up --build
# Test API endpoints manually or with automated tests
Additional Resources
- OpenSearch Documentation
- Langflow Documentation
- Next.js Documentation
- Starlette Documentation
- Docling Documentation
For questions or issues, please check the troubleshooting section above or create an issue in the repository.