# LightRAG Offline Deployment Guide

This guide explains how to deploy LightRAG in offline environments where internet access is limited or unavailable.

## Table of Contents

- [Overview](#overview)
- [Quick Start](#quick-start)
- [Layered Dependencies](#layered-dependencies)
- [Tiktoken Cache Management](#tiktoken-cache-management)
- [Complete Offline Deployment Workflow](#complete-offline-deployment-workflow)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)
- [Additional Resources](#additional-resources)
- [Support](#support)

## Overview

LightRAG uses dynamic package installation (`pipmaster`) for optional features, triggered by the file types and configuration in use. In offline environments, these dynamic installations fail. This guide shows how to pre-install all necessary dependencies and cache files.

### What Gets Dynamically Installed?

LightRAG dynamically installs packages for:

- **Document Processing**: `docling`, `pypdf2`, `python-docx`, `python-pptx`, `openpyxl`
- **Storage Backends**: `redis`, `neo4j`, `pymilvus`, `pymongo`, `asyncpg`, `qdrant-client`
- **LLM Providers**: `openai`, `anthropic`, `ollama`, `zhipuai`, `aioboto3`, `voyageai`, `llama-index`, `lmdeploy`, `transformers`, `torch`
- **Tiktoken Models**: BPE encoding models downloaded from the OpenAI CDN

## Quick Start

Run the online steps on a machine with the same operating system, CPU architecture, and Python version as the offline target, because `pip download` fetches packages for the platform it runs on.

### Option 1: Using pip with Offline Extras

```bash
# Online environment: Install all offline dependencies
pip install "lightrag-hku[offline]"

# Download tiktoken cache
lightrag-download-cache

# Create offline package
# (-C "$HOME" stores the cache under a relative path so it extracts cleanly)
pip download "lightrag-hku[offline]" -d ./offline-packages
tar -czf lightrag-offline.tar.gz ./offline-packages -C "$HOME" .tiktoken_cache

# Transfer to offline server
scp lightrag-offline.tar.gz user@offline-server:/path/to/

# Offline environment: Install
tar -xzf lightrag-offline.tar.gz
pip install --no-index --find-links=./offline-packages "lightrag-hku[offline]"
export TIKTOKEN_CACHE_DIR="$PWD/.tiktoken_cache"
```

### Option 2: Using Requirements Files

```bash
# Online environment: Download packages
pip download -r requirements-offline.txt -d ./packages

# Transfer to offline server
tar -czf packages.tar.gz ./packages
scp packages.tar.gz user@offline-server:/path/to/

# Offline environment: Install
tar -xzf packages.tar.gz
pip install --no-index --find-links=./packages -r requirements-offline.txt
```

## Layered Dependencies

LightRAG provides dependency groups for different use cases, so you can install only what you need.

### Available Dependency Groups

| Group | Description | Use Case |
|-------|-------------|----------|
| `offline-docs` | Document processing | PDF, DOCX, PPTX, XLSX files |
| `offline-storage` | Storage backends | Redis, Neo4j, MongoDB, PostgreSQL, etc. |
| `offline-llm` | LLM providers | OpenAI, Anthropic, Ollama, etc. |
| `offline` | All of the above | Complete offline deployment |

### Installation Examples

```bash
# Install only document processing dependencies
pip install "lightrag-hku[offline-docs]"

# Install document processing and storage backends
pip install "lightrag-hku[offline-docs,offline-storage]"

# Install all offline dependencies
pip install "lightrag-hku[offline]"
```

### Using Individual Requirements Files

```bash
# Document processing only
pip install -r requirements-offline-docs.txt

# Storage backends only
pip install -r requirements-offline-storage.txt

# LLM providers only
pip install -r requirements-offline-llm.txt

# All offline dependencies
pip install -r requirements-offline.txt
```

## Tiktoken Cache Management

Tiktoken downloads BPE encoding models on first use. In offline environments, you must pre-download these models.

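Whether a cache directory is ready for offline use can be checked before any tokenizer is loaded. The following is a minimal sketch and not part of LightRAG: the helper name `cache_is_populated` is hypothetical, and it only confirms that the directory (by default the one named by `TIKTOKEN_CACHE_DIR`) exists and contains at least one file. It does not validate the contents of the cached files.

```python
import os
from pathlib import Path
from typing import Optional


def cache_is_populated(cache_dir: Optional[str] = None) -> bool:
    """Return True if the cache directory exists and holds at least one file.

    Hypothetical helper for illustration. Falls back to the
    TIKTOKEN_CACHE_DIR environment variable when no directory is given.
    """
    cache_dir = cache_dir or os.environ.get("TIKTOKEN_CACHE_DIR")
    if not cache_dir:
        # No directory passed and the environment variable is unset.
        return False
    path = Path(cache_dir).expanduser()
    if not path.is_dir():
        return False
    # Any regular file counts; file contents are not inspected.
    return any(entry.is_file() for entry in path.iterdir())


if __name__ == "__main__":
    print("tiktoken cache ready:", cache_is_populated())
```

Running the script on the offline server after setting `TIKTOKEN_CACHE_DIR` gives a quick yes/no answer without touching the network.
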
### Using the CLI Command

After installing LightRAG, use the built-in command:

```bash
# Download to default location (~/.tiktoken_cache)
lightrag-download-cache

# Download to specific directory
lightrag-download-cache --cache-dir ./tiktoken_cache

# Download specific models only
lightrag-download-cache --models gpt-4o-mini gpt-4
```

### Default Models Downloaded

- `gpt-4o-mini` (LightRAG default)
- `gpt-4o`
- `gpt-4`
- `gpt-3.5-turbo`
- `text-embedding-ada-002`
- `text-embedding-3-small`
- `text-embedding-3-large`

### Setting Cache Location in Offline Environment

```bash
# Option 1: Environment variable (temporary)
export TIKTOKEN_CACHE_DIR=/path/to/tiktoken_cache

# Option 2: Add to ~/.bashrc or ~/.zshrc (persistent)
echo 'export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache' >> ~/.bashrc
source ~/.bashrc

# Option 3: Copy the cache contents to the default location
mkdir -p ~/.tiktoken_cache
cp -r /path/to/tiktoken_cache/. ~/.tiktoken_cache/
```

## Complete Offline Deployment Workflow

### Step 1: Prepare in Online Environment

```bash
# 1. Install LightRAG with offline dependencies
pip install "lightrag-hku[offline]"

# 2. Download tiktoken cache
lightrag-download-cache --cache-dir ./offline_cache/tiktoken

# 3. Download all Python packages
pip download "lightrag-hku[offline]" -d ./offline_cache/packages

# 4. Create archive for transfer
tar -czf lightrag-offline-complete.tar.gz ./offline_cache

# 5. Verify contents
tar -tzf lightrag-offline-complete.tar.gz | head -20
```

### Step 2: Transfer to Offline Environment

```bash
# Using scp
scp lightrag-offline-complete.tar.gz user@offline-server:/tmp/

# Or using USB/physical media
# Copy lightrag-offline-complete.tar.gz to USB drive
```

### Step 3: Install in Offline Environment

```bash
# 1. Extract archive
cd /tmp
tar -xzf lightrag-offline-complete.tar.gz

# 2. Install Python packages
pip install --no-index \
    --find-links=/tmp/offline_cache/packages \
    "lightrag-hku[offline]"

# 3. Set up tiktoken cache
mkdir -p ~/.tiktoken_cache
cp -r /tmp/offline_cache/tiktoken/* ~/.tiktoken_cache/
export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache

# 4. Add to shell profile for persistence
echo 'export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache' >> ~/.bashrc
```

### Step 4: Verify Installation

```bash
# Test Python import
python -c "from lightrag import LightRAG; print('✓ LightRAG imported')"

# Test tiktoken
python -c "from lightrag.utils import TiktokenTokenizer; t = TiktokenTokenizer(); print('✓ Tiktoken working')"

# Test optional dependencies (if installed)
python -c "import docling; print('✓ Docling available')"
python -c "import redis; print('✓ Redis available')"
```

## Troubleshooting

### Issue: Tiktoken fails with network error

**Problem**: `Unable to load tokenizer for model gpt-4o-mini`

**Solution**:

```bash
# Ensure TIKTOKEN_CACHE_DIR is set
echo "$TIKTOKEN_CACHE_DIR"

# Verify cache files exist
ls -la ~/.tiktoken_cache/

# If empty, you need to download cache in online environment first
```

### Issue: Dynamic package installation fails

**Problem**: `Error installing package xxx`

**Solution**:

```bash
# Pre-install the specific package you need
# For document processing:
pip install "lightrag-hku[offline-docs]"

# For storage backends:
pip install "lightrag-hku[offline-storage]"

# For LLM providers:
pip install "lightrag-hku[offline-llm]"
```

### Issue: Missing dependencies at runtime

**Problem**: `ModuleNotFoundError: No module named 'xxx'`

**Solution**:

```bash
# Check what you have installed
pip list | grep -i xxx

# Install missing component
pip install "lightrag-hku[offline]"  # Install all offline deps
```

### Issue: Permission denied on tiktoken cache

**Problem**: `PermissionError: [Errno 13] Permission denied`

**Solution**:

```bash
# Ensure cache directory has correct permissions
chmod 755 ~/.tiktoken_cache
chmod 644 ~/.tiktoken_cache/*

# Or use a user-writable directory
export TIKTOKEN_CACHE_DIR=~/my_tiktoken_cache
mkdir -p ~/my_tiktoken_cache
```

## Best Practices

1. **Test in Online Environment First**: Always test your complete setup in an online environment before going offline.
2. **Keep Cache Updated**: Periodically update your offline cache when new models are released.
3. **Document Your Setup**: Keep notes on which optional dependencies you actually need.
4. **Version Pinning**: Consider pinning specific versions in production:

   ```bash
   pip freeze > requirements-production.txt
   ```

5. **Minimal Installation**: Only install what you need:

   ```bash
   # If you only process PDFs with OpenAI
   pip install "lightrag-hku[offline-docs]"

   # Then manually add:
   pip install openai
   ```

## Additional Resources

- [LightRAG GitHub Repository](https://github.com/HKUDS/LightRAG)
- [Docker Deployment Guide](./DockerDeployment.md)
- [API Documentation](../lightrag/api/README.md)

## Support

If you encounter issues not covered in this guide:

1. Check the [GitHub Issues](https://github.com/HKUDS/LightRAG/issues)
2. Review the [project documentation](../README.md)
3. Create a new issue with your offline deployment details
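
The deployment details for a new issue can be gathered with a short script. This is a minimal sketch using only the Python standard library; the function name `collect_env_report` and the package list are illustrative assumptions, not part of LightRAG, so adjust the list to the optional dependencies you actually use.

```python
import os
import platform
import sys
from importlib import metadata

# Illustrative subset of distributions named in this guide; adjust as needed.
PACKAGES = ("lightrag-hku", "tiktoken", "docling", "redis", "neo4j", "openai")


def collect_env_report(packages=PACKAGES) -> dict:
    """Collect interpreter, platform, cache setting, and package versions."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "TIKTOKEN_CACHE_DIR": os.environ.get("TIKTOKEN_CACHE_DIR", "<unset>"),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue shows at a glance which optional packages are missing on the offline machine.
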