# LightRAG Quick Start Guide

Get up and running with LightRAG in 5 minutes.

## What is LightRAG?

LightRAG is a **Graph-Enhanced Retrieval-Augmented Generation (RAG)** system that combines knowledge graphs with vector search to retrieve richer context than either technique alone.

```
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│   Document   │───▶│   LightRAG   │───▶│  Knowledge   │
│    Input     │    │   Indexing   │    │    Graph     │
└──────────────┘    └──────────────┘    └──────────────┘
                           │                   │
                           ▼                   ▼
                    ┌──────────────┐    ┌──────────────┐
                    │    Vector    │    │    Entity    │
                    │    Chunks    │    │  Relations   │
                    └──────────────┘    └──────────────┘
                           │                   │
                           └─────────┬─────────┘
                                     ▼
                             ┌──────────────┐
                             │    Hybrid    │
                             │    Query     │
                             └──────────────┘
```

---

## Installation

### Option 1: pip (Recommended)

```bash
# Basic installation
pip install lightrag-hku

# With API server
pip install "lightrag-hku[api]"

# With storage backends
pip install "lightrag-hku[postgres]"  # PostgreSQL
pip install "lightrag-hku[neo4j]"     # Neo4j
pip install "lightrag-hku[milvus]"    # Milvus
```

### Option 2: From Source

```bash
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG
pip install -e ".[api]"
```

### Option 3: Docker

```bash
docker pull ghcr.io/hkuds/lightrag:latest
docker run -p 9621:9621 -e OPENAI_API_KEY=sk-xxx ghcr.io/hkuds/lightrag:latest
```

---

## Quick Start (Python)

### 1. Basic Usage

```python
import asyncio
from lightrag import LightRAG, QueryParam

# Initialize LightRAG
rag = LightRAG(
    working_dir="./rag_storage",
    llm_model_name="gpt-4o-mini",
    embedding_model_name="text-embedding-ada-002"
)

async def main():
    # Insert documents
    await rag.ainsert("Marie Curie was a physicist who discovered radium. "
                      "She was born in Poland and later moved to France. "
                      "She won the Nobel Prize in Physics in 1903.")

    # Query with different modes
    result = await rag.aquery(
        "What did Marie Curie discover?",
        param=QueryParam(mode="hybrid")
    )
    print(result)

asyncio.run(main())
```

### 2. Insert Documents from Files

```python
import asyncio
from lightrag import LightRAG

# ainsert is a coroutine, so the calls below must run inside
# an async function.
async def main():
    rag = LightRAG(working_dir="./rag_storage")

    # Insert from file
    with open("document.txt", "r") as f:
        text = f.read()
    await rag.ainsert(text)

    # Insert from multiple files
    documents = ["doc1.txt", "doc2.txt", "doc3.txt"]
    for doc in documents:
        with open(doc, "r") as f:
            await rag.ainsert(f.read())

asyncio.run(main())
```

### 3. Query Modes Explained

```python
from lightrag import QueryParam

# The calls below run inside an async function (see "Basic Usage" above).

# NAIVE: Traditional vector search (fastest)
result = await rag.aquery("question", param=QueryParam(mode="naive"))

# LOCAL: Focuses on specific entities mentioned in the query
result = await rag.aquery("question", param=QueryParam(mode="local"))

# GLOBAL: Uses high-level summaries for broad questions
result = await rag.aquery("question", param=QueryParam(mode="global"))

# HYBRID: Combines local + global (recommended)
result = await rag.aquery("question", param=QueryParam(mode="hybrid"))

# MIX: Uses all modes and combines results
result = await rag.aquery("question", param=QueryParam(mode="mix"))
```

---

## Quick Start (REST API)

### 1. Start the Server

```bash
# Set your API key
export OPENAI_API_KEY=sk-xxx

# Start server
python -m lightrag.api.lightrag_server

# Server runs at http://localhost:9621
```

### 2. Insert Documents

```bash
# Insert text
curl -X POST http://localhost:9621/documents/text \
  -H "Content-Type: application/json" \
  -d '{"text": "Albert Einstein developed the theory of relativity."}'

# Upload file
curl -X POST http://localhost:9621/documents/file \
  -F "file=@document.pdf"
```

### 3. Query

```bash
# Query with hybrid mode
curl -X POST http://localhost:9621/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the theory of relativity?",
    "mode": "hybrid"
  }'
```
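
The same query from Python, as a minimal sketch using the third-party `requests` library against the endpoint shown above (the exact response shape depends on your server version):

```python
import requests  # pip install requests

# Endpoint and payload mirror the curl example above.
resp = requests.post(
    "http://localhost:9621/query",
    json={"query": "What is the theory of relativity?", "mode": "hybrid"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```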

### 4. View Knowledge Graph

```bash
# Get all entities
curl http://localhost:9621/graphs/entities

# Get entity relationships
curl http://localhost:9621/graphs/relations
```

---

## Configuration

### Environment Variables

```bash
# Required
OPENAI_API_KEY=sk-xxx                   # OpenAI API key

# LLM Configuration
LLM_MODEL=gpt-4o-mini                   # Model name
LLM_BINDING=openai                      # Provider: openai, ollama, anthropic

# Embedding Configuration
EMBEDDING_MODEL=text-embedding-ada-002
EMBEDDING_DIM=1536

# Server
PORT=9621                               # API server port
HOST=0.0.0.0                            # Bind address
```

### Using Different LLM Providers

```python
# OpenAI (default)
from lightrag.llm.openai import openai_complete_if_cache

rag = LightRAG(
    llm_model_name="gpt-4o-mini",
    llm_model_func=openai_complete_if_cache
)

# Ollama (local)
from lightrag.llm.ollama import ollama_model_complete

rag = LightRAG(
    llm_model_name="llama3",
    llm_model_func=ollama_model_complete,
    llm_model_kwargs={"host": "http://localhost:11434"}
)

# Anthropic Claude
from lightrag.llm.anthropic import anthropic_complete

rag = LightRAG(
    llm_model_name="claude-sonnet-4-20250514",
    llm_model_func=anthropic_complete
)
```

### Storage Backends

```python
# Default (file-based, great for development)
rag = LightRAG(working_dir="./rag_storage")

# PostgreSQL (production)
rag = LightRAG(
    working_dir="./rag_storage",
    kv_storage="PGKVStorage",
    vector_storage="PGVectorStorage",
    graph_storage="PGGraphStorage"
)

# Neo4j (advanced graph queries)
rag = LightRAG(
    working_dir="./rag_storage",
    graph_storage="Neo4JStorage"
)
```

---

## Common Patterns

### Pattern 1: Document Processing Pipeline

```python
import asyncio
from pathlib import Path
from lightrag import LightRAG

async def process_documents(folder_path: str):
    rag = LightRAG(working_dir="./rag_storage")

    # Process all text files
    for file_path in Path(folder_path).glob("*.txt"):
        print(f"Processing: {file_path}")
        with open(file_path, "r") as f:
            await rag.ainsert(f.read())

    print("All documents indexed!")

asyncio.run(process_documents("./documents"))
```

### Pattern 2: Conversational RAG

```python
import asyncio
from lightrag import LightRAG, QueryParam

async def chat():
    rag = LightRAG(working_dir="./rag_storage")

    while True:
        question = input("You: ")
        if question.lower() == "quit":
            break

        response = await rag.aquery(
            question,
            param=QueryParam(mode="hybrid")
        )
        print(f"Assistant: {response}")

asyncio.run(chat())
```

### Pattern 3: Batch Queries

```python
import asyncio
from lightrag import LightRAG, QueryParam

async def batch_query(questions: list):
    rag = LightRAG(working_dir="./rag_storage")

    results = []
    for question in questions:
        result = await rag.aquery(
            question,
            param=QueryParam(mode="hybrid")
        )
        results.append({"question": question, "answer": result})

    return results

questions = [
    "What is quantum computing?",
    "Who invented the telephone?",
    "What is machine learning?"
]

answers = asyncio.run(batch_query(questions))
```
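
The loop above runs queries sequentially. As a sketch only, `asyncio.gather` can overlap them, assuming your LLM backend's rate limits tolerate parallel requests and that concurrent `aquery` calls on one instance are safe in your LightRAG version:

```python
import asyncio
from lightrag import LightRAG, QueryParam

# batch_query_concurrent is a hypothetical helper, not part of LightRAG.
async def batch_query_concurrent(questions: list):
    rag = LightRAG(working_dir="./rag_storage")

    # Fire all queries at once and wait for every answer.
    answers = await asyncio.gather(
        *(rag.aquery(q, param=QueryParam(mode="hybrid")) for q in questions)
    )
    return [{"question": q, "answer": a} for q, a in zip(questions, answers)]
```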

---

## Verify Installation

```python
# Test script
import asyncio
import shutil
from lightrag import LightRAG, QueryParam

async def test():
    # Initialize
    rag = LightRAG(working_dir="./test_rag")

    # Insert test document
    await rag.ainsert("Python is a programming language created by Guido van Rossum.")

    # Query
    result = await rag.aquery("Who created Python?", param=QueryParam(mode="hybrid"))
    print(f"Answer: {result}")

    # Cleanup
    shutil.rmtree("./test_rag")

asyncio.run(test())
```

Expected output:

```
Answer: Python was created by Guido van Rossum.
```

---

## Next Steps

1. **[Architecture Overview](0002-architecture-overview.md)** - Understand how LightRAG works
2. **[API Reference](0003-api-reference.md)** - Complete REST API documentation
3. **[Storage Backends](0004-storage-backends.md)** - Configure production storage
4. **[LLM Integration](0005-llm-integration.md)** - Use different LLM providers
5. **[Deployment Guide](0006-deployment-guide.md)** - Deploy to production

---

## Troubleshooting

| Issue | Solution |
|-------|----------|
| `ModuleNotFoundError: lightrag` | Run `pip install lightrag-hku` |
| `OPENAI_API_KEY not set` | Export your API key: `export OPENAI_API_KEY=sk-xxx` |
| `Connection refused` on port 9621 | Check that the server is running: `python -m lightrag.api.lightrag_server` |
| Slow indexing | Use batch inserts (see the sketch below) or consider PostgreSQL for large datasets |
| Memory errors | Reduce the chunk size or use streaming mode |
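
For the slow-indexing and memory rows above, here is a minimal batch-indexing sketch. It assumes that `ainsert` accepts a list of strings and that `chunk_token_size` is the constructor parameter controlling chunk size; verify both against the LightRAG version you have installed.

```python
import asyncio
from pathlib import Path
from lightrag import LightRAG

async def batch_index(folder: str):
    # Assumption: chunk_token_size controls chunking; smaller chunks
    # lower peak memory during indexing at some cost in context.
    rag = LightRAG(working_dir="./rag_storage", chunk_token_size=600)

    # Assumption: ainsert accepts a list of strings, letting LightRAG
    # pipeline extraction across documents instead of one call per file.
    texts = [p.read_text() for p in Path(folder).glob("*.txt")]
    await rag.ainsert(texts)

asyncio.run(batch_index("./documents"))
```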

---

## Example Projects

```bash
# Clone and explore examples
git clone https://github.com/HKUDS/LightRAG.git
cd LightRAG/examples

# Run OpenAI demo
python lightrag_openai_demo.py

# Run Ollama demo (local LLM)
python lightrag_ollama_demo.py

# Visualize knowledge graph
python graph_visual_with_html.py
```

---

**Need Help?**

- GitHub Issues: https://github.com/HKUDS/LightRAG/issues
- Documentation: https://github.com/HKUDS/LightRAG/wiki