LightRAG/docs/PromptCustomization.md
2025-11-11 21:50:13 +07:00

# Prompt Customization Guide
LightRAG lets you customize every prompt used in the system through Markdown files.
## 📁 Prompt Location
All prompts are stored in the following directory:
```
lightrag/prompts/
├── entity_extraction_system_prompt.md
├── entity_extraction_user_prompt.md
├── entity_continue_extraction_user_prompt.md
├── entity_extraction_example_1.md
├── entity_extraction_example_2.md
├── entity_extraction_example_3.md
├── summarize_entity_descriptions.md
├── rag_response.md
├── naive_rag_response.md
├── keywords_extraction.md
├── keywords_extraction_example_1.md
├── keywords_extraction_example_2.md
├── keywords_extraction_example_3.md
├── kg_query_context.md
├── naive_query_context.md
├── fail_response.md
└── README.md
```
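Before editing, it can help to confirm that every file in the listing above is actually present. The following is a minimal sketch, not part of LightRAG: `EXPECTED_PROMPT_FILES` and `missing_prompt_files` are hypothetical names, and the file list is copied from the tree above (`README.md` omitted).

```python
from pathlib import Path

# File names copied from the directory listing above (README.md omitted).
EXPECTED_PROMPT_FILES = [
    "entity_extraction_system_prompt.md",
    "entity_extraction_user_prompt.md",
    "entity_continue_extraction_user_prompt.md",
    "entity_extraction_example_1.md",
    "entity_extraction_example_2.md",
    "entity_extraction_example_3.md",
    "summarize_entity_descriptions.md",
    "rag_response.md",
    "naive_rag_response.md",
    "keywords_extraction.md",
    "keywords_extraction_example_1.md",
    "keywords_extraction_example_2.md",
    "keywords_extraction_example_3.md",
    "kg_query_context.md",
    "naive_query_context.md",
    "fail_response.md",
]


def missing_prompt_files(prompt_dir: Path) -> list[str]:
    """Return the expected prompt files that are absent from prompt_dir."""
    return [
        name for name in EXPECTED_PROMPT_FILES
        if not (prompt_dir / name).is_file()
    ]
```

Calling `missing_prompt_files(Path("lightrag/prompts"))` from the repository root should return an empty list when the directory matches the listing.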
## 🔧 How to Customize
### Local Development
1. **Open the prompt file you want to edit:**
```bash
code lightrag/prompts/entity_extraction_system_prompt.md
```
2. **Edit the content** (keep the placeholders intact)
3. **Restart the application:**
```bash
# If running directly:
# press Ctrl+C and start it again
# If using lightrag-server:
pkill lightrag-server
lightrag-server
```
### Docker Deployment
With Docker, prompts are mounted from the host into the container:
1. **Edit the file on the host:**
```bash
nano lightrag/prompts/entity_extraction_system_prompt.md
```
2. **Restart the container:**
```bash
docker-compose restart lightrag
```
**Benefit:** No need to rebuild the Docker image!
For details, see [lightrag/prompts/DOCKER_USAGE.md](../lightrag/prompts/DOCKER_USAGE.md).
## 📝 Prompt Variables
Prompts use placeholders that are substituted at runtime:
### Entity Extraction Prompts
- `{entity_types}` - List of entity types
- `{tuple_delimiter}` - Delimiter between fields (default: `<|#|>`)
- `{completion_delimiter}` - End-of-output signal (default: `<|COMPLETE|>`)
- `{language}` - Output language (English, Vietnamese, etc.)
- `{input_text}` - Text to extract entities from
- `{examples}` - Examples inserted from the example files
### RAG Response Prompts
- `{response_type}` - Response style (paragraphs, bullet points, etc.)
- `{user_prompt}` - Additional instructions from the user
- `{context_data}` - Knowledge graph + document chunks
- `{entities_str}` - JSON entities
- `{relations_str}` - JSON relationships
- `{text_chunks_str}` - Document chunks
- `{reference_list_str}` - Reference documents
### Summary Prompts
- `{description_type}` - Entity or Relation
- `{description_name}` - Name of the entity/relation
- `{description_list}` - JSON list of descriptions
- `{summary_length}` - Max tokens for the summary
- `{language}` - Output language
### Keyword Extraction Prompts
- `{query}` - User query
- `{examples}` - Examples from the example files
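To illustrate how placeholder substitution works, here is a minimal sketch. It assumes the prompts are filled with Python's `str.format`, as the unit test in this guide does. The template text is a shortened, hypothetical stand-in for a real prompt file, and `render_prompt` is an illustrative helper, not a LightRAG function.

```python
# Hypothetical, shortened template; the real prompt files are much longer.
template = (
    "Extract entities of these types: {entity_types}\n"
    "Write the output in {language}.\n"
    "Separate fields with {tuple_delimiter} "
    "and finish with {completion_delimiter}.\n"
    "Text: {input_text}"
)


def render_prompt(template: str, **values: str) -> str:
    """Fill every {placeholder}; raises KeyError if a value is missing."""
    return template.format(**values)


rendered = render_prompt(
    template,
    entity_types="person, organization",
    language="English",
    tuple_delimiter="<|#|>",
    completion_delimiter="<|COMPLETE|>",
    input_text="Test text",
)
print(rendered)
```

Because `str.format` raises `KeyError` for any placeholder without a value, a misspelled placeholder in a customized prompt would typically surface as an error at substitution time rather than pass silently.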
## 💡 Best Practices
### 1. Back up before making changes
```bash
git checkout -b custom-prompts
# ... make changes ...
git commit -am "Customize entity extraction for medical domain"
```
### 2. Keep placeholders intact
❌ **WRONG:**
```
Entity types: organization, person, location
```
✅ **CORRECT:**
```
Entity types: {entity_types}
```
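One way to catch a dropped placeholder automatically is to compare the placeholder sets of the original and the customized prompt. This is a sketch under the assumption that prompts follow Python format-string syntax. `placeholders` and `missing_placeholders` are hypothetical helper names, not part of LightRAG.

```python
from string import Formatter


def placeholders(template: str) -> set[str]:
    """Return the set of {field} names used in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def missing_placeholders(original: str, customized: str) -> set[str]:
    """Return placeholders present in the original but lost in the customized text."""
    return placeholders(original) - placeholders(customized)


original = "Entity types: {entity_types}"
customized = "Entity types: organization, person, location"
print(missing_placeholders(original, customized))
```

An empty result means every placeholder from the original prompt survived the edit.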
### 3. Test incremental changes
- Change one prompt at a time
- Test thoroughly before deploying to production
- Monitor quality metrics
### 4. Document your changes
Add a comment or note in the prompt:
```markdown
---Role---
You are a Knowledge Graph Specialist...
<!-- Custom modification for medical domain - 2024-11-11 -->
<!-- Added specific instructions for medical entity types -->
```
### 5. Version control
```bash
# Tag the prompt version
git tag -a prompts-v1.0 -m "Production prompts version 1.0"
# Roll back if needed
git checkout prompts-v1.0 -- lightrag/prompts/
```
## 🎯 Common Customization Scenarios
### Scenario 1: Adding a New Entity Type
**File:** `entity_extraction_system_prompt.md`
```markdown
entity_type: Categorize the entity using one of the following types:
{entity_types}. If none of the provided entity types apply,
do not add new entity type and classify it as `Other`.
```
Change it to:
```markdown
entity_type: Categorize the entity using one of the following types:
{entity_types}, MEDICAL_TERM, DRUG_NAME, DISEASE.
If none apply, classify as `Other`.
```
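To see what the model would receive after this change, the customized line can be rendered with a sample value. This sketch assumes `{entity_types}` is substituted as a comma-separated string via `str.format`, as in the unit test in this guide. The sample value is illustrative only.

```python
# The customized line from Scenario 1, joined into one string.
customized_line = (
    "entity_type: Categorize the entity using one of the following types: "
    "{entity_types}, MEDICAL_TERM, DRUG_NAME, DISEASE. "
    "If none apply, classify as `Other`."
)

# Sample value; the real list comes from the LightRAG configuration.
rendered = customized_line.format(entity_types="person, organization")
print(rendered)
```

The hard-coded types are appended after whatever `{entity_types}` expands to, so the configured types remain in effect alongside the new ones.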
### Scenario 2: Changing the Response Format
**File:** `rag_response.md`
Find the References section and customize its format:
```markdown
4. References Section Format:
- The References section should be under heading: `### References`
- Reference list entries should adhere to the format: `* [n] Document Title`
```
### Scenario 3: Multi-language Support
**File:** `entity_extraction_system_prompt.md`
Add language-specific instructions:
```markdown
7. Language & Proper Nouns:
- The entire output must be written in `{language}`.
- For Vietnamese: Use diacritics correctly and proper Vietnamese grammar.
- For Chinese: Use simplified or traditional based on context.
```
### Scenario 4: Domain-specific Instructions
Add domain knowledge to the prompts:
```markdown
---Domain Context---
For financial documents:
- Identify financial metrics (revenue, profit, loss, etc.)
- Extract temporal information (quarters, fiscal years)
- Recognize financial entities (stocks, bonds, derivatives)
```
## 🔍 Testing Customized Prompts
### Unit Test
```python
from lightrag.prompt import PROMPTS

# Verify prompt loaded correctly
assert "{entity_types}" in PROMPTS["entity_extraction_system_prompt"]
assert len(PROMPTS["entity_extraction_examples"]) == 3

# Test formatting
formatted = PROMPTS["entity_extraction_system_prompt"].format(
    entity_types="person, organization",
    tuple_delimiter="<|#|>",
    language="English",
    examples="...",
    completion_delimiter="<|COMPLETE|>",
    input_text="Test text",
)
print(formatted)
```
### Integration Test
```python
from lightrag import LightRAG
# Initialize with custom prompts
rag = LightRAG(working_dir="./test_dir")
# Insert test data
rag.insert("Your test document here")
# Query and validate
result = rag.query("Test query", mode="hybrid")
print(result)
```
## 🚨 Troubleshooting
### Prompt fails to load
```python
# Debug prompt loading
from lightrag.prompt import _PROMPT_DIR
print(f"Prompt directory: {_PROMPT_DIR}")
print(f"Directory exists: {_PROMPT_DIR.exists()}")
print(f"Files: {list(_PROMPT_DIR.glob('*.md'))}")
```
### Syntax errors in a prompt
- Check placeholders: `{variable_name}`
- Do not use `{{` or `}}`
- Make sure the file is UTF-8 encoded
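The checks above can be automated with a small script. This is a sketch that assumes prompts use Python format-string syntax. `check_prompt_text` and `check_prompt_file` are hypothetical helpers, not LightRAG functions, and the `allowed` set of placeholder names is something you supply per prompt.

```python
from pathlib import Path
from string import Formatter


def check_prompt_text(text: str, allowed: set[str]) -> list[str]:
    """Return human-readable problems found in a format-style template."""
    problems = []
    # This guide advises against doubled braces, so flag them.
    if "{{" in text or "}}" in text:
        problems.append("doubled braces found")
    try:
        fields = [f for _, f, _, _ in Formatter().parse(text) if f is not None]
    except ValueError as exc:  # e.g. an unmatched '{' or '}'
        problems.append(f"malformed braces: {exc}")
        return problems
    for field in fields:
        if field not in allowed:
            problems.append(f"unknown placeholder: {{{field}}}")
    return problems


def check_prompt_file(path: Path, allowed: set[str]) -> list[str]:
    """Check encoding first, then the placeholders in the file's text."""
    try:
        text = path.read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc}"]
    return check_prompt_text(text, allowed)
```

For example, `check_prompt_text("Types: {entity_typs}", {"entity_types"})` reports the misspelled placeholder, while a correctly spelled one yields an empty list.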
### Performance degradation
- Compare against baseline metrics
- A/B test against the old prompts
- Review prompt complexity
## 📚 Related Documentation
- [lightrag/prompts/README.md](../lightrag/prompts/README.md) - Prompt structure overview
- [lightrag/prompts/DOCKER_USAGE.md](../lightrag/prompts/DOCKER_USAGE.md) - Docker-specific usage
- [Algorithm.md](Algorithm.md) - Understanding how prompts are used in the pipeline
## 🤝 Contributing Custom Prompts
If your prompts improve quality significantly:
1. Fork repository
2. Create feature branch
3. Test thoroughly
4. Document changes
5. Submit PR with:
- Before/after metrics
- Use cases
- Example outputs
## 📞 Support
Have questions about prompt customization?
- Check [lightrag/prompts/README.md](../lightrag/prompts/README.md)
- Open issue on GitHub
- Discuss in community channels