gmakstutis/LightRAG

Hầu Phi Dao c941ac03bd update promts to folder

2025-11-11 21:50:13 +07:00

Prompt Customization Guide

LightRAG lets you customize every prompt used in the system by editing Markdown files.

📁 Prompt Location

All prompts are stored in this directory:

lightrag/prompts/
├── entity_extraction_system_prompt.md
├── entity_extraction_user_prompt.md
├── entity_continue_extraction_user_prompt.md
├── entity_extraction_example_1.md
├── entity_extraction_example_2.md
├── entity_extraction_example_3.md
├── summarize_entity_descriptions.md
├── rag_response.md
├── naive_rag_response.md
├── keywords_extraction.md
├── keywords_extraction_example_1.md
├── keywords_extraction_example_2.md
├── keywords_extraction_example_3.md
├── kg_query_context.md
├── naive_query_context.md
├── fail_response.md
└── README.md
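
Reading one of these files into a string could look like the following. This is a minimal sketch, assuming prompts are plain UTF-8 Markdown files; `load_prompt`, `list_prompts`, and the `prompt_dir` argument are hypothetical names used for illustration, and the loader inside `lightrag.prompt` may work differently.

```python
from pathlib import Path


def load_prompt(prompt_dir: Path, name: str) -> str:
    """Read `<name>.md` from prompt_dir as UTF-8 text (hypothetical helper)."""
    path = prompt_dir / f"{name}.md"
    if not path.is_file():
        raise FileNotFoundError(f"Prompt file not found: {path}")
    return path.read_text(encoding="utf-8")


def list_prompts(prompt_dir: Path) -> list[str]:
    """Return the sorted names (without .md) of all prompt files, skipping README."""
    return sorted(p.stem for p in prompt_dir.glob("*.md") if p.stem != "README")
```

Calling `list_prompts(Path("lightrag/prompts"))` from the repository root is a quick way to confirm that every expected file is present before starting the application.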

🔧 How to Customize

Local Development

Open the prompt file you want to edit:

code lightrag/prompts/entity_extraction_system_prompt.md

Edit the content, keeping the placeholders unchanged.

Restart the application:

# If running directly
# press Ctrl+C and start it again

# If using lightrag-server
pkill lightrag-server
lightrag-server

Docker Deployment

With Docker, prompts are mounted from the host into the container.

Edit the file on the host:

nano lightrag/prompts/entity_extraction_system_prompt.md

Restart the container:
```
docker-compose restart lightrag
```

Benefit: no Docker image rebuild is needed.

For details, see lightrag/prompts/DOCKER_USAGE.md

📝 Prompt Variables

Prompts use placeholders that are replaced at runtime:

Entity Extraction Prompts

{entity_types} - List of entity types
{tuple_delimiter} - Delimiter between fields (default: <|#|>)
{completion_delimiter} - End-of-output signal (default: <|COMPLETE|>)
{language} - Output language (English, Vietnamese, etc.)
{input_text} - Text to extract entities from
{examples} - Examples inserted from the example files
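
To make the substitution concrete, here is a small sketch that fills these placeholders with Python's `str.format`, the same mechanism used in the unit test later in this guide. The `template` string is a short stand-in written for this example; the real prompt files are much longer.

```python
# A tiny stand-in template; the real entity_extraction_*.md files are much longer.
template = (
    "Entity types: {entity_types}\n"
    "Field delimiter: {tuple_delimiter}\n"
    "End marker: {completion_delimiter}\n"
    "Output language: {language}\n"
    "Examples:\n{examples}\n"
    "Text:\n{input_text}\n"
)

# Every placeholder named in the template must be supplied as a keyword argument.
rendered = template.format(
    entity_types="person, organization",
    tuple_delimiter="<|#|>",
    completion_delimiter="<|COMPLETE|>",
    language="English",
    examples="(examples go here)",
    input_text="Alice works at Acme.",
)
print(rendered)
```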

RAG Response Prompts

{response_type} - Response type (paragraphs, bullet points, etc.)
{user_prompt} - Additional instructions from the user
{context_data} - Knowledge graph + document chunks
{entities_str} - JSON entities
{relations_str} - JSON relationships
{text_chunks_str} - Document chunks
{reference_list_str} - Reference documents

Summary Prompts

{description_type} - Entity or Relation
{description_name} - Name of the entity or relation
{description_list} - JSON list of descriptions
{summary_length} - Max tokens for the summary
{language} - Output language

Keyword Extraction Prompts

{query} - User query
{examples} - Examples from the example files

💡 Best Practices

1. Back up before making changes

git checkout -b custom-prompts
# ... make changes ...
git commit -am "Customize entity extraction for medical domain"

2. Keep placeholders unchanged

❌ WRONG:

Entity types: organization, person, location

✅ CORRECT:

Entity types: {entity_types}
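
A check like this can be automated before a customized prompt is deployed. The sketch below uses `string.Formatter` from the standard library to list the placeholders a template contains; the function names and the `required` set are assumptions made for this example and are not part of LightRAG.

```python
from string import Formatter


def placeholders(template: str) -> set[str]:
    """Return the set of {name} fields that str.format would substitute."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def missing_placeholders(template: str, required: set[str]) -> set[str]:
    """Return the required placeholder names that the template no longer contains."""
    return required - placeholders(template)


required = {"entity_types", "language"}
bad = "Entity types: organization, person, location\nLanguage: {language}"
good = "Entity types: {entity_types}\nLanguage: {language}"

print(missing_placeholders(bad, required))   # {'entity_types'}
print(missing_placeholders(good, required))  # set()
```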

3. Test incremental changes

Change one prompt at a time
Test thoroughly before deploying to production
Monitor quality metrics

4. Document your changes

Add a comment or note inside the prompt:

---Role---
You are a Knowledge Graph Specialist...

<!-- Custom modification for medical domain - 2024-11-11 -->
<!-- Added specific instructions for medical entity types -->

5. Version control

# Tag the prompt version
git tag -a prompts-v1.0 -m "Production prompts version 1.0"

# Roll back if needed
git checkout prompts-v1.0 -- lightrag/prompts/

🎯 Common Customization Scenarios

Scenario 1: Adding a New Entity Type

File: entity_extraction_system_prompt.md

entity_type: Categorize the entity using one of the following types: 
{entity_types}. If none of the provided entity types apply, 
do not add new entity type and classify it as `Other`.

Change it to:

entity_type: Categorize the entity using one of the following types: 
{entity_types}, MEDICAL_TERM, DRUG_NAME, DISEASE. 
If none apply, classify as `Other`.
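
To see what the model receives after this change, the modified line can be rendered with sample values. This is a minimal sketch, assuming the prompt is rendered with `str.format` as in the unit test in this guide; the sample `entity_types` value is illustrative only.

```python
# The modified instruction line from the prompt, joined into one string.
line = (
    "entity_type: Categorize the entity using one of the following types: "
    "{entity_types}, MEDICAL_TERM, DRUG_NAME, DISEASE. "
    "If none apply, classify as `Other`."
)

# The configured types come first, followed by the hard-coded additions.
rendered = line.format(entity_types="person, organization")
print(rendered)
```

The hard-coded types are appended to whatever `{entity_types}` expands to, so the configured list still takes effect.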

Scenario 2: Changing the Response Format

File: rag_response.md

Find the section on References and customize its format:

4. References Section Format:
  - The References section should be under heading: `### References`
  - Reference list entries should adhere to the format: `* [n] Document Title`

Scenario 3: Multi-language Support

File: entity_extraction_system_prompt.md

Add instructions for specific languages:

7. Language & Proper Nouns:
  - The entire output must be written in `{language}`.
  - For Vietnamese: Use diacritics correctly and proper Vietnamese grammar.
  - For Chinese: Use simplified or traditional based on context.

Scenario 4: Domain-specific Instructions

Add domain knowledge to the prompts:

---Domain Context---

For financial documents:
- Identify financial metrics (revenue, profit, loss, etc.)
- Extract temporal information (quarters, fiscal years)
- Recognize financial entities (stocks, bonds, derivatives)

🔍 Testing Customized Prompts

Unit Test

from lightrag.prompt import PROMPTS

# Verify prompt loaded correctly
assert "{entity_types}" in PROMPTS["entity_extraction_system_prompt"]
assert len(PROMPTS["entity_extraction_examples"]) == 3

# Test formatting
formatted = PROMPTS["entity_extraction_system_prompt"].format(
    entity_types="person, organization",
    tuple_delimiter="<|#|>",
    language="English",
    examples="...",
    completion_delimiter="<|COMPLETE|>",
    input_text="Test text"
)
print(formatted)

Integration Test

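from lightrag import LightRAG, QueryParam

# Initialize as usual; custom prompts are read from lightrag/prompts/.
# Note: a real setup also needs llm_model_func and embedding_func and,
# depending on the version, storage initialization before the first insert.
rag = LightRAG(working_dir="./test_dir")

# Insert test data
rag.insert("Your test document here")

# Query and validate (the mode is passed through QueryParam)
result = rag.query("Test query", param=QueryParam(mode="hybrid"))
print(result)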

🚨 Troubleshooting

Prompt does not load

# Debug prompt loading
from lightrag.prompt import _PROMPT_DIR
print(f"Prompt directory: {_PROMPT_DIR}")
print(f"Directory exists: {_PROMPT_DIR.exists()}")
print(f"Files: {list(_PROMPT_DIR.glob('*.md'))}")

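Syntax error in a prompt

Check that placeholders are written as {variable_name}
Do not wrap placeholders in {{ or }}; under str.format, doubled braces render as literal braces and the placeholder is not substituted
Make sure the file is UTF-8 encoded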
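
These checks can be scripted. The sketch below assumes prompts are rendered with `str.format`, as in the unit test above; `check_template` and `is_utf8` are hypothetical helpers written for this example.

```python
from typing import Optional


def check_template(template: str, **values: str) -> Optional[str]:
    """Return None if the template formats cleanly, else a short error message."""
    try:
        template.format(**values)
    except KeyError as exc:
        return f"unknown placeholder: {exc}"
    except (ValueError, IndexError) as exc:
        return f"malformed braces: {exc}"
    return None


def is_utf8(data: bytes) -> bool:
    """Return True if the raw file bytes decode as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(check_template("Query: {query}", query="test"))  # None
print(check_template("Query: {qurey}", query="test"))  # unknown placeholder: 'qurey'
print(check_template("Query: {query", query="test"))   # malformed braces: ...

# Doubled braces do not raise: they render as literal braces, so the
# placeholder is silently left unsubstituted.
print("Query: {{query}}".format(query="test"))         # Query: {query}
```

The last case is the easiest to miss because nothing fails; the literal `{query}` simply ends up in the text sent to the model.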

Performance degradation

Compare against baseline metrics
A/B test against the old prompts
Review prompt complexity
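
When reviewing a customized prompt against its baseline, a plain text diff and a size comparison are a cheap first step before running a full A/B test. The sketch below uses only the standard library; `prompt_diff`, `size_report`, and the `baseline/` and `custom/` labels are hypothetical names for this example, and character count is only a rough proxy for prompt complexity.

```python
import difflib


def prompt_diff(old: str, new: str, name: str = "prompt.md") -> str:
    """Return a unified diff between the baseline and the customized prompt text."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"baseline/{name}",
        tofile=f"custom/{name}",
        lineterm="",
    )
    return "\n".join(lines)


def size_report(old: str, new: str) -> dict[str, int]:
    """Compare prompt sizes in characters."""
    return {
        "old_chars": len(old),
        "new_chars": len(new),
        "delta_chars": len(new) - len(old),
    }
```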

Related documentation:

lightrag/prompts/README.md - Prompt structure overview
lightrag/prompts/DOCKER_USAGE.md - Docker-specific usage
Algorithm.md - Understanding how prompts are used in the pipeline

🤝 Contributing Custom Prompts

If your prompts improve quality significantly:

Fork repository
Create feature branch
Test thoroughly
Document changes
Submit PR with:
- Before/after metrics
- Use cases
- Example outputs

📞 Support

Questions about prompt customization?

Check lightrag/prompts/README.md
Open issue on GitHub
Discuss in community channels
