LightRAG

gmakstutis/LightRAG

Fork 0

Commit graph

Author	SHA1	Message	Date
Claude	9b4831d84e	Add comprehensive RAGAS evaluation framework guide This guide provides a complete introduction to RAGAS (Retrieval-Augmented Generation Assessment): Core Concepts: - What is RAGAS and why it's needed for RAG system evaluation - Automated, quantifiable, and trackable quality assessment Four Key Metrics Explained: 1. Context Precision (0.7-1.0): How relevant are retrieved documents? 2. Context Recall (0.7-1.0): Are all key facts retrieved? 3. Faithfulness (0.7-1.0): Is the answer grounded in context (no hallucination)? 4. Answer Relevancy (0.7-1.0): Does the answer address the question? How It Works: - Uses evaluation LLM to judge answer quality - Workflow: test dataset → run RAG → RAGAS scores → optimization insights - Integrated with LightRAG's existing evaluation module Practical Usage: - Quick start guide for LightRAG users - Real output examples with interpretation - Cost analysis (~$1-2 per 100 questions with GPT-4o-mini) - Optimization strategies based on low-scoring metrics Limitations & Best Practices: - Depends on evaluation LLM quality - Requires high-quality ground truth answers - Recommended hybrid approach: RAGAS (scale) + human review (depth) - Decision matrix for when to use RAGAS vs alternatives Use Cases: ✅ Comparing different configurations/models ✅ A/B testing new features ✅ Continuous performance monitoring ❌ Single component evaluation (use Precision/Recall instead) Helps users understand and effectively use RAGAS for RAG system quality assurance.	2025-11-19 12:52:22 +00:00

Author

SHA1

Message

Date

Claude

9b4831d84e

Add comprehensive RAGAS evaluation framework guide

This guide provides a complete introduction to RAGAS (Retrieval-Augmented Generation Assessment):

**Core Concepts:**
- What is RAGAS and why it's needed for RAG system evaluation
- Automated, quantifiable, and trackable quality assessment

**Four Key Metrics Explained:**
1. Context Precision (0.7-1.0): How relevant are retrieved documents?
2. Context Recall (0.7-1.0): Are all key facts retrieved?
3. Faithfulness (0.7-1.0): Is the answer grounded in context (no hallucination)?
4. Answer Relevancy (0.7-1.0): Does the answer address the question?

**How It Works:**
- Uses evaluation LLM to judge answer quality
- Workflow: test dataset → run RAG → RAGAS scores → optimization insights
- Integrated with LightRAG's existing evaluation module

**Practical Usage:**
- Quick start guide for LightRAG users
- Real output examples with interpretation
- Cost analysis (~$1-2 per 100 questions with GPT-4o-mini)
- Optimization strategies based on low-scoring metrics

**Limitations & Best Practices:**
- Depends on evaluation LLM quality
- Requires high-quality ground truth answers
- Recommended hybrid approach: RAGAS (scale) + human review (depth)
- Decision matrix for when to use RAGAS vs alternatives

**Use Cases:**
✅ Comparing different configurations/models
✅ A/B testing new features
✅ Continuous performance monitoring
❌ Single component evaluation (use Precision/Recall instead)

Helps users understand and effectively use RAGAS for RAG system quality assurance.

2025-11-19 12:52:22 +00:00

1 commit