ragflow/api
hsparks-codes 237a66913b
Feat: RAG evaluation (#11674)
### What problem does this PR solve?

Feature: This PR implements a comprehensive RAG evaluation framework to
address issue #11656.

**Problem**: Developers using RAGFlow have no systematic way to measure
RAG accuracy and quality. They cannot objectively answer:
1. Are RAG results truly accurate?
2. How should configurations be adjusted to improve quality?
3. How can RAG performance be maintained and improved over time?

**Solution**: This PR adds a complete evaluation system with:
- **Dataset & test case management** - Create ground truth datasets with
questions and expected answers
- **Automated evaluation** - Run the RAG pipeline on test cases and
compute metrics
- **Comprehensive metrics** - Precision, recall, F1 score, mean
reciprocal rank (MRR), and hit rate for retrieval quality
- **Smart recommendations** - Analyze results and suggest specific
configuration improvements (e.g., "increase top_k", "enable reranking")
- **20+ REST API endpoints** - Full CRUD operations for datasets, test
cases, and evaluation runs
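
The retrieval metrics listed above can be illustrated with a minimal sketch. This is not the code added by this PR: the function names `retrieval_metrics` and `aggregate`, the chunk IDs, and the choice to score each test case on de-duplicated ranked chunk IDs are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Set


def retrieval_metrics(retrieved: Sequence[str], relevant: Set[str]) -> Dict[str, float]:
    """Score one test case (hypothetical helper, not RAGFlow's API).

    retrieved: chunk IDs returned by retrieval, best match first.
    relevant:  ground-truth chunk IDs for the question.
    """
    # Drop duplicate IDs while preserving rank order.
    ranked = list(dict.fromkeys(retrieved))
    hits = sum(1 for cid in ranked if cid in relevant)

    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    # Reciprocal rank of the first relevant chunk (0 if none was retrieved).
    reciprocal_rank = 0.0
    for rank, cid in enumerate(ranked, start=1):
        if cid in relevant:
            reciprocal_rank = 1.0 / rank
            break

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "reciprocal_rank": reciprocal_rank,
        "hit": 1.0 if hits else 0.0,
    }


def aggregate(per_case: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-case scores; MRR and hit rate are means over test cases."""
    if not per_case:
        return {}
    n = len(per_case)
    return {
        "precision": sum(m["precision"] for m in per_case) / n,
        "recall": sum(m["recall"] for m in per_case) / n,
        "f1": sum(m["f1"] for m in per_case) / n,
        "mrr": sum(m["reciprocal_rank"] for m in per_case) / n,
        "hit_rate": sum(m["hit"] for m in per_case) / n,
    }


if __name__ == "__main__":
    cases = [
        retrieval_metrics(["c2", "c7", "c1"], {"c1", "c3"}),
        retrieval_metrics(["a"], {"b"}),
    ]
    print(aggregate(cases))
```

In this sketch, a low aggregate recall would be the kind of signal that could motivate a recommendation such as "increase top_k", while a low MRR alongside a reasonable hit rate could point toward "enable reranking". The actual rules used by the PR are not shown here.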

**Impact**: Enables developers to objectively measure RAG quality,
identify issues, and systematically improve their RAG systems through
data-driven configuration tuning.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-03 17:00:58 +08:00
| Name | Last commit | Date |
|---|---|---|
| apps | Feat: RAG evaluation (#11674) | 2025-12-03 17:00:58 +08:00 |
| common | Feat:admin api (#10642) | 2025-10-18 16:09:48 +08:00 |
| db | Feat: RAG evaluation (#11674) | 2025-12-03 17:00:58 +08:00 |
| utils | Refa: make RAGFlow more asynchronous (#11601) | 2025-12-01 14:24:06 +08:00 |
| __init__.py | Fix: incorrect async chat streamly output (#11679) | 2025-12-03 11:15:45 +08:00 |
| constants.py | Introduce common/constants.py (#10965) | 2025-11-03 16:32:37 +08:00 |
| ragflow_server.py | Refa: make RAGFlow more asynchronous (#11601) | 2025-12-01 14:24:06 +08:00 |
| settings.py | Move api.settings to common.settings (#11036) | 2025-11-06 09:36:38 +08:00 |
| validation.py | Fix errors detected by Ruff (#3918) | 2024-12-08 14:21:12 +08:00 |