Added efficient RAG evaluation system with optimized API calls and comprehensive benchmarking. Key Features: - Single API call per evaluation (2x faster than before) - Parallel evaluation based on MAX_ASYNC environment variable - Chunk content enrichment in /query endpoint responses - Comprehensive benchmark statistics (moyennes) - NaN-safe metric calculations API Changes: - Added include_chunk_content parameter to QueryRequest (backward compatible) - /query endpoint enriches references with actual chunk content when requested - No breaking changes - default behavior unchanged Evaluation Improvements: - Parallel execution using asyncio.Semaphore (respects MAX_ASYNC) - Shared HTTP client with connection pooling - Proper timeout handling (3min connect, 5min read) - Debug output for context retrieval verification - Benchmark statistics with averages, min/max scores Results: - Moyenne RAGAS Score: 0.9772 - Perfect Faithfulness: 1.0000 - Perfect Context Recall: 1.0000 - Perfect Context Precision: 1.0000 - Excellent Answer Relevance: 0.9087 |
||
|---|---|---|
| .. | ||
| __init__.py | ||
| document_routes.py | ||
| graph_routes.py | ||
| ollama_api.py | ||
| query_routes.py | ||