Commit graph

3554 commits

Author SHA1 Message Date
Alexander Belikov
fc0a417775 tested on example; fixed schema definition 2025-11-13 16:57:41 +01:00
Alexander Belikov
3b37448d5f removed deprecated methods 2025-11-07 14:24:18 +01:00
Alexander Belikov
dc2898d358 Merge branch 'main' into feature-add-tigergraph-support 2025-11-07 14:18:33 +01:00
Alexander Belikov
af33abee40 updated fetch with db side processing 2025-11-07 14:16:53 +01:00
yangdx
fc40a36968 Add timeout support to Gemini LLM and improve parameter handling
• Add timeout parameter to Gemini client
• Convert timeout seconds to milliseconds
• Update function signatures consistently
• Add Gemini thinking config example
• Clean up parameter documentation
2025-11-07 15:50:14 +08:00
yangdx
3cb4eae492 Add Chain of Thought support to Gemini LLM integration
- Extract thoughts from response parts
- Add COT enable/disable parameter
2025-11-07 15:22:14 +08:00
yangdx
6686edfd35 Update Gemini LLM options: add seed and thinking config, remove MIME type 2025-11-07 14:32:42 +08:00
yangdx
8c27555358 Fix Gemini response parsing to avoid warnings from non-text parts 2025-11-07 04:00:37 +08:00
yangdx
ea141e2779 Fix: Remove redundant entity/relation chunk deletions 2025-11-07 02:56:16 +08:00
yangdx
5bcd2926ca Bump API version to 0251 2025-11-06 21:45:47 +08:00
yangdx
04ed709b34 Optimize entity deletion by batching edge queries to avoid N+1 problem
• Add batch get_nodes_edges_batch call
• Remove individual get_node_edges calls
• Improve query performance
2025-11-06 21:34:47 +08:00
yangdx
3276b7a49d Fix linting 2025-11-06 20:48:51 +08:00
yangdx
155f59759b Fix node ID normalization and improve batch operation consistency
• Remove premature ID normalization
• Add lookup mapping for node resolution
• Filter results by requested nodes only
• Improve error logging with workspace
2025-11-06 20:34:53 +08:00
yangdx
807d2461d3 Remove unused chunk-based node/edge retrieval methods 2025-11-06 18:17:10 +08:00
yangdx
831e658ed8 Update readme 2025-11-06 16:26:07 +08:00
yangdx
6e36ff41e1 Fix linting 2025-11-06 16:01:24 +08:00
yangdx
5f49cee20f Merge branch 'main' into VOXWAVE-FOUNDRY/main 2025-11-06 15:37:35 +08:00
Alexander Belikov
d0734b119a added tigergraph as graph storage 2025-11-05 15:23:35 +01:00
yangdx
9c05706062 Add separate endpoint configuration for LLM and embeddings in evaluation
- Split LLM and embedding API configs
- Add fallback chain for API keys
- Update docs with usage examples
2025-11-05 18:54:38 +08:00
yangdx
994a82dc7f Suppress token usage warnings for custom OpenAI-compatible endpoints
• Add warning filter for token usage
• Support vLLM, SGLang endpoints
• Non-critical for RAGAS evaluation
2025-11-05 18:25:28 +08:00
yangdx
f490622b72 Doc: Refactor evaluation README to improve clarity and structure 2025-11-05 10:43:55 +08:00
yangdx
a73314a4ba Refactor evaluation results display and logging format 2025-11-05 10:08:17 +08:00
yangdx
06b91d00f8 Improve RAG evaluation progress eval index display with zero padding 2025-11-05 09:46:07 +08:00
yangdx
2823f92fb6 Fix tqdm progress bar conflicts in concurrent RAG evaluation
• Add position pool for tqdm bars
• Serialize tqdm creation with lock
• Set leave=False to clear completed bars
• Pass position/lock to eval tasks
• Import tqdm.auto for better display
2025-11-05 02:04:13 +08:00
yangdx
e5abe9dd3d Restructure semaphore control to manage entire evaluation pipeline
• Move rag_semaphore to wrap full function
• Increase RAG concurrency to 2x eval limit
• Prevent memory buildup from slow evals
• Keep eval_semaphore for RAGAS control
2025-11-05 01:07:53 +08:00
yangdx
83715a3ac1 Implement two-stage pipeline for RAG evaluation with separate semaphores
• Split RAG gen and eval stages
• Add rag_semaphore for stage 1
• Add eval_semaphore for stage 2
• Improve concurrency control
• Update connection pool limits
2025-11-05 00:36:09 +08:00
yangdx
d36be1f499 Improve RAGAS evaluation progress tracking and clean up output handling
• Add tqdm progress bar for eval steps
• Pass progress bar to RAGAS evaluate
• Ensure progress bar cleanup in finally
• Remove redundant output buffer flushes
2025-11-05 00:16:02 +08:00
yangdx
c358f405a9 Update evaluation defaults and expand sample dataset
• Lower concurrent evals from 3 to 2
• Standardize project names in samples
• Add 3 new evaluation questions
• Expand ground truth detail coverage
• Improve dataset comprehensiveness
2025-11-04 22:17:17 +08:00
yangdx
41c26a3677 feat: add command-line args to RAG evaluation script
- Add --dataset and --ragendpoint flags
- Support short forms -d and -r
- Update README with usage examples
2025-11-04 21:40:27 +08:00
yangdx
d4b8a229b9 Update RAGAS evaluation to use gpt-4o-mini and improve compatibility
- Change default model to gpt-4o-mini
- Add deprecation warning suppression
- Update docs and comments for LightRAG
- Improve output formatting and timing
2025-11-04 18:50:53 +08:00
yangdx
6d61f70b92 Clean up RAG evaluator logging and remove excessive separator lines
• Remove excessive separator lines
• Add RAGAS concurrency comment
• Fix output buffer timing
2025-11-04 18:04:19 +08:00
yangdx
4e4b8d7e25 Update RAG evaluation metrics to use class instances instead of objects
• Import metric classes not instances
• Instantiate metrics with () syntax
2025-11-04 15:56:57 +08:00
yangdx
7abc687742 Add comprehensive configuration and compatibility fixes for RAGAS
- Fix RAGAS LLM wrapper compatibility
- Add concurrency control for rate limits
- Add eval env vars for model config
- Improve error handling and logging
- Update documentation with examples
2025-11-04 14:39:27 +08:00
yangdx
72db042667 Update .env loading and add API authentication to RAG evaluator
• Load .env from current directory
• Support LIGHTRAG_API_KEY auth header
• Override=False for env precedence
• Add Bearer token to API requests
• Enable per-instance .env configs
2025-11-04 10:59:09 +08:00
anouarbm
ad2d3c2cc0 Merge remote-tracking branch 'origin/main' into feat/ragas-evaluation 2025-11-03 13:48:14 +01:00
anouarbm
debfa0ec96 Merge branch 'feat/ragas-evaluation' of https://github.com/anouar-bm/LightRAG into feat/ragas-evaluation 2025-11-03 13:30:16 +01:00
anouarbm
a172cf893d feat(evaluation): Add sample documents for reproducible RAGAS testing
Add 5 markdown documents that users can index to reproduce evaluation results.

Changes:
- Add sample_documents/ folder with 5 markdown files covering LightRAG features
- Update sample_dataset.json with 3 improved, specific test questions
- Shorten and correct evaluation README (removed outdated info about mock responses)
- Add sample_documents reference with expected ~95% RAGAS score

Test Results with sample documents:
- Average RAGAS Score: 95.28%
- Faithfulness: 100%, Answer Relevance: 96.67%
- Context Recall: 88.89%, Context Precision: 95.56%
2025-11-03 13:28:46 +01:00
yangdx
10f6e6955f Improve Langfuse integration and stream response cleanup handling
• Check env vars before enabling Langfuse
• Move imports after env check logic
• Handle wrapper client aclose() issues
• Add debug logs for cleanup failures
2025-11-03 13:09:45 +08:00
ben moussa anouar
5da709b42a
Merge branch 'main' into feat/ragas-evaluation 2025-11-03 06:01:46 +01:00
anouarbm
36694eb9f2 fix(evaluation): Move import-time validation to runtime and improve documentation
Changes:
- Move sys.exit() calls from module level to __init__() method
- Raise proper exceptions (ImportError, ValueError, EnvironmentError) instead of sys.exit()
- Add lazy import for RAGEvaluator in __init__.py using __getattr__
- Update README to clarify sample_dataset.json contains generic test data (not personal)
- Fix README to reflect actual output format (JSON + CSV, not HTML)
- Improve documentation for custom test case creation

Addresses code review feedback about import-time validation and module exports.
2025-11-03 05:56:38 +01:00
anouarbm
9495778c2d refactor: reorder Langfuse import logic for improved clarity
Moved logger import before Langfuse block to fix NameError.
2025-11-03 05:27:41 +01:00
anouarbm
c9e1c6c1c2 fix(api): change content field to list in query responses
BREAKING CHANGE: content field is now List[str] instead of str

- Add ReferenceItem Pydantic model for type safety
- Update /query and /query/stream to return content as list
- Update OpenAPI schema and examples
- Add migration guide to API README
- Fix RAGAS evaluation to handle list format

Addresses PR #2297 feedback. Tested with RAGAS: 97.37% score.
2025-11-03 04:57:08 +01:00
anouarbm
9d69e8d776 fix(api): Change content field from string to list in query responses
BREAKING CHANGE: The `content` field in query response references is now
an array of strings instead of a concatenated string. This preserves
individual chunk boundaries when a single file has multiple chunks.

Changes:
- Update QueryResponse Pydantic model to accept List[str] for content
- Modify query_text endpoint to return content as list (query_routes.py:425)
- Modify query_text_stream endpoint to support chunk content enrichment
- Update OpenAPI schema and examples to reflect array structure
- Update API README with breaking change notice and migration guide
- Fix RAGAS evaluation to flatten chunk content lists
2025-11-03 04:37:09 +01:00
anouarbm
363f3051b1 eval using open ai 2025-11-02 19:39:56 +01:00
anouarbm
77db08038c Merge remote-tracking branch 'lightrag-fork/feat/ragas-evaluation' into feat/ragas-evaluation 2025-11-02 18:47:40 +01:00
anouarbm
0b5e3f9dc4 Use logger in RAG evaluation and optimize reference content joins 2025-11-02 18:43:53 +01:00
ben moussa anouar
98f0464a31
Update lightrag/evaluation/eval_rag_quality.py for launguage
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-02 18:03:54 +01:00
anouarbm
963ad4c637 docs: Add documentation and examples for include_chunk_content parameter
Added comprehensive documentation for the new include_chunk_content parameter
that enables retrieval of actual chunk text content in API responses.

Documentation Updates:
- Added "Include Chunk Content in References" section to API README
- Explained use cases: RAG evaluation, debugging, citations, transparency
- Provided JSON request/response examples
- Clarified parameter interaction with include_references

OpenAPI/Swagger Examples:
- Added "Response with chunk content" example to /query endpoint
- Shows complete reference structure with content field
- Demonstrates realistic chunk text content

This makes the feature discoverable through:
1. API documentation (README.md)
2. Interactive Swagger UI (http://localhost:9621/docs)
3. Code examples for developers
2025-11-02 17:53:27 +01:00
anouarbm
0bbef9814e Optimize RAGAS evaluation with parallel execution and chunk content enrichment
Added efficient RAG evaluation system with optimized API calls and comprehensive benchmarking.

Key Features:
- Single API call per evaluation (2x faster than before)
- Parallel evaluation based on MAX_ASYNC environment variable
- Chunk content enrichment in /query endpoint responses
- Comprehensive benchmark statistics (moyennes)
- NaN-safe metric calculations

API Changes:
- Added include_chunk_content parameter to QueryRequest (backward compatible)
- /query endpoint enriches references with actual chunk content when requested
- No breaking changes - default behavior unchanged

Evaluation Improvements:
- Parallel execution using asyncio.Semaphore (respects MAX_ASYNC)
- Shared HTTP client with connection pooling
- Proper timeout handling (3min connect, 5min read)
- Debug output for context retrieval verification
- Benchmark statistics with averages, min/max scores

Results:
- Moyenne RAGAS Score: 0.9772
- Perfect Faithfulness: 1.0000
- Perfect Context Recall: 1.0000
- Perfect Context Precision: 1.0000
- Excellent Answer Relevance: 0.9087
2025-11-02 17:39:43 +01:00
anouarbm
026bca00d9 fix: Use actual retrieved contexts for RAGAS evaluation
**Critical Fix: Contexts vs Ground Truth**
- RAGAS metrics now evaluate actual retrieval performance
- Previously: Used ground_truth as contexts (always perfect scores)
- Now: Uses retrieved documents from LightRAG API (real evaluation)

**Changes to generate_rag_response (lines 100-156)**:
- Remove unused 'context' parameter
- Change return type: Dict[str, str] → Dict[str, Any]
- Extract contexts as list of strings from references[].text
- Return 'contexts' key instead of 'context' (JSON dump)
- Add response.raise_for_status() for better error handling
- Add httpx.HTTPStatusError exception handler

**Changes to evaluate_responses (lines 180-191)**:
- Line 183: Extract retrieved_contexts from rag_response
- Line 190: Use [retrieved_contexts] instead of [[ground_truth]]
- Now correctly evaluates: retrieval quality, not ground_truth quality

**Impact on RAGAS Metrics**:
- Context Precision: Now ranks actual retrieved docs by relevance
- Context Recall: Compares ground_truth against actual retrieval
- Faithfulness: Verifies answer based on actual retrieved contexts
- Answer Relevance: Unchanged (question-answer relevance)

Fixes incorrect evaluation methodology. Based on RAGAS documentation:
- contexts = retrieved documents from RAG system
- ground_truth = reference answer for context_recall metric

References:
- https://docs.ragas.io/en/stable/concepts/components/eval_dataset/
- https://docs.ragas.io/en/stable/concepts/metrics/
2025-11-02 16:16:00 +01:00