LightRAG

Author	SHA1	Message	Date
Alexander Belikov	fc0a417775	tested on example; fixed schema definition	2025-11-13 16:57:41 +01:00
Alexander Belikov	3b37448d5f	removed deprecated methods	2025-11-07 14:24:18 +01:00
Alexander Belikov	dc2898d358	Merge branch 'main' into feature-add-tigergraph-support	2025-11-07 14:18:33 +01:00
Alexander Belikov	af33abee40	updated fetch with db side processing	2025-11-07 14:16:53 +01:00
yangdx	fc40a36968	Add timeout support to Gemini LLM and improve parameter handling • Add timeout parameter to Gemini client • Convert timeout seconds to milliseconds • Update function signatures consistently • Add Gemini thinking config example • Clean up parameter documentation	2025-11-07 15:50:14 +08:00
yangdx	3cb4eae492	Add Chain of Thought support to Gemini LLM integration - Extract thoughts from response parts - Add COT enable/disable parameter	2025-11-07 15:22:14 +08:00
yangdx	6686edfd35	Update Gemini LLM options: add seed and thinking config, remove MIME type	2025-11-07 14:32:42 +08:00
yangdx	8c27555358	Fix Gemini response parsing to avoid warnings from non-text parts	2025-11-07 04:00:37 +08:00
yangdx	ea141e2779	Fix: Remove redundant entity/relation chunk deletions	2025-11-07 02:56:16 +08:00
yangdx	5bcd2926ca	Bump API version to 0251	2025-11-06 21:45:47 +08:00
yangdx	04ed709b34	Optimize entity deletion by batching edge queries to avoid N+1 problem • Add batch get_nodes_edges_batch call • Remove individual get_node_edges calls • Improve query performance	2025-11-06 21:34:47 +08:00
yangdx	3276b7a49d	Fix linting	2025-11-06 20:48:51 +08:00
yangdx	155f59759b	Fix node ID normalization and improve batch operation consistency • Remove premature ID normalization • Add lookup mapping for node resolution • Filter results by requested nodes only • Improve error logging with workspace	2025-11-06 20:34:53 +08:00
yangdx	807d2461d3	Remove unused chunk-based node/edge retrieval methods	2025-11-06 18:17:10 +08:00
yangdx	831e658ed8	Update readme	2025-11-06 16:26:07 +08:00
yangdx	6e36ff41e1	Fix linting	2025-11-06 16:01:24 +08:00
yangdx	5f49cee20f	Merge branch 'main' into VOXWAVE-FOUNDRY/main	2025-11-06 15:37:35 +08:00
Alexander Belikov	d0734b119a	added tigergraph as graph storage	2025-11-05 15:23:35 +01:00
yangdx	9c05706062	Add separate endpoint configuration for LLM and embeddings in evaluation - Split LLM and embedding API configs - Add fallback chain for API keys - Update docs with usage examples	2025-11-05 18:54:38 +08:00
yangdx	994a82dc7f	Suppress token usage warnings for custom OpenAI-compatible endpoints • Add warning filter for token usage • Support vLLM, SGLang endpoints • Non-critical for RAGAS evaluation	2025-11-05 18:25:28 +08:00
yangdx	f490622b72	Doc: Refactor evaluation README to improve clarity and structure	2025-11-05 10:43:55 +08:00
yangdx	a73314a4ba	Refactor evaluation results display and logging format	2025-11-05 10:08:17 +08:00
yangdx	06b91d00f8	Improve RAG evaluation progress eval index display with zero padding	2025-11-05 09:46:07 +08:00
yangdx	2823f92fb6	Fix tqdm progress bar conflicts in concurrent RAG evaluation • Add position pool for tqdm bars • Serialize tqdm creation with lock • Set leave=False to clear completed bars • Pass position/lock to eval tasks • Import tqdm.auto for better display	2025-11-05 02:04:13 +08:00
yangdx	e5abe9dd3d	Restructure semaphore control to manage entire evaluation pipeline • Move rag_semaphore to wrap full function • Increase RAG concurrency to 2x eval limit • Prevent memory buildup from slow evals • Keep eval_semaphore for RAGAS control	2025-11-05 01:07:53 +08:00
yangdx	83715a3ac1	Implement two-stage pipeline for RAG evaluation with separate semaphores • Split RAG gen and eval stages • Add rag_semaphore for stage 1 • Add eval_semaphore for stage 2 • Improve concurrency control • Update connection pool limits	2025-11-05 00:36:09 +08:00
yangdx	d36be1f499	Improve RAGAS evaluation progress tracking and clean up output handling • Add tqdm progress bar for eval steps • Pass progress bar to RAGAS evaluate • Ensure progress bar cleanup in finally • Remove redundant output buffer flushes	2025-11-05 00:16:02 +08:00
yangdx	c358f405a9	Update evaluation defaults and expand sample dataset • Lower concurrent evals from 3 to 2 • Standardize project names in samples • Add 3 new evaluation questions • Expand ground truth detail coverage • Improve dataset comprehensiveness	2025-11-04 22:17:17 +08:00
yangdx	41c26a3677	feat: add command-line args to RAG evaluation script - Add --dataset and --ragendpoint flags - Support short forms -d and -r - Update README with usage examples	2025-11-04 21:40:27 +08:00
yangdx	d4b8a229b9	Update RAGAS evaluation to use gpt-4o-mini and improve compatibility - Change default model to gpt-4o-mini - Add deprecation warning suppression - Update docs and comments for LightRAG - Improve output formatting and timing	2025-11-04 18:50:53 +08:00
yangdx	6d61f70b92	Clean up RAG evaluator logging and remove excessive separator lines • Remove excessive separator lines • Add RAGAS concurrency comment • Fix output buffer timing	2025-11-04 18:04:19 +08:00
yangdx	4e4b8d7e25	Update RAG evaluation metrics to use class instances instead of objects • Import metric classes not instances • Instantiate metrics with () syntax	2025-11-04 15:56:57 +08:00
yangdx	7abc687742	Add comprehensive configuration and compatibility fixes for RAGAS - Fix RAGAS LLM wrapper compatibility - Add concurrency control for rate limits - Add eval env vars for model config - Improve error handling and logging - Update documentation with examples	2025-11-04 14:39:27 +08:00
yangdx	72db042667	Update .env loading and add API authentication to RAG evaluator • Load .env from current directory • Support LIGHTRAG_API_KEY auth header • Override=False for env precedence • Add Bearer token to API requests • Enable per-instance .env configs	2025-11-04 10:59:09 +08:00
anouarbm	ad2d3c2cc0	Merge remote-tracking branch 'origin/main' into feat/ragas-evaluation	2025-11-03 13:48:14 +01:00
anouarbm	debfa0ec96	Merge branch 'feat/ragas-evaluation' of https://github.com/anouar-bm/LightRAG into feat/ragas-evaluation	2025-11-03 13:30:16 +01:00
anouarbm	a172cf893d	feat(evaluation): Add sample documents for reproducible RAGAS testing Add 5 markdown documents that users can index to reproduce evaluation results. Changes: - Add sample_documents/ folder with 5 markdown files covering LightRAG features - Update sample_dataset.json with 3 improved, specific test questions - Shorten and correct evaluation README (removed outdated info about mock responses) - Add sample_documents reference with expected ~95% RAGAS score Test Results with sample documents: - Average RAGAS Score: 95.28% - Faithfulness: 100%, Answer Relevance: 96.67% - Context Recall: 88.89%, Context Precision: 95.56%	2025-11-03 13:28:46 +01:00
yangdx	10f6e6955f	Improve Langfuse integration and stream response cleanup handling • Check env vars before enabling Langfuse • Move imports after env check logic • Handle wrapper client aclose() issues • Add debug logs for cleanup failures	2025-11-03 13:09:45 +08:00
ben moussa anouar	5da709b42a	Merge branch 'main' into feat/ragas-evaluation	2025-11-03 06:01:46 +01:00
anouarbm	36694eb9f2	fix(evaluation): Move import-time validation to runtime and improve documentation Changes: - Move sys.exit() calls from module level to __init__() method - Raise proper exceptions (ImportError, ValueError, EnvironmentError) instead of sys.exit() - Add lazy import for RAGEvaluator in __init__.py using __getattr__ - Update README to clarify sample_dataset.json contains generic test data (not personal) - Fix README to reflect actual output format (JSON + CSV, not HTML) - Improve documentation for custom test case creation Addresses code review feedback about import-time validation and module exports.	2025-11-03 05:56:38 +01:00
anouarbm	9495778c2d	refactor: reorder Langfuse import logic for improved clarity Moved logger import before Langfuse block to fix NameError.	2025-11-03 05:27:41 +01:00
anouarbm	c9e1c6c1c2	fix(api): change content field to list in query responses BREAKING CHANGE: content field is now List[str] instead of str - Add ReferenceItem Pydantic model for type safety - Update /query and /query/stream to return content as list - Update OpenAPI schema and examples - Add migration guide to API README - Fix RAGAS evaluation to handle list format Addresses PR #2297 feedback. Tested with RAGAS: 97.37% score.	2025-11-03 04:57:08 +01:00
anouarbm	9d69e8d776	fix(api): Change content field from string to list in query responses BREAKING CHANGE: The `content` field in query response references is now an array of strings instead of a concatenated string. This preserves individual chunk boundaries when a single file has multiple chunks. Changes: - Update QueryResponse Pydantic model to accept List[str] for content - Modify query_text endpoint to return content as list (query_routes.py:425) - Modify query_text_stream endpoint to support chunk content enrichment - Update OpenAPI schema and examples to reflect array structure - Update API README with breaking change notice and migration guide - Fix RAGAS evaluation to flatten chunk content lists	2025-11-03 04:37:09 +01:00
anouarbm	363f3051b1	eval using open ai	2025-11-02 19:39:56 +01:00
anouarbm	77db08038c	Merge remote-tracking branch 'lightrag-fork/feat/ragas-evaluation' into feat/ragas-evaluation	2025-11-02 18:47:40 +01:00
anouarbm	0b5e3f9dc4	Use logger in RAG evaluation and optimize reference content joins	2025-11-02 18:43:53 +01:00
ben moussa anouar	98f0464a31	Update lightrag/evaluation/eval_rag_quality.py for language Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-02 18:03:54 +01:00
anouarbm	963ad4c637	docs: Add documentation and examples for include_chunk_content parameter Added comprehensive documentation for the new include_chunk_content parameter that enables retrieval of actual chunk text content in API responses. Documentation Updates: - Added "Include Chunk Content in References" section to API README - Explained use cases: RAG evaluation, debugging, citations, transparency - Provided JSON request/response examples - Clarified parameter interaction with include_references OpenAPI/Swagger Examples: - Added "Response with chunk content" example to /query endpoint - Shows complete reference structure with content field - Demonstrates realistic chunk text content This makes the feature discoverable through: 1. API documentation (README.md) 2. Interactive Swagger UI (http://localhost:9621/docs) 3. Code examples for developers	2025-11-02 17:53:27 +01:00
anouarbm	0bbef9814e	Optimize RAGAS evaluation with parallel execution and chunk content enrichment Added efficient RAG evaluation system with optimized API calls and comprehensive benchmarking. Key Features: - Single API call per evaluation (2x faster than before) - Parallel evaluation based on MAX_ASYNC environment variable - Chunk content enrichment in /query endpoint responses - Comprehensive benchmark statistics (averages) - NaN-safe metric calculations API Changes: - Added include_chunk_content parameter to QueryRequest (backward compatible) - /query endpoint enriches references with actual chunk content when requested - No breaking changes - default behavior unchanged Evaluation Improvements: - Parallel execution using asyncio.Semaphore (respects MAX_ASYNC) - Shared HTTP client with connection pooling - Proper timeout handling (3min connect, 5min read) - Debug output for context retrieval verification - Benchmark statistics with averages, min/max scores Results: - Average RAGAS Score: 0.9772 - Perfect Faithfulness: 1.0000 - Perfect Context Recall: 1.0000 - Perfect Context Precision: 1.0000 - Excellent Answer Relevance: 0.9087	2025-11-02 17:39:43 +01:00
anouarbm	026bca00d9	fix: Use actual retrieved contexts for RAGAS evaluation Critical Fix: Contexts vs Ground Truth - RAGAS metrics now evaluate actual retrieval performance - Previously: Used ground_truth as contexts (always perfect scores) - Now: Uses retrieved documents from LightRAG API (real evaluation) Changes to generate_rag_response (lines 100-156): - Remove unused 'context' parameter - Change return type: Dict[str, str] → Dict[str, Any] - Extract contexts as list of strings from references[].text - Return 'contexts' key instead of 'context' (JSON dump) - Add response.raise_for_status() for better error handling - Add httpx.HTTPStatusError exception handler Changes to evaluate_responses (lines 180-191): - Line 183: Extract retrieved_contexts from rag_response - Line 190: Use [retrieved_contexts] instead of [[ground_truth]] - Now correctly evaluates: retrieval quality, not ground_truth quality Impact on RAGAS Metrics: - Context Precision: Now ranks actual retrieved docs by relevance - Context Recall: Compares ground_truth against actual retrieval - Faithfulness: Verifies answer based on actual retrieved contexts - Answer Relevance: Unchanged (question-answer relevance) Fixes incorrect evaluation methodology. Based on RAGAS documentation: - contexts = retrieved documents from RAG system - ground_truth = reference answer for context_recall metric References: - https://docs.ragas.io/en/stable/concepts/components/eval_dataset/ - https://docs.ragas.io/en/stable/concepts/metrics/	2025-11-02 16:16:00 +01:00
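
Several of the commits above (026bca00d9, 9d69e8d776, c9e1c6c1c2) describe the same idea: the evaluator should score the contexts actually retrieved by the API, and the `content` field of each reference may be a list of chunk strings rather than a single string. A minimal sketch of that flattening step follows. The function names `extract_contexts` and `build_eval_record`, and the response keys `response`, `references`, and `content`, are assumptions made for illustration and are not taken from the repository code.

```python
from typing import Any, Dict, List


def extract_contexts(rag_response: Dict[str, Any]) -> List[str]:
    """Flatten reference content into a flat list of context strings.

    Hypothetical helper: accepts either a single string or a list of
    chunk strings in each reference's "content" field.
    """
    contexts: List[str] = []
    for ref in rag_response.get("references") or []:
        content = ref.get("content")
        if isinstance(content, str):
            # Older shape: one concatenated string per reference.
            if content:
                contexts.append(content)
        elif isinstance(content, list):
            # Newer shape: one string per chunk; keep chunk boundaries.
            contexts.extend(c for c in content if isinstance(c, str) and c)
    return contexts


def build_eval_record(
    question: str, ground_truth: str, rag_response: Dict[str, Any]
) -> Dict[str, Any]:
    """Build one evaluation row from retrieved contexts, not ground truth."""
    return {
        "question": question,
        "answer": rag_response.get("response", ""),
        # Retrieved contexts come from the API response, so retrieval
        # quality is what gets scored.
        "contexts": extract_contexts(rag_response),
        # Ground truth stays a separate reference answer.
        "ground_truth": ground_truth,
    }
```

Keeping `ground_truth` out of `contexts` is the point of the fix described in 026bca00d9: if the reference answer is passed as the context, context-based metrics are trivially perfect and say nothing about retrieval.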

3554 commits