• Add position pool for tqdm bars
• Serialize tqdm creation with lock
• Set leave=False to clear completed bars
• Pass position/lock to eval tasks
• Import tqdm.auto for better display
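A minimal sketch of the position-pool idea (the pool and lock names are illustrative; the actual change passes position/lock into each eval task rather than using module globals): a task checks out a row index under a lock so concurrent bars do not overwrite each other, and `leave=False` clears the bar when the task finishes.

```python
import asyncio
from tqdm.auto import tqdm  # tqdm.auto picks the best renderer (terminal vs notebook)

# Hypothetical position pool: one terminal row per concurrent progress bar.
_position_pool = list(range(8))
_pool_lock = asyncio.Lock()

async def run_with_bar(name: str, steps: int):
    async with _pool_lock:              # serialize bar creation / position checkout
        position = _position_pool.pop(0)
    bar = tqdm(total=steps, desc=name, position=position, leave=False)
    try:
        for _ in range(steps):
            await asyncio.sleep(0.01)   # stand-in for real work
            bar.update(1)
    finally:
        bar.close()                     # leave=False clears the finished bar
        async with _pool_lock:
            _position_pool.append(position)
```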
• Move rag_semaphore to wrap full function
• Increase RAG concurrency to 2x eval limit
• Prevent memory buildup from slow evals
• Keep eval_semaphore for RAGAS control
• Split RAG gen and eval stages
• Add rag_semaphore for stage 1
• Add eval_semaphore for stage 2
• Improve concurrency control
• Update connection pool limits
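A rough sketch of the two-stage limiting described above (the limit value is a placeholder, and `evaluate_answer` is a hypothetical helper; `generate_rag_response` is the eval-script function discussed later in this log): the RAG semaphore wraps the whole per-case function and admits about twice as many tasks as the eval semaphore, so generation never races far ahead of the slower RAGAS stage.

```python
import asyncio

EVAL_CONCURRENCY = 4                                     # placeholder limit
eval_semaphore = asyncio.Semaphore(EVAL_CONCURRENCY)     # stage 2: RAGAS evaluation
rag_semaphore = asyncio.Semaphore(EVAL_CONCURRENCY * 2)  # stage 1: RAG generation, 2x eval limit

async def process_case(case: dict):
    async with rag_semaphore:                            # wraps the full per-case function,
        answer = await generate_rag_response(case["question"])  # so results don't pile up in memory
        async with eval_semaphore:                        # tighter cap keeps RAGAS within rate limits
            return await evaluate_answer(case, answer)
```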
• Add tqdm progress bar for eval steps
• Pass progress bar to RAGAS evaluate
• Ensure progress bar cleanup in finally
• Remove redundant output buffer flushes
- Change default model to gpt-4o-mini
- Add deprecation warning suppression
- Update docs and comments for LightRAG
- Improve output formatting and timing
- Fix RAGAS LLM wrapper compatibility
- Add concurrency control for rate limits
- Add eval env vars for model config
- Improve error handling and logging
- Update documentation with examples
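As a hedged illustration of the env-driven model config (the variable names and fallback embedding model below are hypothetical; check the eval script for the real keys, only the `gpt-4o-mini` default comes from this change):

```python
import os

# Hypothetical env keys; only the gpt-4o-mini default is taken from this commit.
EVAL_MODEL = os.getenv("EVAL_LLM_MODEL", "gpt-4o-mini")
EVAL_EMBEDDING_MODEL = os.getenv("EVAL_EMBEDDING_MODEL", "text-embedding-3-small")
EVAL_MAX_CONCURRENCY = int(os.getenv("EVAL_MAX_CONCURRENCY", "4"))
```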
• Load .env from current directory
• Support LIGHTRAG_API_KEY auth header
• Use override=False so existing env vars take precedence over .env
• Add Bearer token to API requests
• Enable per-instance .env configs
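A small sketch of the .env handling and auth header (the request payload is an assumption; the port matches the default server URL shown later in these docs):

```python
import os

import httpx
from dotenv import load_dotenv

# Load .env from the current working directory; override=False keeps any
# variables already set in the process environment (env takes precedence).
load_dotenv(override=False)

api_key = os.getenv("LIGHTRAG_API_KEY")
headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}

resp = httpx.post(
    "http://localhost:9621/query",
    json={"query": "What is LightRAG?"},
    headers=headers,
)
resp.raise_for_status()
```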
Add 5 markdown documents that users can index to reproduce evaluation results.
Changes:
- Add sample_documents/ folder with 5 markdown files covering LightRAG features
- Update sample_dataset.json with 3 improved, specific test questions
- Shorten and correct evaluation README (remove outdated info about mock responses)
- Add sample_documents reference with expected ~95% RAGAS score
Test Results with sample documents:
- Average RAGAS Score: 95.28%
- Faithfulness: 100%, Answer Relevance: 96.67%
- Context Recall: 88.89%, Context Precision: 95.56%
Changes:
- Move import/config validation from module level into the __init__() method
- Raise proper exceptions (ImportError, ValueError, EnvironmentError) instead of sys.exit()
- Add lazy import for RAGEvaluator in __init__.py using __getattr__
- Update README to clarify sample_dataset.json contains generic test data (not personal)
- Fix README to reflect actual output format (JSON + CSV, not HTML)
- Improve documentation for custom test case creation
Addresses code review feedback about import-time validation and module exports.
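The lazy export can look roughly like this (PEP 562 module-level `__getattr__`; the package path is assumed, and the module name follows the `eval_rag_quality.py` references elsewhere in this log):

```python
# lightrag/evaluation/__init__.py (sketch)
def __getattr__(name: str):
    if name == "RAGEvaluator":
        # Import only when the symbol is actually accessed, so importing the
        # package does not pull in RAGAS dependencies or run validation.
        from .eval_rag_quality import RAGEvaluator
        return RAGEvaluator
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```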
BREAKING CHANGE: content field is now List[str] instead of str
- Add ReferenceItem Pydantic model for type safety
- Update /query and /query/stream to return content as list
- Update OpenAPI schema and examples
- Add migration guide to API README
- Fix RAGAS evaluation to handle list format
Addresses PR #2297 feedback. Tested with RAGAS: 97.37% score.
BREAKING CHANGE: The `content` field in query response references is now
an array of strings instead of a concatenated string. This preserves
individual chunk boundaries when a single file has multiple chunks.
Changes:
- Update QueryResponse Pydantic model to accept List[str] for content
- Modify query_text endpoint to return content as list (query_routes.py:425)
- Modify query_text_stream endpoint to support chunk content enrichment
- Update OpenAPI schema and examples to reflect array structure
- Update API README with breaking change notice and migration guide
- Fix RAGAS evaluation to flatten chunk content lists
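A sketch of the new reference shape plus a client-side migration shim (every field name other than `content` is an assumption):

```python
from typing import List

from pydantic import BaseModel


class ReferenceItem(BaseModel):
    """One reference in a /query response; content keeps each chunk separate."""
    reference_id: str          # assumed field name
    file_path: str             # assumed field name
    content: List[str]         # BREAKING: previously a single concatenated str


def legacy_content(ref: ReferenceItem) -> str:
    """Migration shim for clients that still expect one string per reference."""
    return "\n\n".join(ref.content)
```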
Added comprehensive documentation for the new include_chunk_content parameter
that enables retrieval of actual chunk text content in API responses.
Documentation Updates:
- Added "Include Chunk Content in References" section to API README
- Explained use cases: RAG evaluation, debugging, citations, transparency
- Provided JSON request/response examples
- Clarified parameter interaction with include_references
OpenAPI/Swagger Examples:
- Added "Response with chunk content" example to /query endpoint
- Shows complete reference structure with content field
- Demonstrates realistic chunk text content
This makes the feature discoverable through:
1. API documentation (README.md)
2. Interactive Swagger UI (http://localhost:9621/docs)
3. Code examples for developers
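For reference, a hedged client-side example of the documented flags (the payload keys follow the parameters named above; the endpoint and port match the Swagger URL):

```python
import httpx

resp = httpx.post(
    "http://localhost:9621/query",
    json={
        "query": "How does LightRAG build its knowledge graph?",
        "include_references": True,
        "include_chunk_content": True,   # return the actual chunk text in references
    },
)
for ref in resp.json().get("references", []):
    print(ref.get("content"))            # list of chunk strings (see breaking change above)
```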
**Critical Fix: Contexts vs Ground Truth**
- RAGAS metrics now evaluate actual retrieval performance
- Previously: Used ground_truth as contexts (always perfect scores)
- Now: Uses retrieved documents from LightRAG API (real evaluation)
**Changes to generate_rag_response (lines 100-156)**:
- Remove unused 'context' parameter
- Change return type: Dict[str, str] → Dict[str, Any]
- Extract contexts as list of strings from references[].text
- Return 'contexts' key (list of strings) instead of the JSON-dumped 'context' string
- Add response.raise_for_status() for better error handling
- Add httpx.HTTPStatusError exception handler
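A condensed sketch of the reworked function (the endpoint, payload, and response field names are assumptions; the `references[].text` extraction follows the bullet above):

```python
from typing import Any, Dict

import httpx

LIGHTRAG_URL = "http://localhost:9621/query"  # assumed endpoint


async def generate_rag_response(question: str) -> Dict[str, Any]:
    """Query LightRAG and return the answer plus the actually retrieved contexts."""
    async with httpx.AsyncClient(timeout=120) as client:
        try:
            response = await client.post(LIGHTRAG_URL, json={"query": question})
            response.raise_for_status()                      # surface HTTP errors early
        except httpx.HTTPStatusError as exc:
            return {"answer": "", "contexts": [], "error": str(exc)}
    data = response.json()
    # One string per retrieved reference (references[].text)
    contexts = [ref["text"] for ref in data.get("references", []) if ref.get("text")]
    return {"answer": data.get("response", ""), "contexts": contexts}
```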
**Changes to evaluate_responses (lines 180-191)**:
- Line 183: Extract retrieved_contexts from rag_response
- Line 190: Use [retrieved_contexts] instead of [[ground_truth]]
- Now correctly evaluates retrieval quality rather than ground_truth quality
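The corrected dataset construction, sketched with the older RAGAS column and metric names (`contexts`, `ground_truth`, lowercase metric singletons); the exact names depend on the installed RAGAS version:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, context_precision, context_recall, faithfulness


def build_eval_dataset(test_case: dict, rag_response: dict) -> Dataset:
    """One-row RAGAS dataset: contexts are the retrieved chunks, not the ground truth."""
    return Dataset.from_dict({
        "question": [test_case["question"]],
        "answer": [rag_response["answer"]],
        "contexts": [rag_response["contexts"]],       # retrieved chunks, was [[ground_truth]] before
        "ground_truth": [test_case["ground_truth"]],  # reference answer, used by context_recall
    })

# result = evaluate(build_eval_dataset(case, response),
#                   metrics=[faithfulness, answer_relevancy, context_precision, context_recall])
```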
**Impact on RAGAS Metrics**:
- Context Precision: Now ranks actual retrieved docs by relevance
- Context Recall: Compares ground_truth against actual retrieval
- Faithfulness: Verifies answer based on actual retrieved contexts
- Answer Relevance: Unchanged (question-answer relevance)
Fixes incorrect evaluation methodology. Based on RAGAS documentation:
- contexts = retrieved documents from RAG system
- ground_truth = reference answer for context_recall metric
References:
- https://docs.ragas.io/en/stable/concepts/components/eval_dataset/
- https://docs.ragas.io/en/stable/concepts/metrics/
**Lint Fixes (ruff)**:
- Sort imports alphabetically (I001)
- Add blank line after import traceback (E302)
- Add trailing comma to dict literals (COM812)
- Reformat writer.writerow for readability (E501)
**Rename test_dataset.json → sample_dataset.json**:
- Avoids .gitignore pattern conflict (test_* is ignored)
- More descriptive name - it's a sample/template, not actual test data
- Updated all references in eval_rag_quality.py and README.md
Resolves lint-and-format CI check failure.
Addresses reviewer feedback about test dataset naming.
This contribution adds optional Langfuse support for LLM observability and tracing.
Langfuse provides a drop-in replacement for the OpenAI client that automatically
tracks all LLM interactions without requiring code changes.
Features:
- Optional Langfuse integration with graceful fallback
- Automatic LLM request/response tracing
- Token usage tracking
- Latency metrics
- Error tracking
- Zero code changes required for existing functionality
Implementation:
- Modified lightrag/llm/openai.py to conditionally use Langfuse's AsyncOpenAI
- Falls back to standard OpenAI client if Langfuse is not installed
- Logs observability status on import
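The conditional import is roughly the following (simplified sketch; the real module's logging and client construction differ):

```python
import logging

logger = logging.getLogger(__name__)

try:
    # Langfuse's drop-in wrapper traces every request/response automatically
    # once the LANGFUSE_* environment variables are configured.
    from langfuse.openai import AsyncOpenAI  # type: ignore
    logger.info("Langfuse observability enabled for OpenAI client")
except ImportError:
    from openai import AsyncOpenAI
    logger.info("Langfuse not installed; using standard OpenAI client")
```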
Configuration:
To enable Langfuse tracing, install the observability extras and set environment variables:
```bash
pip install lightrag-hku[observability]
export LANGFUSE_PUBLIC_KEY="your_public_key"
export LANGFUSE_SECRET_KEY="your_secret_key"
export LANGFUSE_HOST="https://cloud.langfuse.com" # or your self-hosted instance
```
If Langfuse is not installed or the environment variables are not set, LightRAG
falls back to the standard OpenAI client with no change in functionality.
Changes:
- Modified lightrag/llm/openai.py (added optional Langfuse import)
- Updated pyproject.toml with optional 'observability' dependencies
Dependencies (optional):
- langfuse>=3.8.1
- Rename _build_llm_context to _build_context_str
- Change text_units_context to chunks_context
- Move string building before early return
- Update log messages and comments
- Consistent variable naming throughout
- Return bool from check_frontend_build()
- Add ⚠️ symbol to outdated versions
- Show tooltip with rebuild message
- Add translations for warning text
- Fix tailwind config filename typo