anouarbm
026bca00d9
fix: Use actual retrieved contexts for RAGAS evaluation
...
**Critical Fix: Contexts vs Ground Truth**
- RAGAS metrics now evaluate actual retrieval performance
- Previously: Used ground_truth as contexts (always perfect scores)
- Now: Uses retrieved documents from LightRAG API (real evaluation)
**Changes to generate_rag_response (lines 100-156)**:
- Remove unused 'context' parameter
- Change return type: Dict[str, str] → Dict[str, Any]
- Extract contexts as list of strings from references[].text
- Return a 'contexts' key (list of strings) instead of 'context' (a JSON dump)
- Add response.raise_for_status() for better error handling
- Add httpx.HTTPStatusError exception handler
**Changes to evaluate_responses (lines 180-191)**:
- Line 183: Extract retrieved_contexts from rag_response
- Line 190: Use [retrieved_contexts] instead of [[ground_truth]]
- Now correctly evaluates retrieval quality rather than ground_truth quality
**Impact on RAGAS Metrics**:
- Context Precision: Now ranks actual retrieved docs by relevance
- Context Recall: Compares ground_truth against actual retrieval
- Faithfulness: Verifies answer based on actual retrieved contexts
- Answer Relevance: Unchanged (question-answer relevance)
Fixes incorrect evaluation methodology. Based on RAGAS documentation:
- contexts = retrieved documents from RAG system
- ground_truth = reference answer for context_recall metric
References:
- https://docs.ragas.io/en/stable/concepts/components/eval_dataset/
- https://docs.ragas.io/en/stable/concepts/metrics/
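The distinction this commit draws can be sketched as follows. This is a minimal illustration, not the code in eval_rag_quality.py: it assumes the API response carries a `references` list whose items have a `text` field (as the message above states), while the helper names `extract_contexts` and `build_eval_row`, the `response` key, and the row's column names are hypothetical.

```python
from typing import Any, Dict, List


def extract_contexts(api_response: Dict[str, Any]) -> List[str]:
    """Collect retrieved passages as a list of strings from references[].text."""
    references = api_response.get("references") or []
    return [
        ref["text"]
        for ref in references
        if isinstance(ref, dict) and ref.get("text")
    ]


def build_eval_row(
    question: str, ground_truth: str, api_response: Dict[str, Any]
) -> Dict[str, Any]:
    """Build one evaluation record (column names are illustrative)."""
    return {
        "question": question,
        "answer": api_response.get("response", ""),
        # Retrieved documents, not the reference answer: passing
        # [ground_truth] here would make context metrics trivially perfect.
        "contexts": extract_contexts(api_response),
        "ground_truth": ground_truth,
    }


# Hypothetical API payload used only for demonstration.
sample_response = {
    "response": "LightRAG combines graph and vector retrieval.",
    "references": [
        {"text": "LightRAG builds a knowledge graph."},
        {"text": "Chunks are stored in a vector DB."},
        {"file_path": "doc.md"},  # no text field, so it is skipped
    ],
}
row = build_eval_row(
    "What is LightRAG?",
    "LightRAG is a graph-based RAG framework.",
    sample_response,
)
```

With this shape, the context metrics see only what the retriever actually returned, so an empty or irrelevant retrieval lowers the score instead of being masked by the reference answer.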
2025-11-02 16:16:00 +01:00
anouarbm
b12b693a81
fix: Apply ruff formatting to CSV path
2025-11-02 11:46:22 +01:00
anouarbm
5cdb4b0ef2
fix: Apply ruff formatting and rename test_dataset to sample_dataset
...
**Lint Fixes (ruff)**:
- Sort imports alphabetically (I001)
- Add blank line after import traceback (E302)
- Add trailing comma to dict literals (COM812)
- Reformat writer.writerow for readability (E501)
**Rename test_dataset.json → sample_dataset.json**:
- Avoids .gitignore pattern conflict (test_* is ignored)
- More descriptive name - it's a sample/template, not actual test data
- Updated all references in eval_rag_quality.py and README.md
Resolves lint-and-format CI check failure.
Addresses reviewer feedback about test dataset naming.
2025-11-02 10:36:03 +01:00
anouarbm
aa916f28d2
docs: add generic test_dataset.json for evaluation examples
...
Test cases with generic examples about:
- LightRAG framework features and capabilities
- RAG system architecture and components
- Vector database support (ChromaDB, Neo4j, Milvus, etc.)
- LLM provider integrations (OpenAI, Anthropic, Ollama, etc.)
- RAG evaluation metrics explanation
- Deployment options (Docker, FastAPI, direct integration)
- Knowledge graph-based retrieval concepts
Changes:
- Added generic test_dataset.json with 8 LightRAG-focused test cases
- File added with git add -f to override the test_* ignore pattern
This provides realistic, reusable examples for users testing their
LightRAG deployments and helps demonstrate the evaluation framework.
2025-11-01 22:27:26 +01:00
anouarbm
1ad0bf82f9
feat: add RAGAS evaluation framework for RAG quality assessment
...
This contribution adds a comprehensive evaluation system using the RAGAS
framework to assess LightRAG's retrieval and generation quality.
Features:
- RAGEvaluator class with four key metrics:
  * Faithfulness: Answer accuracy vs context
  * Answer Relevance: Query-response alignment
  * Context Recall: Retrieval completeness
  * Context Precision: Retrieved context quality
- HTTP API integration for live system testing
- JSON and CSV report generation
- Configurable test datasets
- Complete documentation with examples
- Sample test dataset included
Changes:
- Added lightrag/evaluation/eval_rag_quality.py (RAGAS evaluator implementation)
- Added lightrag/evaluation/README.md (comprehensive documentation)
- Added lightrag/evaluation/__init__.py (package initialization)
- Updated pyproject.toml with optional 'evaluation' dependencies
- Updated .gitignore to exclude evaluation results directory
Installation:
pip install lightrag-hku[evaluation]
Dependencies:
- ragas>=0.3.7
- datasets>=4.3.0
- httpx>=0.28.1
- pytest>=8.4.2
- pytest-asyncio>=1.2.0
2025-11-01 21:36:39 +01:00
Daniel.y
ece0398dfc
Merge pull request #2296 from danielaskdd/pdf-decryption
...
Feat: Add PDF Decryption Support for Password-Protected Files
2025-11-01 15:14:24 +08:00
yangdx
61b57cbb5d
Add PDF decryption support for password-protected files
...
• Add PDF_DECRYPT_PASSWORD env variable
• Check encryption status before reading
• Handle decrypt errors gracefully
• Log detailed error messages
• Support both encrypted/plain PDFs
2025-11-01 15:01:17 +08:00
yangdx
728721b14f
Remove redundant separator lines in gunicorn shutdown handler
2025-11-01 12:53:54 +08:00
yangdx
6d4a55100e
Remove redundant shutdown message from gunicorn
2025-11-01 12:52:22 +08:00
Daniel.y
bc8a8842c5
Merge pull request #2295 from danielaskdd/mix-query-without-kg
...
Fix empty context validation bug and improve naming consistency in query context building
2025-11-01 12:20:16 +08:00
yangdx
ec2ea4fd3f
Rename function and variables for clarity in context building
...
- Rename _build_llm_context to _build_context_str
- Change text_units_context to chunks_context
- Move string building before early return
- Update log messages and comments
- Consistent variable naming throughout
2025-11-01 12:15:24 +08:00
yangdx
9a8742da59
Improve entity merge logging by removing redundant message and fixing typo
2025-10-31 17:16:59 +08:00
yangdx
6b4514c8ef
Reduce logging verbosity in entity merge relation processing
2025-10-31 17:02:10 +08:00
yangdx
2496d87148
Add data/ directory to .gitignore
2025-10-31 14:51:53 +08:00
yangdx
7ccc1fdd27
Add frontend rebuild warning indicator to version display
...
- Return bool from check_frontend_build()
- Add ⚠️ symbol to outdated versions
- Show tooltip with rebuild message
- Add translations for warning text
- Fix tailwind config filename typo
2025-10-31 06:09:46 +08:00
yangdx
e5414c61ef
Bump core version to 1.4.9.8 and API version to 0250
2025-10-31 05:23:48 +08:00
Daniel.y
08b0283b04
Merge pull request #2291 from danielaskdd/reload-popular-labels
...
Refact: Auto-refresh of Popular Labels When Pipeline Completes
2025-10-31 05:20:54 +08:00
yangdx
58c83f9da5
Add auto-refresh of popular labels when pipeline completes
...
• Monitor pipeline busy->idle transitions
• Reload labels on dropdown open if needed
• Add onBeforeOpen callback to AsyncSelect
• Clear refresh flags after processing
• Improve label sync with backend state
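The refresh logic described above can be modeled in a few lines. The commit's actual change lives in the web UI; the Python below only models the state handling, and the class name `LabelRefreshTracker` and its methods are hypothetical.

```python
from typing import Callable


class LabelRefreshTracker:
    """Flag a refresh when the pipeline goes busy -> idle, consume it on open."""

    def __init__(self) -> None:
        self._was_busy = False
        self.needs_refresh = False

    def observe(self, busy: bool) -> None:
        # Only the busy -> idle edge marks the labels as stale.
        if self._was_busy and not busy:
            self.needs_refresh = True
        self._was_busy = busy

    def on_before_open(self, reload: Callable[[], None]) -> bool:
        """Called when the dropdown opens; reloads only if flagged."""
        if not self.needs_refresh:
            return False
        reload()
        self.needs_refresh = False  # clear the flag after processing
        return True


reload_calls = []
tracker = LabelRefreshTracker()
for state in (False, True, True, False):  # idle, busy, busy, idle
    tracker.observe(state)
first_open = tracker.on_before_open(lambda: reload_calls.append("reload"))
second_open = tracker.on_before_open(lambda: reload_calls.append("reload"))
```

Deferring the reload until the dropdown opens avoids fetching labels that nobody looks at, while the cleared flag prevents repeated reloads for the same pipeline run.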
2025-10-31 04:45:35 +08:00
Daniel.y
94cdbe77c5
Merge pull request #2290 from danielaskdd/delete-residual-edges
...
Fix: Clean Residual Edges from VDB During Entity Deletion
2025-10-31 02:44:23 +08:00
yangdx
afb5e5c1cb
Fix edge cleanup when deleting entities to prevent orphaned relationships
...
- Track edges to delete in set
- Clean VDB before node deletion
- Remove from relation chunks storage
- Prevent orphaned relationship data
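The ordering described above (collect the affected edges in a set, clean the vector store and relation chunk storage, then delete the node) can be sketched with in-memory stand-ins. All names here (`delete_entity`, the dict-based `relation_vdb` and `relation_chunks`) are hypothetical and stand in for the real storage backends.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def delete_entity(
    entity: str,
    nodes: Set[str],
    graph_edges: Set[Edge],
    relation_vdb: Dict[Edge, List[float]],
    relation_chunks: Dict[Edge, List[str]],
) -> Set[Edge]:
    """Remove an entity and every trace of its relationships."""
    # 1. Track the edges to delete while the node still exists in the graph.
    edges_to_delete = {edge for edge in graph_edges if entity in edge}

    # 2. Clean derived stores first, so no vector or chunk entry is left
    #    pointing at a relationship that no longer exists.
    for edge in edges_to_delete:
        relation_vdb.pop(edge, None)
        relation_chunks.pop(edge, None)

    # 3. Only then remove the edges and the node itself.
    graph_edges -= edges_to_delete
    nodes.discard(entity)
    return edges_to_delete


nodes = {"A", "B", "C", "D"}
graph_edges = {("A", "B"), ("B", "C"), ("C", "D")}
relation_vdb = {edge: [0.0] for edge in graph_edges}
relation_chunks = {edge: ["chunk-1"] for edge in graph_edges}

removed = delete_entity("B", nodes, graph_edges, relation_vdb, relation_chunks)
```

Collecting the edges before touching the node matters because, once the node is gone, a graph store may no longer be able to enumerate the edges that were attached to it.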
2025-10-31 02:36:15 +08:00
Daniel.y
3b48cf1643
Merge pull request #2289 from danielaskdd/fix-pycrptodome-missing
...
Fix: Add PyCryptodome dependency for encrypted PDF processing
2025-10-31 01:52:58 +08:00
yangdx
c46c1b26a9
Add pycryptodome dependency for PDF encryption support
2025-10-31 01:49:42 +08:00
Daniel.y
bda52a8773
Merge pull request #2287 from danielaskdd/fix-ui
...
Refact: Enhance Property editing UI for KG Nodes
2025-10-31 00:23:39 +08:00
yangdx
71b27ec4aa
Optimize property edit dialog to use trimmed value consistently
2025-10-31 00:08:02 +08:00
yangdx
4cbd876126
feat: Update node color and legend after entity_type changed
...
- Move color constants to utils module
- Extract resolveNodeColor function
- Update node colors on type changes
- Simplify hook color logic
2025-10-31 00:03:55 +08:00
yangdx
79a17c3f7f
Fix graph value handling for entity_id updates
...
• Use finalValue for entity_id changes
• Keep original value for other props
• Fix property update logic
2025-10-30 23:43:46 +08:00
yangdx
c36afecba4
Remove redundant await call in file extraction pipeline
2025-10-30 20:35:41 +08:00
yangdx
c9e73bb450
Bump core version to 1.4.9.7 and API version to 0249
2025-10-30 19:43:35 +08:00
yangdx
042cbad047
Merge branch 'qdrant-multi-tenancy'
2025-10-30 19:32:25 +08:00
yangdx
5f4a280458
Add Qdrant legacy collection migration with workspace support
...
- Add QdrantMigrationError exception
- Implement automatic data migration
- Support workspace-based partitioning
- Add migration verification logic
- Update collection naming scheme
2025-10-30 19:16:33 +08:00
yangdx
0498e80a42
Merge branch 'main' into qdrant-multi-tenancy
2025-10-30 14:11:00 +08:00
yangdx
78ccc4f6fd
Refactor .gitignore
2025-10-30 12:56:40 +08:00
yangdx
783e2f3b1f
Update uv.lock
2025-10-30 11:18:10 +08:00
yangdx
f610fdaf9b
Merge branch 'main' into Anush008/main
2025-10-30 11:07:39 +08:00
Daniel.y
8145201d2e
Merge pull request #2284 from danielaskdd/fix-static-missiing
...
HotFix: Include static files in package distribution
2025-10-30 10:52:53 +08:00
yangdx
16d3d82a0e
Include static files in package distribution
...
- Add static dir to MANIFEST.in
- Update package data config
- Ensure static assets are bundled
- Fix missing static file issue
2025-10-30 10:50:28 +08:00
yangdx
8af8bd80d2
docs: add frontend build steps to server installation guide
2025-10-29 21:54:47 +08:00
yangdx
0fa2fc9cab
Refactor systemd service config to use environment variables
...
• Add LIGHTRAG_HOME environment variable
• Use .venv instead of venv directory
2025-10-29 20:14:17 +08:00
yangdx
6dc027cb75
Merge branch 'fix-exit-handler'
2025-10-29 19:15:24 +08:00
Daniel.y
a1cf01dcc1
Merge pull request #2280 from danielaskdd/fix-exit-handler
...
Refact: Graceful shutdown and signal handling in Gunicorn Mode
2025-10-29 19:14:46 +08:00
Daniel.y
c5ad9982d9
Merge pull request #2281 from danielaskdd/restore-query-example
...
Restore query generation example and fix README path reference
2025-10-29 19:12:53 +08:00
yangdx
14a015d4ad
Restore query generation example and fix README path reference
...
• Fix path from example/ to examples/
• Add generate_query.py implementation
2025-10-29 19:11:40 +08:00
yangdx
3a7f753560
Bump core version to 1.4.9.6 and API version to 0248
2025-10-29 19:08:32 +08:00
yangdx
d5bcd14c6f
Refactor service deployment to use direct process execution
...
- Remove bash wrapper script
- Update systemd service configuration
- Improve process management for gunicorn
- Simplify shared storage cleanup logic
- Update documentation for deployment
2025-10-29 18:55:47 +08:00
yangdx
6489aaa7f0
Remove worker_exit hook and improve cleanup logging
...
• Remove unreliable worker_exit function
• Add debug logs for cleanup modes
• Move DEBUG_LOCKS to top of file
2025-10-29 15:15:13 +08:00
yangdx
4a46d39c93
Replace GUNICORN_CMD_ARGS with custom LIGHTRAG_GUNICORN_MODE flag
...
• Use custom env var for mode detection
• Improve Gunicorn mode reliability
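Mode detection through a dedicated environment variable could look like the sketch below. Only the variable name `LIGHTRAG_GUNICORN_MODE` comes from the commit; the function name `is_gunicorn_mode` and the set of accepted truthy values are assumptions.

```python
import os
from typing import Mapping, Optional

# Assumed truthy spellings; the real project may accept a different set.
_TRUTHY = {"1", "true", "yes"}


def is_gunicorn_mode(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the custom flag marks this process as Gunicorn-managed."""
    env = os.environ if environ is None else environ
    return env.get("LIGHTRAG_GUNICORN_MODE", "").strip().lower() in _TRUTHY


# GUNICORN_CMD_ARGS alone no longer decides the mode in this sketch.
legacy_only = is_gunicorn_mode({"GUNICORN_CMD_ARGS": "--workers 2"})
flag_set = is_gunicorn_mode({"LIGHTRAG_GUNICORN_MODE": "1"})
```

A dedicated flag set by the launcher is independent of how Gunicorn itself was configured, whereas `GUNICORN_CMD_ARGS` is only present when the user chose to pass arguments that way.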
2025-10-29 14:06:03 +08:00
yangdx
816feefd84
Fix cleanup coordination between Gunicorn and UvicornWorker lifecycles
...
• Document UvicornWorker hook limitations
• Add GUNICORN_CMD_ARGS cleanup guard
• Prevent double cleanup in workers
2025-10-29 13:53:46 +08:00
yangdx
72b29659c9
Fix worker process cleanup to prevent shared resource conflicts
...
• Add worker_exit hook in gunicorn config
• Add shutdown_manager parameter in finalize_share_data of share_storage
• Prevent Manager shutdown in workers
• Remove custom signal handlers
2025-10-29 13:33:21 +08:00
yangdx
0692175c7b
Remove enable_logging parameter from get_data_init_lock call in MilvusVectorDBStorage
2025-10-29 09:49:59 +08:00
Daniel.y
ec797276b2
Merge pull request #2279 from danielaskdd/fix-edge-merge-stage
...
Fix Entity Source IDs Tracking Problem During Relationship Processing
2025-10-29 02:34:09 +08:00