LightRAG

Author	SHA1	Message	Date
anouarbm	026bca00d9	fix: Use actual retrieved contexts for RAGAS evaluation Critical Fix: Contexts vs Ground Truth - RAGAS metrics now evaluate actual retrieval performance - Previously: Used ground_truth as contexts (always perfect scores) - Now: Uses retrieved documents from LightRAG API (real evaluation) Changes to generate_rag_response (lines 100-156): - Remove unused 'context' parameter - Change return type: Dict[str, str] → Dict[str, Any] - Extract contexts as list of strings from references[].text - Return 'contexts' key instead of 'context' (JSON dump) - Add response.raise_for_status() for better error handling - Add httpx.HTTPStatusError exception handler Changes to evaluate_responses (lines 180-191): - Line 183: Extract retrieved_contexts from rag_response - Line 190: Use [retrieved_contexts] instead of [[ground_truth]] - Now correctly evaluates: retrieval quality, not ground_truth quality Impact on RAGAS Metrics: - Context Precision: Now ranks actual retrieved docs by relevance - Context Recall: Compares ground_truth against actual retrieval - Faithfulness: Verifies answer based on actual retrieved contexts - Answer Relevance: Unchanged (question-answer relevance) Fixes incorrect evaluation methodology. Based on RAGAS documentation: - contexts = retrieved documents from RAG system - ground_truth = reference answer for context_recall metric References: - https://docs.ragas.io/en/stable/concepts/components/eval_dataset/ - https://docs.ragas.io/en/stable/concepts/metrics/	2025-11-02 16:16:00 +01:00
anouarbm	b12b693a81	fixed ruff format of csv path	2025-11-02 11:46:22 +01:00
anouarbm	5cdb4b0ef2	fix: Apply ruff formatting and rename test_dataset to sample_dataset Lint Fixes (ruff): - Sort imports alphabetically (I001) - Add blank line after import traceback (E302) - Add trailing comma to dict literals (COM812) - Reformat writer.writerow for readability (E501) Rename test_dataset.json → sample_dataset.json: - Avoids .gitignore pattern conflict (test_* is ignored) - More descriptive name - it's a sample/template, not actual test data - Updated all references in eval_rag_quality.py and README.md Resolves lint-and-format CI check failure. Addresses reviewer feedback about test dataset naming.	2025-11-02 10:36:03 +01:00
anouarbm	aa916f28d2	docs: add generic test_dataset.json for evaluation examples Test cases with generic examples about: - LightRAG framework features and capabilities - RAG system architecture and components - Vector database support (ChromaDB, Neo4j, Milvus, etc.) - LLM provider integrations (OpenAI, Anthropic, Ollama, etc.) - RAG evaluation metrics explanation - Deployment options (Docker, FastAPI, direct integration) - Knowledge graph-based retrieval concepts Changes: - Added generic test_dataset.json with 8 LightRAG-focused test cases - File added with git add -f to override test_* pattern This provides realistic, reusable examples for users testing their LightRAG deployments and helps demonstrate the evaluation framework.	2025-11-01 22:27:26 +01:00
anouarbm	1ad0bf82f9	feat: add RAGAS evaluation framework for RAG quality assessment This contribution adds a comprehensive evaluation system using the RAGAS framework to assess LightRAG's retrieval and generation quality. Features: - RAGEvaluator class with four key metrics: * Faithfulness: Answer accuracy vs context * Answer Relevance: Query-response alignment * Context Recall: Retrieval completeness * Context Precision: Retrieved context quality - HTTP API integration for live system testing - JSON and CSV report generation - Configurable test datasets - Complete documentation with examples - Sample test dataset included Changes: - Added lightrag/evaluation/eval_rag_quality.py (RAGAS evaluator implementation) - Added lightrag/evaluation/README.md (comprehensive documentation) - Added lightrag/evaluation/__init__.py (package initialization) - Updated pyproject.toml with optional 'evaluation' dependencies - Updated .gitignore to exclude evaluation results directory Installation: pip install lightrag-hku[evaluation] Dependencies: - ragas>=0.3.7 - datasets>=4.3.0 - httpx>=0.28.1 - pytest>=8.4.2 - pytest-asyncio>=1.2.0	2025-11-01 21:36:39 +01:00
yangdx	61b57cbb5d	Add PDF decryption support for password-protected files • Add PDF_DECRYPT_PASSWORD env variable • Check encryption status before reading • Handle decrypt errors gracefully • Log detailed error messages • Support both encrypted/plain PDFs	2025-11-01 15:01:17 +08:00
yangdx	728721b14f	Remove redundant separator lines in gunicorn shutdown handler	2025-11-01 12:53:54 +08:00
yangdx	6d4a55100e	Remove redundant shutdown message from gunicorn	2025-11-01 12:52:22 +08:00
yangdx	ec2ea4fd3f	Rename function and variables for clarity in context building - Rename _build_llm_context to _build_context_str - Change text_units_context to chunks_context - Move string building before early return - Update log messages and comments - Consistent variable naming throughout	2025-11-01 12:15:24 +08:00
yangdx	9a8742da59	Improve entity merge logging by removing redundant message and fixing typo	2025-10-31 17:16:59 +08:00
yangdx	6b4514c8ef	Reduce logging verbosity in entity merge relation processing	2025-10-31 17:02:10 +08:00
yangdx	7ccc1fdd27	Add frontend rebuild warning indicator to version display - Return bool from check_frontend_build() - Add ⚠️ symbol to outdated versions - Show tooltip with rebuild message - Add translations for warning text - Fix tailwind config filename typo	2025-10-31 06:09:46 +08:00
yangdx	e5414c61ef	Bump core version to 1.4.9.8 and API version to 0250	2025-10-31 05:23:48 +08:00
yangdx	afb5e5c1cb	Fix edge cleanup when deleting entities to prevent orphaned relationships - Track edges to delete in set - Clean VDB before node deletion - Remove from relation chunks storage - Prevent orphaned relationship data	2025-10-31 02:36:15 +08:00
yangdx	c46c1b26a9	Add pycryptodome dependency for PDF encryption support	2025-10-31 01:49:42 +08:00
yangdx	c36afecba4	Remove redundant await call in file extraction pipeline	2025-10-30 20:35:41 +08:00
yangdx	c9e73bb450	Bump core version to 1.4.9.7 and API version to 0249	2025-10-30 19:43:35 +08:00
yangdx	5f4a280458	Add Qdrant legacy collection migration with workspace support - Add QdrantMigrationError exception - Implement automatic data migration - Support workspace-based partitioning - Add migration verification logic - Update collection naming scheme	2025-10-30 19:16:33 +08:00
yangdx	f610fdaf9b	Merge branch 'main' into Anush008/main	2025-10-30 11:07:39 +08:00
yangdx	3a7f753560	Bump core version to 1.4.9.6 and API version to 0248	2025-10-29 19:08:32 +08:00
yangdx	d5bcd14c6f	Refactor service deployment to use direct process execution - Remove bash wrapper script - Update systemd service configuration - Improve process management for gunicorn - Simplify shared storage cleanup logic - Update documentation for deployment	2025-10-29 18:55:47 +08:00
yangdx	6489aaa7f0	Remove worker_exit hook and improve cleanup logging • Remove unreliable worker_exit function • Add debug logs for cleanup modes • Move DEBUG_LOCKS to top of file	2025-10-29 15:15:13 +08:00
yangdx	4a46d39c93	Replace GUNICORN_CMD_ARGS with custom LIGHTRAG_GUNICORN_MODE flag • Use custom env var for mode detection • Improve Gunicorn mode reliability	2025-10-29 14:06:03 +08:00
yangdx	816feefd84	Fix cleanup coordination between Gunicorn and UvicornWorker lifecycles • Document UvicornWorker hook limitations • Add GUNICORN_CMD_ARGS cleanup guard • Prevent double cleanup in workers	2025-10-29 13:53:46 +08:00
yangdx	72b29659c9	Fix worker process cleanup to prevent shared resource conflicts • Add worker_exit hook in gunicorn config • Add shutdown_manager parameter in finalize_share_data of share_storage • Prevent Manager shutdown in workers • Remove custom signal handlers	2025-10-29 13:33:21 +08:00
yangdx	0692175c7b	Remove enable_logging parameter from get_data_init_lock call in MilvusVectorDBStorage	2025-10-29 09:49:59 +08:00
yangdx	da2e9efd11	Bump API version to 0247	2025-10-29 01:39:55 +08:00
yangdx	3fa79026e0	Fix Entity Source IDs Tracking Problem - Handle existing node updates properly in edge merging stage - Fix source_ids merging logic - Reorder entity deletion and optimize node operations - Delete relationships before entities - Add edge existence debugging logs	2025-10-29 01:19:55 +08:00
yangdx	29c4a91dc3	Move relationship ID sorting to before vector DB operations • Remove verbose entity rebuild logging • Sort IDs before vector DB updates • Keep graph storage with original order	2025-10-28 19:13:48 +08:00
yangdx	c81a56a113	Fix entity and relationship deletion when no chunk references remain	2025-10-28 16:02:35 +08:00
yangdx	88d12beae2	Add offline Swagger UI support with custom static file serving - Disable default docs URL - Add custom /docs endpoint - Mount static Swagger UI files - Include OAuth2 redirect handler - Support offline documentation access	2025-10-28 02:23:08 +08:00
yangdx	ea006bd386	Fix entity update logic to handle renaming operations - Add is_renaming condition check - Ensure updates when entity renamed	2025-10-28 00:12:23 +08:00
yangdx	5155edd8d2	feat: Improve entity merge and edit UX - API: The `graph/entity/edit` endpoint now returns a detailed `operation_summary` for better client-side handling of update, rename, and merge outcomes. - Web UI: Added an "auto-merge on rename" option. The UI now gracefully handles merge success, partial failures (update OK, merge fail), and other errors with specific user feedback.	2025-10-27 23:42:08 +08:00
yangdx	97034f06e3	Add allow_merge parameter to entity update API endpoint	2025-10-27 14:30:27 +08:00
yangdx	11a1631d76	Refactor entity edit and merge functions to support merge-on-rename • Extract internal implementation helpers • Add allow_merge parameter to aedit_entity • Support merging when renaming to existing name • Improve code reusability and modularity • Maintain backward compatibility	2025-10-27 14:23:51 +08:00
yangdx	411e92e6b9	Fix vector deletion logging to show actual deleted count	2025-10-27 14:22:16 +08:00
yangdx	94f24a66f2	Bump API version to 0246	2025-10-27 12:28:46 +08:00
yangdx	8dfd3bf428	Replace global graph DB lock with fine-grained keyed locking • Use entity/relation-specific locks • Lock multiple entities when needed	2025-10-27 02:55:58 +08:00
yangdx	2c09adb8d3	Add chunk tracking support to entity merge functionality - Pass chunk storages to merge function - Merge relation chunk tracking data - Merge entity chunk tracking data - Delete old chunk tracking records - Persist chunk storage updates	2025-10-27 02:06:21 +08:00
yangdx	a25003c336	Fix relation deduplication logic and standardize log message prefixes	2025-10-27 00:52:56 +08:00
yangdx	ab32456a79	Refactor entity merging with unified attribute merge function • Update GRAPH_FIELD_SEP comment clarity • Deprecate merge_strategy parameter • Unify entity/relation merge logic • Add join_unique_comma strategy	2025-10-27 00:04:17 +08:00
yangdx	38559373b3	Fix entity merging to include target entity relationships * Include target entity in collection * Merge all relevant relationships * Prevent relationship loss * Fix merge completeness	2025-10-26 23:13:50 +08:00
yangdx	6015e8bc68	Refactor graph utils to use unified persistence callback - Add _persist_graph_updates function - Remove duplicate callback functions	2025-10-26 20:20:16 +08:00
yangdx	a3370b024d	Add chunk tracking cleanup to entity/relation deletion and creation • Clean up chunk storage on delete • Track chunks in create operations • Normalize relation keys consistently	2025-10-26 17:06:16 +08:00
yangdx	bf1897a67e	Normalize entity order for undirected graph consistency • Normalize entity pairs for storage • Update API docs for undirected edges	2025-10-26 15:53:31 +08:00
yangdx	3fbd704bf9	Enhance entity/relation editing with chunk tracking synchronization • Add chunk storage sync to edit ops • Implement incremental chunk ID updates • Support entity renaming migrations • Normalize relation keys consistently • Preserve chunk references on edits	2025-10-26 14:34:56 +08:00
Anush008	8584980e3a	refactor: Qdrant Multi-tenancy (Include staged) Signed-off-by: Anush008 <anushshetty90@gmail.com>	2025-10-26 09:58:24 +05:30
yangdx	29bf593663	Fix entity and relation chunk cleanup in deletion pipeline • Delete from entity_chunks storage • Delete from relation_chunks storage	2025-10-25 22:32:27 +08:00
yangdx	5ee9a2f8c6	Fix entity consistency in knowledge graph rebuilding and merging • Sort src/tgt for consistent ordering • Create missing nodes before edges • Update entity chunks storage • Pass entity_vdb to rebuild function • Ensure entities exist in all storages	2025-10-25 21:37:03 +08:00
yangdx	a97e5dad4c	Optimize PostgreSQL graph queries to avoid Cypher overhead and complexity • Replace Cypher with native SQL queries • Fix O(N²) to O(E) performance issue • Add error handling for parse failures • Use direct table access pattern • Eliminate Cartesian product joins	2025-10-25 14:37:18 +08:00


3503 commits
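
Note on commit 026bca00d9: the message describes taking contexts from `references[].text` in the API response instead of reusing `ground_truth`. The following is a minimal sketch of that idea, not the code from `eval_rag_quality.py`. The helper names (`extract_contexts`, `build_eval_row`), the `response` key, and the row column names are assumptions made for illustration; only `references[].text`, `contexts`, and `ground_truth` come from the commit message.

```python
from typing import Any, Dict, List


def extract_contexts(rag_response: Dict[str, Any]) -> List[str]:
    """Return the text of each retrieved reference, skipping empty entries.

    Assumes the response carries a 'references' list of dicts with a
    'text' field, as described in the commit message.
    """
    references = rag_response.get("references") or []
    return [
        ref["text"]
        for ref in references
        if isinstance(ref, dict) and ref.get("text")
    ]


def build_eval_row(
    question: str,
    rag_response: Dict[str, Any],
    ground_truth: str,
) -> Dict[str, Any]:
    """Assemble one evaluation record (hypothetical column names).

    'contexts' holds what the system actually retrieved; 'ground_truth'
    stays separate as the reference answer, so retrieval metrics are not
    scored against the reference itself.
    """
    return {
        "question": question,
        "answer": rag_response.get("response", ""),  # 'response' key is assumed
        "contexts": extract_contexts(rag_response),
        "ground_truth": ground_truth,
    }


if __name__ == "__main__":
    sample_response = {
        "response": "LightRAG combines graph and vector retrieval.",
        "references": [
            {"text": "LightRAG builds a knowledge graph from documents."},
            {"text": ""},  # empty reference text is dropped
            {"text": "Vector search retrieves relevant chunks."},
        ],
    }
    row = build_eval_row(
        "What does LightRAG combine?",
        sample_response,
        "Graph-based and vector-based retrieval.",
    )
    print(row["contexts"])
```

With the earlier behaviour described in the commit, `contexts` would have been `[ground_truth]`, which makes context metrics trivially perfect. Keeping the two fields separate is what lets the scores reflect real retrieval.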