LightRAG

Author	SHA1	Message	Date
clssck	ef7327bb3e	chore(docker-compose, lightrag): optimize test infrastructure and add evaluation tools Add comprehensive E2E testing infrastructure with PostgreSQL performance tuning, Gunicorn multi-worker support, and evaluation scripts for RAGAS-based quality assessment. Introduces 4 new evaluation utilities: compare_results.py for A/B test analysis, download_wikipedia.py for reproducible test datasets, e2e_test_harness.py for automated evaluation pipelines, and ingest_test_docs.py for batch document ingestion. Updates docker-compose.test.yml with aggressive async settings, memory limits, and optimized chunking parameters. Parallelize entity summarization in operate.py for improved extraction performance. Fix typos in merge node/edge logs.	2025-11-29 10:39:20 +01:00
clssck	d2c9e6e2ec	test(lightrag): add orphan connection feature with quality validation tests Implement automatic orphan entity connection system that identifies entities with no relationships and creates meaningful connections via vector similarity + LLM validation. This improves knowledge graph connectivity and retrieval quality. Changes: - Add orphan connection configuration parameters (thresholds, cross-connect settings) - Implement aconnect_orphan_entities() method with 4-step validation pipeline - Add SQL templates for efficient orphan and candidate entity queries - Create POST /graph/orphans/connect API endpoint with configurable parameters - Add orphan connection validation prompt for LLM-based relationship verification - Include relationship density requirement in extraction prompts to prevent orphans - Update docker-compose.test.yml with optimized extraction parameters - Add quality validation test suite (run_quality_tests.py) for retrieval evaluation - Add unit test framework (test_orphan_connection_quality.py) with test cases - Enable auto-run of orphan connection after document processing	2025-11-28 18:23:30 +01:00
clssck	90825e823a	remove inherited workflows, keep only docker-publish	2025-11-28 09:10:38 +00:00
clssck	3b250fd0d0	simplify docker workflow to manual trigger only	2025-11-28 08:43:36 +00:00
clssck	b6074b9a81	chore(lightrag, lightrag_webui): improve code quality and security - Extract PostgreSQL storage check into named variable for clarity - Move APIRouter initialization into create_table_routes function scope - Add robust type handling for database query results - Add input validation for table names and pagination parameters - Add regex-based SQL injection prevention for table name sanitization - Improve clipboard copy fallback logic and error handling - Add memoization for JSON serialization to prevent unnecessary recalculations - Hide meta column from table explorer UI display - Sort table columns alphabetically for consistent ordering - Add keyboard accessibility to status filter buttons - Add preprocessed status filter to document manager - Update @tanstack/react-query from 5.60.0 to 5.87.1 - Extract dev storage config into constant to reduce duplication - Update documentation comments for clarity	2025-11-27 21:39:42 +01:00
clssck	a9edadef45	feat: add Table Explorer feature with dynamic table data fetching and schema display - Implemented Table Explorer component to allow users to select and view database tables. - Added API calls for fetching table list, schema, and paginated data. - Introduced row detail modal for displaying and copying row data. - Enhanced DataTable component to support row click events. - Updated UI components for better user experience and accessibility. - Added mock data for development mode to facilitate testing. - Updated localization files to include new terms related to tables. - Modified settings store to include storage configuration for conditional UI rendering. - Improved styling and layout for various components to align with new design standards.	2025-11-27 18:27:14 +01:00
clssck	48c7732edc	feat: add automatic entity resolution with 3-layer matching Implement automatic entity resolution to prevent duplicate nodes in the knowledge graph. The system uses a 3-layer approach: 1. Case-insensitive exact matching (free, instant) 2. Fuzzy string matching >85% threshold (free, instant) 3. Vector similarity + LLM verification (for acronyms/synonyms) Key features: - Pre-resolution phase prevents race conditions in parallel processing - Numeric suffix detection blocks false matches (IL-4 ≠ IL-13) - PostgreSQL alias cache for fast lookups on subsequent ingestion - Configurable thresholds via environment variables Bug fixes included: - Fix fuzzy matching false positives for numbered entities - Fix alias cache not being populated (missing db parameter) - Skip entity_aliases table from generic id index creation New files: - lightrag/entity_resolution/ - Core resolution module - tests/test_entity_resolution/ - Unit tests - docker/postgres-age-vector/ - Custom PG image with pgvector + AGE - docker-compose.test.yml - Integration test environment Configuration (env.example): - ENTITY_RESOLUTION_ENABLED=true - ENTITY_RESOLUTION_FUZZY_THRESHOLD=0.85 - ENTITY_RESOLUTION_VECTOR_THRESHOLD=0.5 - ENTITY_RESOLUTION_MAX_CANDIDATES=3	2025-11-27 15:35:02 +01:00
yangdx	4f12fe121d	Change entity extraction logging from warning to info level • Reduce log noise for empty entities	2025-11-27 11:00:34 +08:00
yangdx	93d445dfdd	Add pipeline status lock function for legacy compatibility - Add get_pipeline_status_lock function - Return NamespaceLock for consistency - Support workspace parameter - Enable logging option - Legacy code compatibility	2025-11-25 18:24:39 +08:00
Daniel.y	d2cd1c0722	Merge pull request #2421 from EightyOliveira/fix_catch_order fix:exception handling order error	2025-11-25 17:52:56 +08:00
yangdx	777c91794b	Add Langfuse observability configuration to env.example - Add Langfuse environment variables - Include setup instructions - Support OpenAI compatible APIs - Enable tracing configuration - Add cloud/self-host options	2025-11-25 17:16:55 +08:00
EightyOliveira	8994c70f2f	fix:exception handling order error	2025-11-25 16:36:41 +08:00
Daniel.y	2539b4e2c8	Merge pull request #2418 from danielaskdd/start-without-webui Refact: Allow API Server to Start Without Built WebUI Assets	2025-11-25 03:02:15 +08:00
yangdx	48b67d3077	Handle missing WebUI assets gracefully without blocking server startup - Change build check from error to warning - Redirect to /docs when WebUI unavailable - Add webui_available to health endpoint - Only mount /webui if assets exist - Return status tuple from build check	2025-11-25 02:51:55 +08:00
Daniel.y	2832a2ca7e	Merge pull request #2417 from danielaskdd/neo4j-retry Fix: Add Comprehensive Retry Mechanism for Neo4j Storage Operations	2025-11-25 02:03:48 +08:00
yangdx	5f91063c7a	Add ruff as dependency to pytest and evaluation extras	2025-11-25 02:03:28 +08:00
yangdx	8c4d7a00ad	Refactor: Extract retry decorator to reduce code duplication in Neo4J storage • Define READ_RETRY_EXCEPTIONS constant • Create reusable READ_RETRY decorator • Replace 11 duplicate retry decorators • Improve code maintainability • Add missing retry to edge_degrees_batch	2025-11-25 01:35:21 +08:00
Daniel.y	5b81ef000e	Merge pull request #2410 from netbrah/create-copilot-setup-steps feat: create copilot-setup-steps.yml	2025-11-24 22:36:33 +08:00
yangdx	7aaa51cda9	Add retry decorators to Neo4j read operations for resilience	2025-11-24 22:28:15 +08:00
palanisd	c233da6318	Update copilot-setup-steps.yml	2025-11-23 17:42:04 -05:00
palanisd	1b0413ee74	Create copilot-setup-steps.yml	2025-11-22 15:29:05 -05:00
chaohuang-ai	16eb0d5bee	Merge pull request #2409 from HKUDS/chaohuang-ai-patch-3 Update README.md	2025-11-23 00:54:04 +08:00
chaohuang-ai	37178462ab	Update README.md	2025-11-23 00:53:39 +08:00
chaohuang-ai	6d3bfe46d0	Merge pull request #2408 from HKUDS/chaohuang-ai-patch-2 Update README.md	2025-11-23 00:50:16 +08:00
chaohuang-ai	babbcb566b	Update README.md	2025-11-23 00:48:52 +08:00
yangdx	5f53de8866	Fix Azure configuration examples and correct typos in env.example	2025-11-22 09:05:52 +08:00
yangdx	fa6797f246	Update env.example	2025-11-22 00:32:12 +08:00
yangdx	49fb11e205	Update Azure OpenAI configuration examples	2025-11-22 00:19:23 +08:00
yangdx	7b76211066	Add fallback to AZURE_OPENAI_API_VERSION for embedding API version	2025-11-22 00:14:35 +08:00
yangdx	ffd8da512e	Improve Azure OpenAI compatibility and error handling • Reduce log noise for Azure content filters • Add default API version fallback • Change warning to debug log level • Handle empty choices in streaming • Better Azure OpenAI integration	2025-11-21 23:51:18 +08:00
yangdx	fafa1791f4	Fix Azure OpenAI model parameter to use deployment name consistently - Use deployment name for Azure API calls - Fix model param in embed function - Consistent api_model logic - Prevent Azure model name conflicts	2025-11-21 23:41:52 +08:00
Daniel.y	021b637dc3	Merge pull request #2403 from danielaskdd/azure-cot-handling Refact: Consolidate Azure OpenAI and OpenAI implementations	2025-11-21 19:36:12 +08:00
yangdx	ac9f2574a5	Improve Azure OpenAI wrapper functions with full parameter support • Add missing parameters to wrappers • Update docstrings for clarity • Ensure API consistency • Fix parameter forwarding • Maintain backward compatibility	2025-11-21 19:24:32 +08:00
yangdx	45f4f82392	Refactor Azure OpenAI client creation to support client_configs merging - Handle None client_configs case - Merge configs with explicit params - Override client_configs with params - Use dict unpacking for client init - Maintain parameter precedence	2025-11-21 19:14:16 +08:00
yangdx	0c4cba3860	Fix double decoration in azure_openai_embed and document decorator usage • Remove redundant @retry decorator • Call openai_embed.func directly • Add detailed decorator documentation • Prevent double parameter injection • Fix EmbeddingFunc wrapping issues	2025-11-21 18:03:53 +08:00
yangdx	b46c152306	Fix linting	2025-11-21 17:16:44 +08:00
yangdx	b709f8f869	Consolidate Azure OpenAI implementation into main OpenAI module • Unified OpenAI/Azure client creation • Azure module now re-exports functions • Backward compatibility maintained • Reduced code duplication	2025-11-21 17:12:33 +08:00
yangdx	66d6c7dd6f	Refactor main function to provide sync CLI entry point	2025-11-21 13:11:55 +08:00
Daniel.y	8777895efc	Merge pull request #2401 from danielaskdd/fix-openai-keyword-extraction Refactor: Centralize keyword_extraction parameter handling in OpenAI LLM implementations	2025-11-21 13:08:15 +08:00
yangdx	1e477e95ef	Add lightrag-clean-llmqc console script entry point - Add clean_llm_query_cache tool - New console script for cache cleanup - Extend CLI tool availability	2025-11-21 12:59:49 +08:00
yangdx	02fdceb959	Update OpenAI client to use stable API and bump minimum version to 2.0.0 - Remove beta prefix from completions.parse - Update OpenAI dependency to >=2.0.0 - Fix whitespace formatting - Update all requirement files - Clean up pyproject.toml dependencies	2025-11-21 12:55:44 +08:00
yangdx	9f69c5bf85	feat: Support structured output `parsed` from OpenAI Added support for structured output (JSON mode) from the OpenAI API in `openai.py` and `azure_openai.py`. When `response_format` is used to request structured data, the new logic checks for the `message.parsed` attribute. If it exists, it's serialized into a JSON string as the final content. If not, the code falls back to the existing `message.content` handling, ensuring backward compatibility.	2025-11-21 12:46:31 +08:00
yangdx	c9e1c86e81	Refactor keyword extraction handling to centralize response format logic • Move response format to core function • Remove duplicate format assignments • Standardize keyword extraction flow • Clean up redundant parameter handling • Improve Azure OpenAI compatibility	2025-11-21 12:10:04 +08:00
yangdx	46ce6d9a13	Fix Azure OpenAI embedding model parameter fallback - Use model param if provided - Fall back to deployment name - Fix embedding API call - Improve parameter handling	2025-11-20 18:20:22 +08:00
Daniel.y	cc78e2df10	Merge pull request #2395 from Amrit75/issue-2394 issue-2394: use deployment variable instead of model for embeddings API call	2025-11-20 18:10:49 +08:00
Amritpal Singh	30e86fa331	use deployment variable which extracted value from .env file or have default value	2025-11-20 09:00:27 +00:00
yangdx	ecea93992a	Fix lingting	2025-11-20 13:03:31 +08:00
yangdx	1d2f534f3d	Fix linting	2025-11-20 13:02:25 +08:00
yangdx	72ece7343a	Remove obsolete config file and paging design doc	2025-11-20 13:00:13 +08:00
yangdx	1e415cff95	Update postgreSQL docker image link	2025-11-20 12:34:49 +08:00
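
Note on commit 48c7732edc: the first two layers of the entity resolution it describes (case-insensitive exact match, then fuzzy match above the 0.85 threshold, with the numeric-suffix guard so that IL-4 does not match IL-13) can be sketched as below. This is a minimal illustration only, not the code in lightrag/entity_resolution/. The function names are hypothetical, `difflib.SequenceMatcher` is a stand-in for whichever fuzzy matcher the project uses, and layer 3 (vector similarity + LLM verification) is omitted.

```python
import re
from difflib import SequenceMatcher

# Assumed to mirror ENTITY_RESOLUTION_FUZZY_THRESHOLD=0.85 from env.example.
FUZZY_THRESHOLD = 0.85

_TRAILING_DIGITS = re.compile(r"(\d+)\s*$")


def numeric_suffix(name):
    """Return the trailing run of digits in a name, or None if there is none."""
    match = _TRAILING_DIGITS.search(name)
    return match.group(1) if match else None


def resolve(name, known):
    """Hypothetical helper: map `name` to an existing entity in `known`, or None."""
    lowered = name.casefold()

    # Layer 1: case-insensitive exact match.
    for candidate in known:
        if candidate.casefold() == lowered:
            return candidate

    # Layer 2: fuzzy string match, skipping candidates whose numeric
    # suffix differs (e.g. "IL-4" vs "IL-13").
    best, best_score = None, 0.0
    for candidate in known:
        if numeric_suffix(candidate) != numeric_suffix(name):
            continue
        score = SequenceMatcher(None, lowered, candidate.casefold()).ratio()
        if score > best_score:
            best, best_score = candidate, score

    return best if best_score > FUZZY_THRESHOLD else None
```

For example, `resolve("IL-13", ["IL-4"])` returns None because the numeric suffixes differ, while `resolve("Interleukin 4", ["Interleukin-4"])` resolves to the existing entity.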

5834 commits