LightRAG

Author	SHA1	Message	Date
clssck	082a5a8fad	test(lightrag,api): add comprehensive test coverage and S3 support Add extensive test suites for API routes and utilities: - Implement test_search_routes.py (406 lines) for search endpoint validation - Implement test_upload_routes.py (724 lines) for document upload workflows - Implement test_s3_client.py (618 lines) for S3 storage operations - Implement test_citation_utils.py (352 lines) for citation extraction - Implement test_chunking.py (216 lines) for text chunking validation Add S3 storage client implementation: - Create lightrag/storage/s3_client.py with S3 operations - Add storage module initialization with exports - Integrate S3 client with document upload handling Enhance API routes and core functionality: - Add search_routes.py with full-text and graph search endpoints - Add upload_routes.py with multipart document upload support - Update operate.py with bulk operations and health checks - Enhance postgres_impl.py with bulk upsert and parameterized queries - Update lightrag_server.py to register new API routes - Improve utils.py with citation and formatting utilities Update dependencies and configuration: - Add S3 and test dependencies to pyproject.toml - Update docker-compose.test.yml for testing environment - Sync uv.lock with new dependencies Apply code quality improvements across all modified files: - Add type hints to function signatures - Update imports and router initialization - Fix logging and error handling	2025-12-05 23:13:39 +01:00
clssck	dd1413f3eb	test(lightrag,examples): add prompt accuracy and quality tests Add comprehensive test suites for prompt evaluation: - test_prompt_accuracy.py: 365 lines testing prompt extraction accuracy - test_prompt_quality_deep.py: 672 lines for deep quality analysis - Refactor prompt.py to consolidate optimized variants (removed prompt_optimized.py) - Apply ruff formatting and type hints across 30 files - Update pyrightconfig.json for static type checking - Modernize reproduce scripts and examples with improved type annotations - Sync uv.lock dependencies	2025-12-05 16:39:52 +01:00
clssck	69358d830d	test(lightrag,examples,api): comprehensive ruff formatting and type hints Format entire codebase with ruff and add type hints across all modules: - Apply ruff formatting to all Python files (121 files, 17K insertions) - Add type hints to function signatures throughout lightrag core and API - Update test suite with improved type annotations and docstrings - Add pyrightconfig.json for static type checking configuration - Create prompt_optimized.py and test_extraction_prompt_ab.py test files - Update ruff.toml and .gitignore for improved linting configuration - Standardize code style across examples, reproduce scripts, and utilities	2025-12-05 15:17:06 +01:00
yangdx	d54d0d55d9	Standardize empty workspace handling from "_" to "" across storage * Unify empty workspace behavior by changing workspace from "_" to "" * Fixed incorrect empty workspace detection in get_all_update_flags_status()	2025-11-17 12:54:33 +08:00
yangdx	926960e957	Refactor workspace handling to use default workspace and namespace locks - Remove DB-specific workspace configs - Add default workspace auto-setting - Replace global locks with namespace locks - Simplify pipeline status management - Remove redundant graph DB locking	2025-11-17 12:54:33 +08:00
yangdx	e5e16b7bd1	Fix Redis data migration error • Use proper Redis connection context • Fix namespace pattern for key scanning • Propagate storage check exceptions • Remove defensive error swallowing	2025-10-21 16:27:04 +08:00
yangdx	dc62c78f98	Add entity/relation chunk tracking with configurable source ID limits - Add entity_chunks & relation_chunks storage - Implement KEEP/FIFO limit strategies - Update env.example with new settings - Add migration for chunk tracking data - Support all KV storage	2025-10-20 15:24:15 +08:00
yangdx	9be22dd666	Preserve ordering in get_by_ids methods across all storage implementations - Fix result ordering in vector stores - Update KV storage get_by_ids methods - Maintain order in doc status storage - Return None for missing IDs	2025-10-11 12:37:59 +08:00
yangdx	2adb8efdc7	Add duplicate document detection and skip processed files in scanning - Add get_doc_by_file_path to all storages - Skip processed files in scan operation - Check duplicates in upload endpoints - Check duplicates in text insert APIs - Return status info in duplicate responses	2025-09-23 17:30:54 +08:00
yangdx	a4c4b1182a	Fix logging level usage in Redis retry decorator * Replace string with logging.WARNING constant	2025-08-20 05:21:15 +08:00
yangdx	61469c0a56	Add Chinese pinyin sorting support across document operations • Replace pyuca with centralized utils function • Add pinyin sort keys for file paths • Update MongoDB indexes with zh collation • Migrate existing indexes for compatibility • Support Chinese chars in Redis/JSON storage • Keep PostgreSQL sorting order controlled by the database collation order	2025-08-17 12:45:48 +08:00
yangdx	0b22ffb252	Refac: uniformly protect all storage initializations with get_data_init_lock	2025-08-14 03:46:19 +08:00
yangdx	fc8ca1a706	Fix: add multi-process lock for initialize and drop methods for all storage	2025-08-12 04:25:09 +08:00
yangdx	095e0cbfa2	Refac: Add workspace information to all logger output for all storage types	2025-08-12 01:19:09 +08:00
yangdx	cc1f7118e7	Remove deprecated cache_by_modes functionality from all storage	2025-08-05 23:20:26 +08:00
yangdx	2af8a93dc7	fix: resolve _sort_key error in Redis get_docs_paginated function	2025-07-31 02:16:56 +08:00
yangdx	0eac1a883a	Feat: add file path sorting for document manager - Add file_path sorting support to all database backends (JSON, Redis, PostgreSQL, MongoDB) - Implement smart column header switching between "ID" and "File Name" based on display mode - Add automatic sort field switching when toggling between ID and file name display - Create composite indexes for workspace+file_path in PostgreSQL and MongoDB for better query performance - Update frontend to maintain sort state when switching display modes - Add internationalization support for "fileName" in English and Chinese locales This enhancement improves user experience by providing intuitive file-based sorting while maintaining performance through optimized database indexes.	2025-07-30 18:46:55 +08:00
yangdx	74eecc46e5	feat(pagination): Implement document list pagination backends and frontend UI - Add pagination support to BaseDocStatusStorage interface and all implementations (PostgreSQL, MongoDB, Redis, JSON) - Implement RESTful API endpoints for paginated document queries and status counts - Create reusable pagination UI components with internationalization support - Optimize performance with database-level pagination and efficient in-memory processing - Maintain backward compatibility while adding configurable page sizes (10-200 items)	2025-07-30 17:58:32 +08:00
yangdx	75de799353	Remove deprecated content field from doc status storage - Remove content field from JSON storage - Remove content field from MongoDB storage - Remove content field from Redis storage	2025-07-30 01:00:06 +08:00
yangdx	93afa7d8a7	feat: add processing time tracking to document status with metadata field - Add metadata field to DocProcessingStatus with start_time and end_time tracking - Record processing timestamps using Unix time format (seconds precision) - Update all storage backends (JSON, MongoDB, Redis, PostgreSQL) for new field support - Maintain backward compatibility with default values for existing data - Add error_msg field for better error tracking during document processing	2025-07-29 23:42:33 +08:00
yangdx	6014b9bf73	feat: add track_id support for document processing progress monitoring - Add get_docs_by_track_id() method to all storage backends (MongoDB, PostgreSQL, Redis, JSON) - Implement automatic track_id generation with upload_/insert_ prefixes - Add /track_status/{track_id} API endpoint for frontend progress queries - Create database indexes for efficient track_id lookups - Enable real-time document processing status tracking across all storage types	2025-07-29 22:24:21 +08:00
yangdx	dafdf92715	Remove content fallback logic in get_docs_by_status from Redis	2025-07-29 19:13:07 +08:00
yangdx	96b94acc83	Enhance Redis connection handling with retries and timeouts - Added Redis connection timeout configurations - Implemented retry logic for Redis operations - Updated error handling for timeout cases - Improved connection pool management - Added environment variable support	2025-07-19 10:15:26 +08:00
yangdx	45d38fa083	Fix JSON error logging in Redis storage implementations	2025-07-16 01:35:07 +08:00
yangdx	6730a89d7c	Hotfix: Resolves connection pool bugs for Redis - The previous implementation of the shared Redis connection pool had a critical issue where any Redis storage instance would disconnect the global shared pool upon closing. This caused `ConnectionError` exceptions for other instances still using the pool. - This commit resolves the issue by introducing a reference counting mechanism in `RedisConnectionManager`.	2025-07-13 22:54:34 +08:00
yangdx	033098c1bc	Feat: Add WORKSPACE support to all storage types	2025-07-07 00:57:21 +08:00
yangdx	6c2ae40d7d	Refac: Enhance KG rebuild stability by incorporating `create_time` into the LLM cache	2025-07-03 17:08:29 +08:00
yangdx	e56734cb8b	Refac: Optimize document deletion performance - Adding chunks_list to doc_status - Adding llm_cache_list to text_chunks - Implemented storage types: JsonKV and Redis	2025-07-03 04:18:25 +08:00
yangdx	86c9a0cda2	Fix linting	2025-07-02 16:29:43 +08:00
yangdx	271722405f	feat: Flatten LLM cache structure for improved recall efficiency Refactored the LLM cache to a flat Key-Value (KV) structure, replacing the previous nested format. The old structure used the 'mode' as a key and stored specific cache content as JSON nested under it. This change significantly enhances cache recall efficiency.	2025-07-02 16:11:53 +08:00
yangdx	4c2b4b4b6b	Revert "Fix LLM cache handling for Redis to address document deletion scenarios." This reverts commit `14cda93988`.	2025-06-29 22:35:40 +08:00
yangdx	10cd9c90e7	Revert "Fix linting" This reverts commit `abd9de2a63`.	2025-06-29 22:35:26 +08:00
yangdx	abd9de2a63	Fix linting	2025-06-29 15:15:49 +08:00
yangdx	14cda93988	Fix LLM cache handling for Redis to address document deletion scenarios. - Implements bulk scan for "extract" cache entries - Maintains backward compatibility for normal IDs	2025-06-29 15:13:42 +08:00
yangdx	b2284c8b9d	Fix linting	2025-04-06 17:45:32 +08:00
yangdx	b45c5f9304	Change get_by_id batch size from 25 to 5 to reserve db connection resources	2025-04-06 17:42:13 +08:00
Alex Z	e69a128832	Merge branch 'main' into main	2025-04-05 15:27:59 -07:00
Alex Z	d0d246bef8	Fix 'TOO MANY OPEN FILE' problem while using redis vector DB: Enhance RedisKVStorage: Implement connection pooling and error handling. Refactor async methods to use context managers for Redis operations, improving resource management and error logging. Batch processing added for key operations to optimize performance.	2025-04-02 21:06:49 -07:00
yangdx	95a8ee27ed	Fix linting	2025-03-31 23:22:27 +08:00
yangdx	3d4f8f67c9	Add drop_cache_by_modes to all KV storage implementations	2025-03-31 23:10:21 +08:00
yangdx	b411ce2fed	Add drop support for RedisKVStorage	2025-03-31 01:40:14 +08:00
yangdx	77bc9594cf	Remove delete_entity and delete_entity_relation from RedisKVStorage	2025-03-31 01:34:41 +08:00
zrguo	81568f3bad	fix linting	2025-03-04 15:53:20 +08:00
zrguo	3a2a636862	Implement the missing methods.	2025-03-04 15:50:53 +08:00
Yannick Stephan	9277fe8c29	fixed return	2025-02-19 22:22:41 +01:00
Yannick Stephan	2524e02428	remove tqdm and cleaned readme and ollama	2025-02-18 19:58:03 +01:00
Yannick Stephan	0994d478f0	cleaned code	2025-02-18 10:21:54 +01:00
Yannick Stephan	fc0cf2934e	fixed drop	2025-02-18 10:21:14 +01:00
Yannick Stephan	fc4b830036	fallback default drops	2025-02-18 08:43:23 +01:00
Yannick Stephan	66c4b01fdd	remove drops unused	2025-02-17 23:16:23 +01:00

71 commits