LightRAG

Author	SHA1	Message	Date
yangdx	a97e5dad4c	Optimize PostgreSQL graph queries to avoid Cypher overhead and complexity • Replace Cypher with native SQL queries • Fix O(N²) to O(E) performance issue • Add error handling for parse failures • Use direct table access pattern • Eliminate Cartesian product joins	2025-10-25 14:37:18 +08:00
Daniel.y	907204714b	Merge pull request #2237 from yrangana/feat/optimize-postgres-initialization Optimize PostgreSQL initialization performance	2025-10-21 22:17:46 +08:00
Yasiru Rangana	2f22336ace	Optimize PostgreSQL initialization performance - Batch index existence checks into single query (16+ queries -> 1 query) - Batch timestamp column checks into single query (8 queries -> 1 query) - Batch field length checks into single query (5 queries -> 1 query) Performance improvement: ~70-80% faster initialization (35s -> 5-10s) Key optimizations: 1. check_tables(): Use ANY($1) to check all indexes at once 2. _migrate_timestamp_columns(): Batch all column type checks 3. _migrate_field_lengths(): Batch all field definition checks All changes are backward compatible with no schema or API changes. Reduces database round-trips by batching information_schema queries.	2025-10-21 01:09:48 +11:00
yangdx	dc62c78f98	Add entity/relation chunk tracking with configurable source ID limits - Add entity_chunks & relation_chunks storage - Implement KEEP/FIFO limit strategies - Update env.example with new settings - Add migration for chunk tracking data - Support all KV storage	2025-10-20 15:24:15 +08:00
yangdx	813f4af9d7	Fix linting	2025-10-18 11:44:48 +08:00
Lucky Verma	917e41aa78	Refactor SQL queries and improve input handling in PGKVStorage and PGDocStatusStorage	2025-10-17 15:40:32 -05:00
yangdx	9be22dd666	Preserve ordering in get_by_ids methods across all storage implementations - Fix result ordering in vector stores - Update KV storage get_by_ids methods - Maintain order in doc status storage - Return None for missing IDs	2025-10-11 12:37:59 +08:00
yangdx	b3ed264707	Refactor PostgreSQL retry config to use centralized configuration • Move retry config to ClientManager • Remove env var parsing from PostgreSQLDB • Add config params to test setup	2025-10-10 03:44:13 +08:00
yangdx	e758204ab2	Add PostgreSQL connection retry mechanism with comprehensive error handling • Implement connection retry with backoff • Add transient error detection • Pool management with timeout guards	2025-10-10 03:06:01 +08:00
yangdx	f2c0b41e78	Make PostgreSQL statement_cache_size configuration optional • Remove forced int conversion • Allow None values for cache size • Add conditional parameter setting	2025-10-07 22:57:21 +08:00
kevinnkansah	fdcb034da0	chore: distinguish settings	2025-10-06 12:01:40 +02:00
kevinnkansah	22a7b482c5	fix: renamed PostgreSQL options env variable and allowed LRU cache to be an optional env variable	2025-10-06 11:56:09 +02:00
kevinnkansah	d8a9617c0e	fix: asyncpg bouncer connection pool error. Prepared statement caching is disabled by setting `statement_cache_size=0` in the `asyncpg` connection pool parameters. This is necessary to prevent `asyncpg.exceptions.InvalidSQLStatementNameError` when using transaction-level connection poolers like Supabase Supavisor or pgbouncer, which do not support prepared statements.	2025-10-06 00:36:25 +02:00
kevinnkansah	108cdbe133	feat: add options for PostgreSQL connection	2025-10-05 23:29:04 +02:00
yangdx	457d51952e	Add doc_name field to full docs storage - Store file_path in full_docs storage - Update PostgreSQL implementation by mapping file_path to doc_name - Other storage implementations automatically handle the new field	2025-10-05 11:44:27 +08:00
yangdx	2adb8efdc7	Add duplicate document detection and skip processed files in scanning - Add get_doc_by_file_path to all storages - Skip processed files in scan operation - Check duplicates in upload endpoints - Check duplicates in text insert APIs - Return status info in duplicate responses	2025-09-23 17:30:54 +08:00
yangdx	6b3a341977	Increase default PostgreSQL max connections from 20 to 50	2025-09-22 18:11:28 +08:00
yangdx	3296bcb553	Add high-performance label search methods to PostgreSQL graph storage - Add get_popular_labels() method - Add search_labels() with fuzzy matching - Use native SQL for better performance - Include proper scoring and ranking	2025-09-20 12:39:53 +08:00
Matt23-star	24cb11f3f5	style: ruff-format	2025-08-29 21:09:14 -07:00
Hao Feng	b860ffe510	Merge branch 'main' into main	2025-08-29 21:03:37 -07:00
yangdx	03d0fa3014	perf: add optional query_embedding parameter to avoid redundant embedding calls	2025-08-29 18:15:45 +08:00
yangdx	a923d378dd	Remove deprecated ID-based filtering from vector storage queries - Remove ids param from QueryParam - Simplify BaseVectorStorage.query signature - Update all vector storage implementations - Streamline PostgreSQL query templates - Remove ID filtering from operate.py calls	2025-08-29 17:06:48 +08:00
Matt23-star	aa1ef3f053	feat: optimize database query methods for improved performance and readability	2025-08-28 16:18:15 -07:00
Matt23-star	9804a1885b	feat: refactor parameter handling in database queries to use lists for improved consistency	2025-08-28 16:17:35 -07:00
Matt23-star	015e9ae3dd	Merge branch 'main' into feature/optimization # Conflicts: # lightrag/kg/postgres_impl.py	2025-08-20 16:05:38 +08:00
Matt23-star	874ddda605	feat: remove unused parameter from query methods across multiple implementations	2025-08-20 15:59:05 +08:00
Matt23-star	60564cf453	fix: correct parameter usage in database query for improved reliability	2025-08-17 13:50:41 +08:00
yangdx	185b576101	Fix parameter reference and apply code formatting improvements	2025-08-17 04:02:43 +08:00
Matt23-star	a0593ec1c9	feat: enhance query performance by restructuring relationships, entities, and chunks retrieval in PostgreSQL. Fixed: duplicate items query	2025-08-16 22:49:54 +08:00
Matt23-star	6a7e3092ea	feat: optimize node and edge queries in PostgreSQL by querying tables directly	2025-08-16 22:37:48 +08:00
Matt23-star	a7da48e05c	feat: add batch size parameter to node and edge retrieval methods	2025-08-16 22:35:22 +08:00
yangdx	7a7385a200	Add efficient vector retrieval by IDs to PGVectorStorage	2025-08-15 16:51:41 +08:00
yangdx	0b22ffb252	Refac: uniformly protected with the get_data_init_lock for all storage initializations	2025-08-14 03:46:19 +08:00
yangdx	fc8ca1a706	Fix: add multi-process lock for the initialize and drop methods of all storage	2025-08-12 04:25:09 +08:00
yangdx	ca00b9c8ee	Fix: Resolve workspace isolation problem for PostgreSQL with multiple LightRAG instances	2025-08-12 01:27:05 +08:00
yangdx	16c9a81f4c	feat: support config.ini for PostgreSQL vector index settings - Add support for reading vector_index_type, hnsw_m, hnsw_ef, and ivfflat_lists from config.ini - Maintain backward compatibility with environment variables - Update config.ini.example with new PostgreSQL vector index options - Follow existing configuration priority: env vars > config.ini > defaults	2025-08-08 02:55:49 +08:00
yangdx	f38e10559e	Update PostgreSQL vector index configuration - Remove FLAT index support - Standardize on HNSW as default - Add dimension validation - Improve error logging - Clean up index creation code	2025-08-08 02:21:06 +08:00
Matt23-star	727ca43d3c	feat: add vector index creation functionality for PostgreSQL	2025-08-07 23:07:18 +08:00
yangdx	cc1f7118e7	Remove deprecated cache_by_modes functionality from all storage	2025-08-05 23:20:26 +08:00
yangdx	8294d6d1b7	Remove deprecated mode field from LLM cache schema - Drop mode column from LLM cache table - Update primary key to exclude mode - Remove mode from all SQL queries - Deprecate mode-related methods - Update schema migration logic	2025-08-05 23:18:54 +08:00
yangdx	0463963520	fix: include all query parameters in LLM cache hash key generation - Add missing query parameters (top_k, enable_rerank, max_tokens, etc.) to cache key generation in kg_query, naive_query, and extract_keywords_only functions - Add queryparam field to CacheData structure and PostgreSQL storage for debugging - Update PostgreSQL schema with automatic migration for queryparam JSONB column - Prevent incorrect cache hits between queries with different parameters Fixes issue where different query parameters incorrectly shared the same cached results.	2025-08-05 18:03:10 +08:00
yangdx	7b3a9c09ca	Fix: add missing column to LLM cache of PostgreSQL implementation	2025-08-04 11:12:59 +08:00
yangdx	5513155808	Fix namespace tablename translate error - Reorder namespace table map for PostgreSQL - Ensure specific namespaces come first	2025-08-04 00:21:20 +08:00
yangdx	952d1feb07	feat: Add support for KV_STORE_FULL_ENTITIES and KV_STORE_FULL_RELATIONS namespaces in PGKVStorage - Add LIGHTRAG_FULL_ENTITIES and LIGHTRAG_FULL_RELATIONS table schemas - Implement complete CRUD operations for both namespaces - Add automatic table creation and migration support - Add SQL templates and namespace mappings - Ensure workspace isolation and proper indexing	2025-08-03 22:54:56 +08:00
yangdx	d2dd137f83	feat: implement get_all_nodes and get_all_edges methods for graph storage backends Add get_all_nodes() and get_all_edges() methods to Neo4JStorage, PGGraphStorage, MongoGraphStorage, and MemgraphStorage classes. These methods return all nodes and edges in the graph with consistent formatting matching NetworkXStorage for compatibility across different storage backends.	2025-08-03 11:02:37 +08:00
yangdx	2f0aa7ed12	Optimize graph query by simplifying MATCH pattern - Simplify MATCH clause to ()-[r]-() - Remove node type constraints - Improve query performance	2025-08-02 12:54:22 +08:00
yangdx	9a8f58826d	fix: Add safe handling for missing file_path and metadata in PostgreSQL doc status functions - Add null-safe file_path handling with "no-file-path" fallback in get_docs_by_status and get_docs_by_track_id - Enhance metadata validation to ensure dict type after JSON parsing - Align PostgreSQL implementation with JSON implementation safety patterns - Prevent KeyError exceptions when database records have missing fields	2025-07-31 18:07:53 +08:00
yangdx	0eac1a883a	Feat: add file path sorting for document manager - Add file_path sorting support to all database backends (JSON, Redis, PostgreSQL, MongoDB) - Implement smart column header switching between "ID" and "File Name" based on display mode - Add automatic sort field switching when toggling between ID and file name display - Create composite indexes for workspace+file_path in PostgreSQL and MongoDB for better query performance - Update frontend to maintain sort state when switching display modes - Add internationalization support for "fileName" in English and Chinese locales This enhancement improves user experience by providing intuitive file-based sorting while maintaining performance through optimized database indexes.	2025-07-30 18:46:55 +08:00
yangdx	74eecc46e5	feat(pagination): Implement document list pagination backends and frontend UI - Add pagination support to BaseDocStatusStorage interface and all implementations (PostgreSQL, MongoDB, Redis, JSON) - Implement RESTful API endpoints for paginated document queries and status counts - Create reusable pagination UI components with internationalization support - Optimize performance with database-level pagination and efficient in-memory processing - Maintain backward compatibility while adding configurable page sizes (10-200 items)	2025-07-30 17:58:32 +08:00
yangdx	cfb7117dd6	Fix track_id missing for query in PostgreSQL	2025-07-30 03:44:20 +08:00


281 commits
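
Commit 9be22dd666 above describes get_by_ids methods that keep the caller's ID order and return None for missing IDs. The sketch below illustrates that behaviour in isolation; it is not the repository's code. The helper name `order_rows_by_ids` and the `"id"` row key are assumptions made for this example.

```python
from typing import Any, Optional


def order_rows_by_ids(
    ids: list[str], rows: list[dict[str, Any]]
) -> list[Optional[dict[str, Any]]]:
    """Return rows in the same order as `ids`, with None for IDs not found.

    `rows` stands in for whatever a storage backend returned, in arbitrary
    order (for example, the result of a SQL `WHERE id = ANY(...)` query).
    The "id" key is a hypothetical column name used only for this sketch.
    """
    # Index the unordered rows by their ID for O(1) lookup.
    by_id = {row["id"]: row for row in rows}
    # Walk the requested IDs in order; dict.get yields None for missing IDs.
    return [by_id.get(doc_id) for doc_id in ids]


if __name__ == "__main__":
    backend_rows = [{"id": "b", "value": 2}, {"id": "a", "value": 1}]
    print(order_rows_by_ids(["a", "x", "b"], backend_rows))
```

Building the lookup dictionary once keeps the reordering linear in the number of requested IDs plus returned rows, and the result always has the same length as the input ID list.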