The `_migrate_entity_relation_data` function previously processed directed edges from `get_all_edges`, which could lead to duplicates (e.g., (A,B) and (B,A)) and an incorrect relation count.
This commit normalizes edges by sorting their source and target nodes before adding them to the relation set. This ensures all edges are treated as undirected and are properly deduplicated.
- The `initialize_storages` method must be explicitly called after LightRAG creation.
The `finalize_storages` method should be called before LightRAG destyoyed.
- Added explicit data migration check
Add get_all_nodes() and get_all_edges() methods to Neo4JStorage, PGGraphStorage, MongoGraphStorage, and MemgraphStorage classes. These methods return all nodes and edges in the graph with consistent formatting matching NetworkXStorage for compatibility across different storage backends.
- Introduces an index mapping documents to their corresponding entities and relations. This significantly speeds up `adelete_by_doc_id` by replacing slow graph traversal with a fast key-value lookup.
- Refactors the ingestion pipeline (`merge_nodes_and_edges`) to populate this new index. Adds a one-time data migration script to backfill the index for existing data.
Replace regex-based JSON extraction with json-repair for better handling of malformed LLM responses. Remove deprecated JSON parsing utilities and clean up keyword_extraction parameter across LLM providers.
- Remove locate_json_string_body_from_string() and convert_response_to_json()
- Use json-repair.loads() in extract_keywords_only() for robust parsing
- Clean up LLM interfaces and remove unused parameters
- Add json-repair dependency
- Add null-safe file_path handling with "no-file-path" fallback in get_docs_by_status and get_docs_by_track_id
- Enhance metadata validation to ensure dict type after JSON parsing
- Align PostgreSQL implementation with JSON implementation safety patterns
- Prevent KeyError exceptions when database records have missing fields
- Remove History Turns UI component from QuerySettings
- Update settings store version to 17 with migration
- Force history_turns parameter to always be 0 in queries
- Prevent future modifications to history_turns setting
- Fix handleDocumentsCleared to use proper 30s idle interval instead of 500ms
- Update smart polling useEffect to depend on complete statusCounts object
- Reset status counts immediately when documents are cleared
Resolves 'coroutine' object has no attribute 'to_list' error in document pagination endpoint by adding missing await keyword before self._data.aggregate() call.
- Relocate the health check functionality from aap.tsx to state.ts to enable initialization by other components.
- Replaced the fixed 5-second polling with a dynamic interval. The polling interval is extended to 30 seconds when the no files in pending an processing state.
- Data refresh is triggered instantly when `pipelineBusy` state changed
- Health check is triggered after clicking "Scan New Documents" or "Clear Documents"
- Add file_path sorting support to all database backends (JSON, Redis, PostgreSQL, MongoDB)
- Implement smart column header switching between "ID" and "File Name" based on display mode
- Add automatic sort field switching when toggling between ID and file name display
- Create composite indexes for workspace+file_path in PostgreSQL and MongoDB for better query performance
- Update frontend to maintain sort state when switching display modes
- Add internationalization support for "fileName" in English and Chinese locales
This enhancement improves user experience by providing intuitive file-based sorting
while maintaining performance through optimized database indexes.
- Change default documents page size from 20 to 10 rows
- Add documentsPageSize setting to settings store with persistence
- Update DocumentManager to use and save page size preference
- Include migration for existing users to get new default value
- Add pagination support to BaseDocStatusStorage interface and all implementations (PostgreSQL, MongoDB, Redis, JSON)
- Implement RESTful API endpoints for paginated document queries and status counts
- Create reusable pagination UI components with internationalization support
- Optimize performance with database-level pagination and efficient in-memory processing
- Maintain backward compatibility while adding configurable page sizes (10-200 items)