Refactored the LLM cache to a flat key-value (KV) structure, replacing the previous nested format in which the 'mode' served as the top-level key and the specific cache content was stored as JSON nested under it. The flat layout significantly improves cache recall efficiency.
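A minimal sketch of the layout change, assuming illustrative key and field names (the exact schema of the real cache may differ):

```python
# Illustrative sketch of the cache layout change; key and field names
# are assumptions, not the exact LightRAG cache schema.

# Old nested layout: the mode is a top-level key, and each cached LLM
# response is stored as JSON nested under it (two hops per lookup).
old_cache = {
    "global": {
        "args_hash_1": {"return": "<llm response>", "original_prompt": "..."},
    },
}

# New flat KV layout: the mode is folded into the key itself, so every
# entry is addressable with a single flat get.
new_cache = {
    "global:args_hash_1": {"return": "<llm response>", "original_prompt": "..."},
}

def flat_cache_key(mode: str, args_hash: str) -> str:
    """Build the flat key from the mode and the prompt hash (illustrative)."""
    return f"{mode}:{args_hash}"
```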
- Support fetching all documents whose IDs carry the "default_" prefix
- Maintain the original behavior for all other IDs
- Return a dictionary of matching documents when the requested ID is "default" (see the sketch after this list)
- Keep backward compatibility
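A hedged sketch of the lookup behavior described above; the function name and the in-memory storage stand-in are illustrative, not the exact LightRAG API:

```python
from typing import Any

async def get_by_id(storage: dict[str, Any], id: str) -> Any:
    """Illustrative lookup against an in-memory stand-in for the KV backend."""
    if id == "default":
        # New behavior: return a dictionary of all documents whose IDs
        # start with the "default_" prefix.
        return {k: v for k, v in storage.items() if k.startswith("default_")}
    # Every other ID keeps the original single-document lookup, which
    # leaves existing callers backward compatible.
    return storage.get(id)
```

With this behavior, requesting `"default"` returns, for example, `{"default_en": ..., "default_zh": ...}` rather than a single document.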
- Move the `$limit` stage earlier in the pipeline for "*" queries to reduce memory usage
- Remove the memory-intensive `$sort` stage for large-dataset queries
- Add a fallback to a simple query when memory-limit errors occur
- Implement additional safety checks to enforce the `max_nodes` limit
- Improve error handling and logging for memory-related issues
- Enable disk use for large aggregations
- Fix cursor handling for `list_search_indexes`
- Improve query performance for large datasets
- Update the vector search index check
- Set an explicit length for `to_list` results (a combined sketch of the aggregation changes follows this list)
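A combined sketch of the aggregation strategy, written against Motor/PyMongo; the collection handle, logger setup, and `max_nodes` plumbing are assumptions rather than the exact implementation:

```python
import logging

from pymongo.errors import OperationFailure

logger = logging.getLogger(__name__)

async def fetch_all_nodes(collection, max_nodes: int) -> list[dict]:
    """Fetch up to max_nodes documents for a "*" query (illustrative)."""
    # $limit comes first so later stages never materialize more than
    # max_nodes documents; the memory-hungry $sort stage is dropped.
    pipeline = [{"$limit": max_nodes}]
    try:
        # allowDiskUse lets stages that exceed the in-memory limit spill
        # to disk instead of aborting the aggregation.
        cursor = collection.aggregate(pipeline, allowDiskUse=True)
        results = await cursor.to_list(length=max_nodes)
    except OperationFailure as exc:
        # Fallback: on a memory-limit error, retry with a plain capped find().
        logger.warning("Aggregation failed (%s); falling back to simple query", exc)
        results = await collection.find().limit(max_nodes).to_list(length=max_nodes)
    # Extra safety check so callers can always rely on the max_nodes bound.
    return results[:max_nodes]
```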
- To enhance performance during document deletion, new batch-get methods, `get_nodes_by_chunk_ids` and `get_edges_by_chunk_ids`, have been added to the graph storage layer (`BaseGraphStorage` and its implementations). The [`adelete_by_doc_id`](lightrag/lightrag.py:1681) function now uses these methods to avoid unnecessary iteration over the entire knowledge graph, significantly improving efficiency (see the sketch after this list).
- Updated graph storage backends: NetworkX, Neo4j, PostgreSQL AGE
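A hedged sketch of the batch-get pattern for the NetworkX backend; the `source_id` attribute and `<SEP>` separator are assumptions about how chunk IDs are tracked on nodes and edges:

```python
import networkx as nx

GRAPH_FIELD_SEP = "<SEP>"  # assumed separator joining chunk IDs in one field

class NetworkXStorageSketch:
    """Illustrative subset of a BaseGraphStorage implementation."""

    def __init__(self, graph: nx.Graph):
        self._graph = graph

    async def get_nodes_by_chunk_ids(self, chunk_ids: list[str]) -> list[dict]:
        # One pass over all nodes instead of a per-chunk graph scan.
        wanted = set(chunk_ids)
        return [
            {"id": node, **data}
            for node, data in self._graph.nodes(data=True)
            if wanted & set(data.get("source_id", "").split(GRAPH_FIELD_SEP))
        ]

    async def get_edges_by_chunk_ids(self, chunk_ids: list[str]) -> list[dict]:
        wanted = set(chunk_ids)
        return [
            {"source": u, "target": v, **data}
            for u, v, data in self._graph.edges(data=True)
            if wanted & set(data.get("source_id", "").split(GRAPH_FIELD_SEP))
        ]
```

Under this pattern, `adelete_by_doc_id` can collect a document's chunk IDs once and issue a single batch call per storage type, rather than walking the whole graph for each chunk.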
As far as I can tell, this package is no longer actually used; its usage was removed in this commit:
83353ab9a6 (diff-a346bcfb05aab0cc0c0baa6018976f4efab339e8cade9f6f8fb658fcbd54ae2e)
Our systems are flagging this package as having a dependency on a package with a less permissive license, so I would appreciate it if it could be removed, assuming it is no longer needed. Let me know if that is not the case.