Commit graph

891 commits

Author SHA1 Message Date
yangdx
c8c3545454 refactor: extract file path length limit to shared constant
• Add DEFAULT_MAX_FILE_PATH_LENGTH constant
• Replace hardcoded 4090 in Milvus impl
2025-07-26 10:45:03 +08:00
xuewei
55e2678a1e Improve file_path FieldSchema 4090 2025-07-26 00:22:25 +08:00
yangdx
44b7ce222e feat: add default storage dependencies and optimize imports
- Add nano-vectordb and networkx to pyproject.toml dependencies
- Replace dynamic imports with direct imports for 4 default storage implementations
- Improve startup performance while maintaining backward compatibility
2025-07-24 16:14:26 +08:00
yangdx
f57ed21593 Merge branch 'main' into context-builder 2025-07-24 14:07:05 +08:00
yangdx
5574a30856 fix(postgres): handle ssl_mode="allow" in _create_ssl_context
Add "allow" to the list of recognized SSL modes in PostgreSQL connection helper. Previously, ssl_mode="allow" would fall through to "Unknown SSL mode" warning. Now it's properly handled alongside "require" and "prefer" modes.
2025-07-24 12:45:13 +08:00
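A minimal sketch of how an SSL-mode dispatch like the one described above might look. The function name, its parameters, and the choice to treat "allow", "prefer", and "require" as encrypted-but-unverified are assumptions for illustration; the project's actual `_create_ssl_context` may differ.

```python
import logging
import ssl

logger = logging.getLogger(__name__)


def create_ssl_context(ssl_mode, ssl_root_cert=None):
    """Hypothetical helper: map a PostgreSQL-style ssl_mode to an SSLContext or None."""
    mode = (ssl_mode or "disable").lower()
    if mode == "disable":
        return None
    if mode in ("allow", "prefer", "require"):
        # Assumption: these modes encrypt the connection without verifying the server.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False  # must be disabled before setting CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    if mode in ("verify-ca", "verify-full"):
        # Verify the certificate chain; only verify-full also checks the hostname.
        ctx = ssl.create_default_context(cafile=ssl_root_cert)
        ctx.check_hostname = mode == "verify-full"
        return ctx
    logger.warning("Unknown SSL mode: %s", ssl_mode)
    return None
```

Before the fix, a dispatch of this shape without "allow" in the second branch would fall through to the warning at the bottom.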
yangdx
5a5d32dc32 Optimize logger message 2025-07-24 02:13:39 +08:00
yangdx
df8b4202f3 feat: Add SSL support for PostgreSQL database connections
- Add SSL configuration options (ssl_mode, ssl_cert, ssl_key, ssl_root_cert, ssl_crl)
- Support all PostgreSQL SSL modes (disable, allow, prefer, require, verify-ca, verify-full)
- Add SSL context creation with certificate validation
- Update initdb() method to handle SSL connection parameters
- Add SSL environment variables to env.example
- Maintain backward compatibility with existing non-SSL configurations
2025-07-21 02:03:06 +08:00
yangdx
19a38d9310 Feat: add PostgreSQL extensions for vector and AGE
- Ensure the VECTOR extension is available when PostgreSQL is initialized
- Ensure the AGE extension is available when PGGraphStorage is initialized
2025-07-21 01:46:41 +08:00
yangdx
2c7d2b3f5f Increase Neo4j connection pool size and timeouts
- Bump default connection pool size to 100
- Add new Neo4j timeout env variables to env.example
2025-07-19 13:27:34 +08:00
yangdx
9f5399c2f1 Replace tenacity retries with manual Memgraph transaction retries
- Implement manual retry logic
- Add exponential backoff with jitter
- Improve error handling for transient errors
2025-07-19 11:31:21 +08:00
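The manual retry loop with exponential backoff and jitter could be sketched roughly as follows. `TransientError` here is a stand-in class, and `run_with_retry` with its default delays is hypothetical; only the attempt count of 5 comes from the earlier tenacity-based commit.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for the driver's transient error type (hypothetical)."""


def run_with_retry(operation, max_attempts=5, base_delay=0.1, max_delay=2.0,
                   sleep=time.sleep):
    """Call operation(), retrying on TransientError with backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base, 2*base, 4*base, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from concurrent writers.
            sleep(delay + random.uniform(0, delay))
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; non-transient exceptions propagate immediately because only `TransientError` is caught.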
yangdx
99e58ac752 fix: add retry mechanism for Memgraph transient errors
- Implement exponential backoff retry for transaction conflicts
- Add tenacity-based retry decorator with 5 attempts
- Handle TransientError in upsert_node and upsert_edge operations
- Resolve "Cannot resolve conflicting transactions" errors
- Improve system reliability under concurrent load
2025-07-19 10:34:35 +08:00
yangdx
96b94acc83 Enhance Redis connection handling with retries and timeouts
- Added Redis connection timeout configurations
- Implemented retry logic for Redis operations
- Updated error handling for timeout cases
- Improved connection pool management
- Added environment variable support
2025-07-19 10:15:26 +08:00
yangdx
f033fd6f87 fix(postgres): improve AGE agtype parsing and simplify error logging
- Fix JSON parsing errors caused by :: characters in data content
- Implement precise agtype string parsing using rfind() to separate JSON content from type identifiers
- Add robust error handling for malformed JSON in graph data
2025-07-18 08:50:47 +08:00
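To illustrate the `rfind()` approach: an agtype value is JSON followed by a `::type` suffix, so splitting at the last `::` leaves any `::` inside the data untouched. The sketch below is an assumption about the general shape, not the project's code; `parse_agtype` is a hypothetical name.

```python
import json


def parse_agtype(raw):
    """Split 'json::type' at the LAST '::' and decode the JSON part.

    Returns None for malformed JSON instead of raising.
    """
    text = raw.strip()
    idx = text.rfind("::")
    # Only strip the suffix if it looks like a type identifier (e.g. vertex, edge).
    if idx != -1 and text[idx + 2:].isidentifier():
        text = text[:idx]
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

For example, `parse_agtype('{"description": "std::vector"}::vertex')` keeps the `::` inside the string value, whereas splitting on the first `::` would cut the JSON in half.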
yangdx
14d9fe49b0 refactor(milvus): remove entity_type and weight fields from schema
- Remove entity_type field from entities collections
- Remove weight field from relationships collections
- Update schema definitions and index creation logic
- Maintain backward compatibility with existing data via dynamic fields
2025-07-17 12:08:35 +08:00
yangdx
7184c7b3ab fix: change default edge weight from 0.0 to 1.0 in entity extraction and graph storage
- Update extract_entities function in operate.py to use 1.0 as default weight
- Fix Neo4j implementation to use 1.0 instead of 0.0 for missing edge weights
- Fix Memgraph implementation to use 1.0 instead of 0.0 for missing edge weights
- Ensures consistent non-zero default weights across all graph storage backends
2025-07-17 11:30:49 +08:00
yangdx
57c8c19628 Add datetime format migration for doc status table 2025-07-16 22:21:51 +08:00
yangdx
c7b566f6d5 Fix cache migration MD5 error for PostgreSQL 2025-07-16 19:24:57 +08:00
yangdx
80f7e37168 Fix default workspace name for PostgreSQL AGE graph storage 2025-07-16 19:16:22 +08:00
yangdx
bab2803953 Optimize PostgreSQL database migrations for LLM cache
- Combine column migration into single operation
- Optimize LLM cache key migration query
- Improve migration error handling
- Add conflict detection for cache migration
2025-07-16 17:32:53 +08:00
yangdx
bd340fece6 Fix timestamp column migration comment typos
- Correct timezone-related comments
- Fix typo in debug log message
- Update migration success message
- Maintain same migration logic
2025-07-16 14:27:52 +08:00
yangdx
45d38fa083 Fix JSON error logging in Redis storage implementations 2025-07-16 01:35:07 +08:00
yangdx
6d66cde4ac Reorder query settings in web UI 2025-07-15 18:06:00 +08:00
yangdx
47341d3a71 Merge branch 'main' into rerank 2025-07-15 16:12:33 +08:00
Daniel.y
6d1260aafa
Merge pull request #1766 from HKUDS/fix-memgraph-max-nodes-issue
Fix Memgraph get_knowledge_graph issues
2025-07-15 16:07:04 +08:00
zrguo
91d0f65476 Update QueryParam 2025-07-15 14:21:58 +08:00
yangdx
3da9f8aab4 Fix logging output condition in shared_storage.py. Early return if logging disabled 2025-07-15 13:38:05 +08:00
DavIvek
2914b21b34 remove unused query parameter 2025-07-14 16:25:58 +02:00
DavIvek
9beb2456ec update subgraph query comment 2025-07-14 16:25:17 +02:00
DavIvek
45815f1eae remove redundant UNWIND 2025-07-14 15:39:39 +02:00
DavIvek
593ce552af run pre-commit 2025-07-14 14:26:39 +02:00
DavIvek
f961f1aa7d remove fallback query 2025-07-14 14:26:23 +02:00
DavIvek
81c93f6950 don't use MAGE procedure 2025-07-14 14:16:20 +02:00
yangdx
7e988158a9 Fix: Resolve timezone handling problem in PostgreSQL storage
- Changed timestamp columns to naive UTC
- Added datetime formatting utilities
- Updated SQL templates for timestamp extraction
- Simplified timestamp migration logic
2025-07-14 04:12:52 +08:00
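A small sketch of what converting to naive UTC means in practice. The helper name is hypothetical, and treating already-naive values as UTC is an assumption made for this example.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(dt):
    """Convert an aware datetime to UTC and drop tzinfo; pass naive values through."""
    if dt.tzinfo is None:
        return dt  # assumption: naive values are already in UTC
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


# A +08:00 timestamp is shifted back eight hours before tzinfo is removed.
local = datetime(2025, 7, 14, 4, 12, 52, tzinfo=timezone(timedelta(hours=8)))
stored = to_naive_utc(local)
```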
yangdx
157fb4c871 Increase field lengths for entity and file paths for PostgreSQL
- Expand entity_name length to 512 chars
- Increase source/target ID lengths
- Convert file_path to TEXT type
- Add migration logic
2025-07-14 00:24:54 +08:00
yangdx
187a623125 Increase max length limits for Milvus storage fields
- Extended entity_name max_length to 512
- Increased entity_type max_length to 128
- Expanded file_path limits to 1024
- Raised src_id/tgt_id limits to 512
2025-07-13 23:13:45 +08:00
yangdx
6730a89d7c Hotfix: Resolves connection pool bugs for Redis
- The previous implementation of the shared Redis connection pool had a critical issue where any Redis storage instance would disconnect the global shared pool upon closing. This caused `ConnectionError` exceptions for other instances still using the pool.
- This commit resolves the issue by introducing a reference counting mechanism in `RedisConnectionManager`.
2025-07-13 22:54:34 +08:00
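The reference counting idea can be sketched as below. This is an illustrative stand-in rather than the actual `RedisConnectionManager`: the class name, the `factory` argument, and the `FakePool` test double are all hypothetical.

```python
import threading


class SharedPoolManager:
    """Hypothetical manager that shares one pool per URL and counts its users."""

    def __init__(self, factory):
        self._factory = factory  # callable that builds a pool for a URL
        self._pools = {}
        self._refs = {}
        self._lock = threading.Lock()

    def acquire(self, url):
        with self._lock:
            if url not in self._pools:
                self._pools[url] = self._factory(url)
                self._refs[url] = 0
            self._refs[url] += 1
            return self._pools[url]

    def release(self, url):
        with self._lock:
            if url not in self._refs:
                return
            self._refs[url] -= 1
            # Disconnect only when the LAST user lets go of the pool.
            if self._refs[url] == 0:
                del self._refs[url]
                self._pools.pop(url).disconnect()


class FakePool:
    """Test double standing in for a real connection pool."""

    def __init__(self, url):
        self.url = url
        self.closed = False

    def disconnect(self):
        self.closed = True
```

With this shape, one storage instance closing no longer tears down the pool that other instances are still using.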
yangdx
85cd1178a1 fix: prevent premature lock cleanup in multiprocess mode
- Change cleanup condition from count == 1 to count == 0 to properly remove reused locks from cleanup list
- Fix RuntimeError: Attempting to release lock for xxxx more times than it was acquired
2025-07-13 13:51:48 +08:00
yangdx
a2eeae9661 Fixes incorrect cleanup count 2025-07-13 02:38:36 +08:00
yangdx
582e952020 Disable direct logging by default for shared storage module 2025-07-13 01:58:50 +08:00
yangdx
cbf544b3c1 Remove redundant log message 2025-07-13 01:51:30 +08:00
yangdx
0e3aaa318f Feat: Add keyed lock cleanup and status monitoring 2025-07-13 00:09:00 +08:00
yangdx
2ade3067f8 Refac: Generalize keyed lock with namespace support
Refactored the `KeyedUnifiedLock` to be generic and support dynamic namespaces. This decouples the locking mechanism from a specific "GraphDB" implementation, allowing it to be reused across different components and workspaces safely.

Key changes:
- `KeyedUnifiedLock` now takes a `namespace` parameter on lock acquisition.
- Renamed `_graph_db_lock_keyed` to a more generic `_storage_keyed_lock`
- Replaced `get_graph_db_lock_keyed` with `get_storage_keyed_lock` to support namespaces
2025-07-12 12:10:12 +08:00
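A rough, thread-based sketch of a keyed lock that takes a namespace on acquisition. It is not the `KeyedUnifiedLock` implementation (which also covers async and multiprocess use); the class and method names below are hypothetical.

```python
import threading
from contextlib import contextmanager


class NamespacedKeyedLock:
    """Hypothetical keyed lock: one lock per (namespace, key) pair."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the lock table itself
        self._locks = {}

    def _get(self, namespace, key):
        with self._guard:
            return self._locks.setdefault((namespace, key), threading.Lock())

    @contextmanager
    def hold(self, namespace, keys):
        acquired = []
        try:
            # Sorting gives every caller the same acquisition order,
            # which avoids deadlock when key sets overlap.
            for key in sorted(set(keys)):
                lock = self._get(namespace, key)
                lock.acquire()
                acquired.append(lock)
            yield
        finally:
            for lock in reversed(acquired):
                lock.release()
```

Because the namespace is part of the lookup key, the same entity name in two workspaces maps to two independent locks.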
yangdx
f2d875f8ab Update comments 2025-07-12 11:05:25 +08:00
yangdx
5ee509e671 Fix linting 2025-07-12 05:17:44 +08:00
yangdx
964293f21b Optimize lock cleanup with time tracking and intervals
- Add cleanup time tracking variables
- Implement minimum cleanup intervals
- Track earliest cleanup times
- Handle time rollback cases
- Improve cleanup logging
2025-07-12 04:34:26 +08:00
yangdx
39965d7ded Move merging stage back under control of the max parallel insert semaphore 2025-07-12 03:32:08 +08:00
yangdx
7490a18481 Optimize lock cleanup parameters 2025-07-12 03:10:03 +08:00
yangdx
3d8e6924bc Show lock clean up message 2025-07-12 02:58:05 +08:00
yangdx
22c36f2fd2 Optimize log messages 2025-07-12 02:41:31 +08:00
yangdx
a64c767298 optimize: improve lock cleanup performance with threshold-based strategy
- Add CLEANUP_THRESHOLD constant (100) to control cleanup frequency
- Modify _release_shared_raw_mp_lock to only scan when cleanup list exceeds threshold
- Modify _release_async_lock to only scan when cleanup list exceeds threshold
2025-07-11 23:43:40 +08:00
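The threshold-based strategy can be sketched as follows: released locks are queued for cleanup, and the queue is only scanned once it grows past a threshold. Only the `CLEANUP_THRESHOLD` value of 100 comes from the commit message; the `LockRegistry` class and its fields are hypothetical.

```python
CLEANUP_THRESHOLD = 100  # value taken from the commit message


class LockRegistry:
    """Hypothetical registry that defers lock cleanup until a threshold is hit."""

    def __init__(self, threshold=CLEANUP_THRESHOLD):
        self.threshold = threshold
        self.ref_counts = {}       # key -> number of current holders
        self.pending_cleanup = {}  # keys whose count dropped to zero

    def acquire(self, key):
        self.ref_counts[key] = self.ref_counts.get(key, 0) + 1
        # A reused lock must leave the cleanup list.
        self.pending_cleanup.pop(key, None)

    def release(self, key):
        self.ref_counts[key] -= 1
        if self.ref_counts[key] == 0:
            self.pending_cleanup[key] = True
        # Scan only when enough candidates have piled up.
        if len(self.pending_cleanup) > self.threshold:
            return self._sweep()
        return 0

    def _sweep(self):
        removed = 0
        for key in list(self.pending_cleanup):
            if self.ref_counts.get(key) == 0:
                del self.ref_counts[key]
                removed += 1
        self.pending_cleanup.clear()
        return removed
```

Most releases then cost a dictionary update rather than a full scan, at the price of keeping up to `threshold` idle entries around.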