Commit graph

3251 commits

Author SHA1 Message Date
yangdx
8e2a1fa59e Enhance Neo4j fulltext search with Chinese language support
• Add CJK analyzer for Chinese text
• Auto-detect Chinese characters
• Recreate index if needed
• Separate Chinese/Latin search logic
• Improve fallback for Chinese queries
2025-09-20 15:19:22 +08:00
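A minimal sketch of the "auto-detect Chinese characters" step, not the commit's actual code. It assumes that checking the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) is enough. The name `contains_chinese` is hypothetical.

```python
import re

# Assumption: the basic CJK Unified Ideographs block is enough for detection.
_CJK_RE = re.compile(r"[\u4e00-\u9fff]")


def contains_chinese(text: str) -> bool:
    """Return True if the text contains at least one CJK ideograph."""
    return bool(_CJK_RE.search(text))
```

A caller could use the result to choose between a CJK-analyzed fulltext query and a Latin one.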
yangdx
3b502af858 Update webui assets 2025-09-20 14:36:34 +08:00
yangdx
9330ccb14e Fix graph truncation logging to correctly identify truncation cause 2025-09-20 13:33:19 +08:00
yangdx
1dd164a122 Fix graph truncation detection for depth-limited BFS
- Track unexplored neighbors at max depth
- Improve truncation flag accuracy
2025-09-20 13:12:25 +08:00
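One way to read "track unexplored neighbors at max depth" is sketched below on a plain adjacency dict. This is an illustration under that assumption, not the storage backends' implementation. `bfs_with_truncation` is a hypothetical name.

```python
from collections import deque


def bfs_with_truncation(adj, start, max_depth):
    """Depth-limited BFS returning (visited nodes, truncated flag).

    The flag is set only when a node at max_depth has a neighbor that
    was never visited, i.e. the depth limit actually cut something off.
    """
    visited = {start}
    queue = deque([(start, 0)])
    truncated = False
    while queue:
        node, depth = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor in visited:
                continue
            if depth >= max_depth:
                # Unexplored neighbor beyond the limit: result is truncated.
                truncated = True
                continue
            visited.add(neighbor)
            queue.append((neighbor, depth + 1))
    return visited, truncated
```

BFS dequeues nodes in depth order, so every node within the limit is already in `visited` before any node at `max_depth` is examined. The flag therefore does not fire on neighbors that are merely reached later.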
yangdx
b897eedaef Update webui assets and bump API version to 0225 2025-09-20 12:41:52 +08:00
yangdx
26c9ba4cb5 Make graph label methods required in BaseGraphStorage interface
• Remove fallback compatibility code
• Add get_popular_labels to ABC
• Add search_labels to ABC
• Enforce consistent implementation
• Clean up error handling paths
2025-09-20 12:40:36 +08:00
yangdx
3296bcb553 Add high-performance label search methods to PostgreSQL graph storage
- Add get_popular_labels() method
- Add search_labels() with fuzzy matching
- Use native SQL for better performance
- Include proper scoring and ranking
2025-09-20 12:39:53 +08:00
yangdx
6f85bd6b19 Add workspace-aware MongoDB indexing and Atlas Search support
• Add workspace attribute to storage classes
• Use workspace-specific index names
• Implement Atlas Search with fallbacks
• Add entity search and popular labels
• Improve index migration strategy
2025-09-20 12:38:41 +08:00
yangdx
223397a247 Add label search and popularity methods to MemgraphStorage
• Get popular labels by node degree
• Search labels with fuzzy matching
• Sort by relevance and connection count
2025-09-20 12:38:04 +08:00
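An in-memory illustration of ranking labels by node degree. The real method presumably runs as a graph query against Memgraph. This sketch only shows the ranking idea on a list of edges, and `popular_labels` is a hypothetical helper.

```python
from collections import Counter


def popular_labels(edges, limit=10):
    """Rank node labels by degree, breaking ties alphabetically."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return [label for label, _ in ranked[:limit]]
```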
yangdx
e14cee69a3 Fix Neo4j typo and add fulltext search with performance optimizations
- Fix NEO4J_DATABASE typo in env.example
- Add fulltext index for entity searches
- Implement get_popular_labels method
- Add search_labels with fuzzy matching
- Simplify B-Tree index creation logic
2025-09-20 12:37:13 +08:00
yangdx
9db8f2fce5 feat: Add popular labels and search APIs with history management
- Add popular/search label endpoints
- Implement SearchHistoryManager utility
- Replace client-side with server search
- Add graph data version tracking
- Update UI for better label discovery
2025-09-20 02:03:47 +08:00
yangdx
361ea5b069 Update webui assets 2025-09-19 15:17:27 +08:00
yangdx
89a4471ae1 Bump core version to v1.4.9 2025-09-17 02:57:28 +08:00
yangdx
77569ddea2 Add chunk key to entity extraction logging output 2025-09-17 02:21:11 +08:00
yangdx
fdf8b176ad Update webui assets 2025-09-17 02:05:26 +08:00
yangdx
dac156ac8e Update webui assets 2025-09-17 01:53:26 +08:00
yangdx
983fe31af5 Bump API version and improve tooltip text wrapping in DocumentManager
- Update API version to 0224
- Add word-break: break-all to tooltip
- Improve pre tag text wrapping
- Enhance tooltip readability
2025-09-17 01:47:40 +08:00
yangdx
8f6287e27e Add path traversal security validation for file deletion operations
• Add validate_file_path_security function
• Prevent path traversal attacks
• Validate file paths before deletion
• Check both input and enqueued dirs
• Log security violations
2025-09-17 01:12:44 +08:00
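A sketch of the kind of check a path traversal guard performs before deleting a file. It is not the commit's `validate_file_path_security` function, whose signature is not shown here. `is_safe_path` and its parameters are hypothetical.

```python
from pathlib import Path


def is_safe_path(base_dir: str, file_name: str) -> bool:
    """Return True only if file_name resolves to a path strictly inside base_dir."""
    base = Path(base_dir).resolve()
    target = (base / file_name).resolve()
    if target == base:
        return False  # refuse to operate on the directory itself
    try:
        target.relative_to(base)  # raises ValueError if target is outside base
    except ValueError:
        return False
    return True
```

Resolving both paths first means `..` segments and absolute file names are normalized before the containment check.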
yangdx
050a00b693 Update webui assets 2025-09-16 17:33:05 +08:00
yangdx
db524532f1 Bump core version to v1.4.8.2 and API version to 0223 2025-09-16 17:16:57 +08:00
yangdx
0e8d973d44 Shorten progress prefix in entity extraction error messages 2025-09-16 15:48:37 +08:00
yangdx
ecaee43788 Add error handling with chunk ID prefixing in entity extraction 2025-09-16 13:41:49 +08:00
yangdx
5f45ff56be Merge remote-tracking branch 'origin/main' 2025-09-15 12:34:04 +08:00
yangdx
7b371309dd Update README 2025-09-15 12:31:39 +08:00
yangdx
02c0066df0 Bump core version to 1.4.8.1 2025-09-15 05:34:34 +08:00
yangdx
37d01e2df8 fix: Ensure complete metadata (source_id, created_at, file_path) is preserved in aquery_data responses 2025-09-15 03:45:09 +08:00
yangdx
e71229698d refactor: centralize metadata generation in query functions
- Remove processing_info generation from _convert_to_user_format function
- Move all metadata generation (keywords, processing_info) to kg_query and naive_query functions
- Simplify _convert_to_user_format to focus only on data format conversion
2025-09-15 03:11:07 +08:00
yangdx
c0d5abba6b Fix linting 2025-09-15 02:59:21 +08:00
yangdx
b1c8206346 Add aquery_data endpoint for structured retrieval without LLM generation
- Add QueryDataResponse model
- Implement /query/data endpoint
- Add aquery_data method to LightRAG
- Return entities, relationships, chunks
2025-09-15 02:15:14 +08:00
yangdx
f69c5dfd9a Add language control and format clarity to extraction prompts 2025-09-14 18:26:41 +08:00
yangdx
3ae827c255 Bump API version to 0222 2025-09-14 17:52:27 +08:00
yangdx
6e37460964 Improve entity extraction prompt clarity and ensure the LLM outputs content only 2025-09-14 17:50:56 +08:00
yangdx
82a67354d0 Code formatting improvements and style consistency fixes
* Remove trailing whitespace
* Fix function signature ellipsis style
2025-09-14 17:49:02 +08:00
yangdx
87bb8a023b Fix tuple delimiter regex patterns and add debug logging
- Add debug logs for malformed records
- Fix regex for consecutive delimiters
- Handle missing closing brackets
2025-09-14 17:29:27 +08:00
yangdx
4de1473875 Improve entity extraction prompts and error message formatting
• Fix typo in error log message
• Clarify format requirements in prompts
• Make extraction instructions clearer
• Improve user prompt consistency
2025-09-14 13:45:59 +08:00
yangdx
70fee5bbeb Fix syntax warning by removing examples from fix_tuple_delimiter_corruption docstring 2025-09-14 12:37:21 +08:00
yangdx
20c5127c7c Merge branch 'optimize-extraction' into return-data-only 2025-09-14 12:33:37 +08:00
yangdx
619553021e Fix delimiter processing and optimize case-sensitive handling
• Fix completion_delimiter reference bug
• Add case check before lowercase conversion
• Improve delimiter corruption handling
• Optimize redundant processing logic
2025-09-14 12:23:48 +08:00
yangdx
ff705a2323 Fix tuple delimiter corruption when missing closing bracket, Handle <|#: -> <|#|> pattern 2025-09-14 11:44:21 +08:00
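The `<|#:` to `<|#|>` repair named in this commit can be sketched with one regular expression. It covers only that single pattern, not the full fix_tuple_delimiter_corruption logic. `repair_missing_bracket` is a hypothetical name.

```python
import re

# Matches the corrupted form "<|#:" where the closing "|>" was lost.
_BROKEN_DELIMITER = re.compile(r"<\|#:")


def repair_missing_bracket(record: str) -> str:
    """Rewrite every '<|#:' in the record as the intact delimiter '<|#|>'."""
    return _BROKEN_DELIMITER.sub("<|#|>", record)
```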
yangdx
fd48afdb00 Use "relation" instead of "relationship" in extraction prompt, and support both formats for safety 2025-09-14 11:43:35 +08:00
yangdx
1dc96f3959 Merge branch 'optimize-extraction' into return-data-only 2025-09-14 05:37:48 +08:00
yangdx
b820d8d588 Fix entity/relationship record parsing in extraction result processing 2025-09-14 05:35:01 +08:00
yangdx
4f5ad76c2c Add entity vector database upsert for entities newly created by edge upserts 2025-09-14 05:04:45 +08:00
yangdx
7cc2b69bcf Fix linting 2025-09-14 05:02:02 +08:00
yangdx
cddd81a86c Fix LLM output format errors in extraction result processing
- Handle tuple_delimiter as record separator
- Add format validation and correction
- Add warning for format errors
2025-09-14 04:13:01 +08:00
yangdx
419f4f0268 Update web assets 2025-09-14 02:31:42 +08:00
yangdx
d993464a92 Restructure entity extraction prompt with clearer formatting and examples
* Improved instruction clarity
* Added better formatting structure
* Enhanced delimiter usage rules
* Clarified relationship handling
* Better third-person guidelines
2025-09-14 02:30:32 +08:00
yangdx
5311083f43 Rename "Process" entity type to "Method" across all components 2025-09-14 02:30:05 +08:00
yangdx
7060cf17f0 Add Process and Data entity types to LLM extraction system
• Add Process and Data to default types
• Update env.example configuration
• Add translations for new entities
• Support 5 languages (en/zh/fr/ar/tw)
2025-09-14 01:14:47 +08:00
yangdx
2686fc526e Change entity type from CreativeWork to Content and update delimiter
• Replace CreativeWork with Content type
• Improve LLM output error messages
• Update prompt for binary relationships
• Fix delimiter corruption examples
2025-09-14 00:55:15 +08:00