yangdx
dc62c78f98
Add entity/relation chunk tracking with configurable source ID limits
...
- Add entity_chunks & relation_chunks storage
- Implement KEEP/FIFO limit strategies
- Update env.example with new settings
- Add migration for chunk tracking data
- Support all KV storage
2025-10-20 15:24:15 +08:00
yangdx
9f49e56a44
Merge branch 'main' into feat-entity-size-caps
2025-10-17 15:59:44 +08:00
yangdx
35cd567c9e
Allow missing related chunks in knowledge graph queries
2025-10-17 00:19:30 +08:00
DivinesLight
c06522b927
Get max source ID config from .env and LightRAG init
2025-10-15 18:24:38 +05:00
yangdx
29bac49fb9
Handle empty query results by returning None instead of fail responses
...
• Return None when no context found
• Add structured failure metadata
• Use PROMPTS["fail_response"] for content
• Keep API compatible
2025-10-15 12:04:49 +08:00
haseebuchiha
d52c3377b4
Import from env, use default if none, and remove unused import
2025-10-14 16:14:03 +05:00
DivinesLight
54f0a7d1ca
Quick fix to limit source_id ballooning while inserting nodes
2025-10-14 14:47:04 +05:00
yangdx
85d1a563b3
Merge branch 'adminunblinded/main'
2025-10-10 12:31:47 +08:00
NeelM0906
f6d1fb98ac
Fix linting errors
2025-10-09 16:52:22 -04:00
yangdx
aac787bafb
Clarify chunk tracking log message in _build_llm_context
2025-10-05 13:33:55 +08:00
yangdx
37e8898cf6
Simplify reference formatting in LLM context generation
...
- Remove extra newlines in reference lists
- Change code block type from text to generic
2025-10-01 22:20:58 +08:00
yangdx
dbb0b3afb4
Fix hl_keywords and ll_keywords cache logic
...
- Remove hl_keywords and ll_keywords from keyword extraction cache
- Add hl_keywords and ll_keywords to LLM query cache
2025-09-27 15:26:52 +08:00
yangdx
8cd4139cbf
refactor: fix double query problem by adding aquery_llm function for consistent response handling
...
- Add new aquery_llm/query_llm methods providing structured responses
- Consolidate /query and /query/stream endpoints to use unified aquery_llm
- Optimize cache handling by moving cache checks before LLM calls
2025-09-26 19:05:03 +08:00
yangdx
cbdc4c4bdf
Refactor prompts and context building for better maintainability
...
- Extract context templates to PROMPTS
- Unify token calculation logic
- Simplify user_prompt formatting
- Reduce code duplication
- Improve prompt structure consistency
2025-09-26 12:39:06 +08:00
yangdx
fba2356c81
Move user_prompt to system prompt
...
- Refactor query prompt handling to separate user prompts in system context
- Simplify user_query to only contain query
- Apply changes to both kg_query and naive_query
2025-09-26 10:02:01 +08:00
yangdx
b848ca49e6
Fix linting
2025-09-25 16:22:00 +08:00
yangdx
b08b8a6a6a
Add reference list support to query API endpoints with unified result handling
...
• Add include_references param to QueryRequest
• Extend QueryResponse with references field
• Create unified QueryResult data structures
• Refactor kg_query and naive_query functions
• Update streaming to send references first
2025-09-25 16:21:42 +08:00
yangdx
5eb4a4b799
feat: simplify citations, add reference merging, and restructure API response format
2025-09-24 14:30:10 +08:00
yangdx
367f3df038
Fix log message
2025-09-23 11:25:55 +08:00
yangdx
a4442a8613
Optimize log message
2025-09-23 11:22:14 +08:00
yangdx
86186c0c85
Update log message
2025-09-23 11:08:33 +08:00
yangdx
6e2eab5c23
Add ID fields to entities, relations, and chunks in raw data query results
2025-09-21 23:31:35 +08:00
yangdx
18e886d7e9
Improve context item identification with meaningful IDs
...
- Add EN prefix to entity IDs
- Add RE prefix to relation IDs
- Add DC prefix to chunk IDs
- Enhance traceability across contexts
2025-09-21 20:19:14 +08:00
yangdx
8f0fb3c9eb
Include user query in prompt returns
2025-09-21 15:24:20 +08:00
yangdx
6eb37e270a
Refactor query handling and improve RAG response prompts
...
- Move user_prompt to query concatenation
- Remove DEFAULT_USER_PROMPT constant
- Enhance prompt clarity and structure
- Standardize citation formatting
- Improve step-by-step instructions
2025-09-21 15:16:24 +08:00
yangdx
523028f8d0
Remove deprecated truncated fields from token truncation return
...
• Drop truncated_entities field
• Drop truncated_relations field
2025-09-21 11:00:48 +08:00
yangdx
7c463f0fb5
Change entity type formatting from title case to lowercase without spaces
2025-09-21 00:56:56 +08:00
yangdx
77569ddea2
Add chunk key to entity extraction logging output
2025-09-17 02:21:11 +08:00
yangdx
0e8d973d44
Shorten progress prefix in entity extraction error messages
2025-09-16 15:48:37 +08:00
yangdx
ecaee43788
Add error handling with chunk ID prefixing in entity extraction
2025-09-16 13:41:49 +08:00
yangdx
37d01e2df8
fix: Ensure complete metadata (source_id, created_at, file_path) is preserved in aquery_data responses
2025-09-15 03:45:09 +08:00
yangdx
e71229698d
refactor: centralize metadata generation in query functions
...
- Remove processing_info generation from _convert_to_user_format function
- Move all metadata generation (keywords, processing_info) to kg_query and naive_query functions
- Simplify _convert_to_user_format to focus only on data format conversion
2025-09-15 03:11:07 +08:00
yangdx
c0d5abba6b
Fix linting
2025-09-15 02:59:21 +08:00
yangdx
b1c8206346
Add aquery_data endpoint for structured retrieval without LLM generation
...
- Add QueryDataResponse model
- Implement /query/data endpoint
- Add aquery_data method to LightRAG
- Return entities, relationships, chunks
2025-09-15 02:15:14 +08:00
yangdx
82a67354d0
Code formatting improvements and style consistency fixes
...
* Remove trailing whitespace
* Fix function signature ellipsis style
2025-09-14 17:49:02 +08:00
yangdx
87bb8a023b
Fix tuple delimiter regex patterns and add debug logging
...
- Add debug logs for malformed records
- Fix regex for consecutive delimiters
- Handle missing closing brackets
2025-09-14 17:29:27 +08:00
yangdx
4de1473875
Improve entity extraction prompts and error message formatting
...
• Fix typo in error log message
• Clarify format requirements in prompts
• Make extraction instructions clearer
• Improve user prompt consistency
2025-09-14 13:45:59 +08:00
yangdx
20c5127c7c
Merge branch 'optimize-extraction' into return-data-only
2025-09-14 12:33:37 +08:00
yangdx
619553021e
Fix delimiter processing and optimize case-sensitive handling
...
• Fix completion_delimiter reference bug
• Add case check before lowercase conversion
• Improve delimiter corruption handling
• Optimize redundant processing logic
2025-09-14 12:23:48 +08:00
yangdx
fd48afdb00
Use "relation" instead of "relationship" in extraction prompt, and support both formats for safety
2025-09-14 11:43:35 +08:00
yangdx
1dc96f3959
Merge branch 'optimize-extraction' into return-data-only
2025-09-14 05:37:48 +08:00
yangdx
b820d8d588
Fix entity/relationship record parsing in extraction result processing
2025-09-14 05:35:01 +08:00
yangdx
4f5ad76c2c
Add entity vector database upsert for entities newly added by edge upserts
2025-09-14 05:04:45 +08:00
yangdx
7cc2b69bcf
Fix linting
2025-09-14 05:02:02 +08:00
yangdx
cddd81a86c
Fix LLM output format errors in extraction result processing
...
- Handle tuple_delimiter as record separator
- Add format validation and correction
- Add warning for format errors
2025-09-14 04:13:01 +08:00
yangdx
2686fc526e
Change entity type from CreativeWork to Content and update delimiter
...
• Replace CreativeWork with Content type
• Improve LLM output error messages
• Update prompt for binary relationships
• Fix delimiter corruption examples
2025-09-14 00:55:15 +08:00
yangdx
0ffb5d5f2d
Replace search API with aquery_data for consistent raw data retrieval, mirroring aquery results
...
• Reuse existing query logic paths and remove kg_search function entirely
• Update kg_query/naive_query to return raw data as needed
2025-09-13 15:30:29 +08:00
yangdx
4ce5f9014c
Improve error messages in entity and relationship extraction
2025-09-13 11:20:03 +08:00
yangdx
9a2e8be5a7
Fix extraction validation and delimiter comment accuracy
...
• Change < to != for exact length check
• Fix entity validation from 4 to exact 4
• Fix relationship validation to exact 5
• Correct delimiter comment example
2025-09-12 18:13:25 +08:00
yangdx
69ca447f45
Sort descriptions by timestamp, then description length, to improve merge consistency
2025-09-12 13:59:26 +08:00