Raphaël MANSUY
c53b7cba76
cherry-pick ec2ea4fd
2025-12-04 19:19:00 +08:00
Raphaël MANSUY
f7c8804a52
cherry-pick 3fa79026
2025-12-04 19:18:40 +08:00
Raphaël MANSUY
d8e98ca362
cherry-pick 29c4a91d
2025-12-04 19:18:39 +08:00
Raphaël MANSUY
803315e60c
cherry-pick 97a2ee4e
2025-12-04 19:18:38 +08:00
Raphaël MANSUY
458c3aa38a
cherry-pick 5ee9a2f8
2025-12-04 19:18:38 +08:00
Raphaël MANSUY
09bab5f49f
cherry-pick 78ad8873
2025-12-04 19:18:38 +08:00
Raphaël MANSUY
77a715f61b
cherry-pick 904b1f46
2025-12-04 19:18:37 +08:00
Raphaël MANSUY
4231e38281
cherry-pick fe890fca
2025-12-04 19:18:37 +08:00
Raphaël MANSUY
2054c35d15
cherry-pick cd1c48be
2025-12-04 19:18:37 +08:00
Raphaël MANSUY
18a8f57b89
cherry-pick be3d274a
2025-12-04 19:18:37 +08:00
Raphaël MANSUY
ef2355a7ac
cherry-pick a809245a
2025-12-04 19:18:37 +08:00
Raphaël MANSUY
646b1fad38
cherry-pick 80668aae
2025-12-04 19:18:37 +08:00
Raphaël MANSUY
b5d68c1756
cherry-pick 665f60b9
2025-12-04 19:18:37 +08:00
Raphaël MANSUY
ab6e8a9cf4
cherry-pick 3ed2abd8
2025-12-04 19:18:37 +08:00
Raphaël MANSUY
5ac376ed63
cherry-pick e01c998e
2025-12-04 19:18:36 +08:00
Raphaël MANSUY
b38177de80
cherry-pick a9fec267
2025-12-04 19:18:36 +08:00
Raphaël MANSUY
9b1579f2df
cherry-pick 29bac49f
2025-12-04 19:18:35 +08:00
Raphaël MANSUY
a3d7f4b985
cherry-pick 17c2a929
2025-12-04 19:18:35 +08:00
Raphaël MANSUY
d85c5a5875
cherry-pick 4e740af7
2025-12-04 19:18:16 +08:00
Raphaël MANSUY
93778770ab
fix: sync core modules with upstream after Wave 2
2025-12-04 19:14:52 +08:00
Raphaël MANSUY
f5e653451a
cherry-pick 37e8898c
2025-12-04 19:14:28 +08:00
Raphaël MANSUY
f7f9a9e6cf
fix: sync all core modules with upstream after Wave 1
2025-12-04 19:13:48 +08:00
yangdx
2ea1fccf1a
Refactor deduplication calculation and remove unused variables
(cherry picked from commit 1154c5683f )
2025-12-04 19:11:23 +08:00
DivinesLight
b9fc6f19dd
Quick fix to limit source_id ballooning while inserting nodes
(cherry picked from commit 54f0a7d1ca )
2025-12-04 19:11:23 +08:00
yangdx
7f7574c8b7
Add token limit validation for character-only chunking
- Add ChunkTokenLimitExceededError exception
- Validate chunks against token limits
- Include chunk preview in error messages
- Add comprehensive test coverage
- Log warnings for oversized chunks
(cherry picked from commit f988a22652 )
2025-12-04 19:11:22 +08:00
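The validation described in this commit can be illustrated with a minimal sketch. Only the exception name `ChunkTokenLimitExceededError` comes from the commit message; its constructor arguments, the helper `validate_chunk_token_limits`, and the whitespace-based token counter are assumptions made for illustration, not the project's actual code.

```python
import logging

logger = logging.getLogger("chunking_sketch")


class ChunkTokenLimitExceededError(ValueError):
    """Raised when a chunk exceeds the token limit (constructor shape is hypothetical)."""

    def __init__(self, chunk_tokens, chunk_token_limit, chunk_preview=""):
        self.chunk_tokens = chunk_tokens
        self.chunk_token_limit = chunk_token_limit
        self.chunk_preview = chunk_preview
        super().__init__(
            f"Chunk has {chunk_tokens} tokens, limit is {chunk_token_limit}. "
            f"Preview: {chunk_preview!r}"
        )


def count_tokens_whitespace(text):
    """Stand-in tokenizer (assumption): counts whitespace-separated words."""
    return len(text.split())


def validate_chunk_token_limits(chunks, max_tokens,
                                count_tokens=count_tokens_whitespace,
                                preview_chars=40):
    """Return the chunks unchanged, or raise on the first oversized chunk."""
    for index, chunk in enumerate(chunks):
        n_tokens = count_tokens(chunk)
        if n_tokens > max_tokens:
            preview = chunk[:preview_chars]
            # Log a warning before raising, so the oversized chunk is visible in logs.
            logger.warning("Chunk %d has %d tokens (limit %d)",
                           index, n_tokens, max_tokens)
            raise ChunkTokenLimitExceededError(n_tokens, max_tokens, preview)
    return chunks


# Character-only chunking: split on a separator, then validate each piece.
pieces = "alpha beta|gamma".split("|")
validated = validate_chunk_token_limits(pieces, max_tokens=2)
```

Character-only splitting gives no control over chunk size, which is presumably why a check after the split is needed.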
yangdx
6e3ff18570
Adjust chunking parameters to match the default environment variable settings
(cherry picked from commit e77340d4a1 )
2025-12-04 19:11:21 +08:00
EightyOliveira
b8dc5de81a
refactor(chunking): rename params and improve docstring for chunking_by_token_size
(cherry picked from commit dacca334e0 )
2025-12-04 19:11:21 +08:00
yangdx
cb5451faf8
Add entity/relation chunk tracking with configurable source ID limits
- Add entity_chunks & relation_chunks storage
- Implement KEEP/FIFO limit strategies
- Update env.example with new settings
- Add migration for chunk tracking data
- Support all KV storage
(cherry picked from commit dc62c78f98 )
2025-12-04 19:11:19 +08:00
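The two limit strategies named in this commit can be sketched as follows. The commit message does not define their semantics, so this sketch assumes that KEEP retains the earliest source IDs and ignores newer ones once the limit is reached, while FIFO evicts the oldest IDs to make room for the newest. The function name `apply_source_id_limit` and its parameters are hypothetical.

```python
from typing import List


def apply_source_id_limit(source_ids: List[str], limit: int,
                          strategy: str = "KEEP") -> List[str]:
    """Cap a list of source IDs (ordered oldest to newest) at `limit` entries.

    Assumed semantics, not confirmed by the commit message:
      KEEP: keep the oldest `limit` IDs, drop later arrivals.
      FIFO: keep the newest `limit` IDs, drop the oldest.
    A non-positive limit is treated here as "no limit".
    """
    if limit <= 0 or len(source_ids) <= limit:
        return list(source_ids)
    if strategy == "KEEP":
        return list(source_ids[:limit])
    if strategy == "FIFO":
        return list(source_ids[-limit:])
    raise ValueError(f"Unknown limit strategy: {strategy!r}")


ids = ["chunk-1", "chunk-2", "chunk-3", "chunk-4"]
kept = apply_source_id_limit(ids, 2, "KEEP")
fifo = apply_source_id_limit(ids, 2, "FIFO")
```

Either strategy bounds the number of source IDs stored per entity or relation. They differ only in which end of the history survives.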
yangdx
687d2b6b13
Improve error handling and add cancellation checks in pipeline
(cherry picked from commit 77336e50b6 )
2025-12-04 19:11:15 +08:00
yangdx
a471f1ca0e
Add pipeline cancellation feature for graceful processing termination
• Add cancel_pipeline API endpoint
• Implement PipelineCancelledException
• Add cancellation checks in main loop
• Handle task cancellation gracefully
• Mark cancelled docs as FAILED
(cherry picked from commit 743aefc655 )
2025-12-04 19:11:15 +08:00
yangdx
37d48bafb6
Simplify skip logging and reduce pipeline status updates
(cherry picked from commit a5253244f9 )
2025-12-04 19:11:14 +08:00
Raphaël MANSUY
ed73def994
fix: sync core modules with upstream for compatibility
2025-12-04 19:10:46 +08:00
yangdx
a42222d7f9
Resolve lock leakage issue during user cancellation handling
• Change default log level to INFO
• Force enable error logging output
• Add lock cleanup rollback protection
• Handle LLM cache persistence errors
• Fix async task exception handling
(cherry picked from commit a9ec15e669 )
2025-12-04 19:09:01 +08:00
yangdx
e4be3549c3
Improve entity identifier truncation warning message format
(cherry picked from commit 00aa5e53a7 )
2025-12-04 19:09:00 +08:00
yangdx
6de4bb9113
Fix logging message formatting
(cherry picked from commit e0fd31a60d )
2025-12-04 19:08:46 +08:00
yangdx
dbb0b3afb4
Fix hl_keywords and ll_keywords cache logic
- Remove hl_keywords and ll_keywords from keyword extraction cache
- Add hl_keywords and ll_keywords to LLM query cache
2025-09-27 15:26:52 +08:00
yangdx
8cd4139cbf
refactor: fix double query problem by adding aquery_llm function for consistent response handling
- Add new aquery_llm/query_llm methods providing structured responses
- Consolidate /query and /query/stream endpoints to use unified aquery_llm
- Optimize cache handling by moving cache checks before LLM calls
2025-09-26 19:05:03 +08:00
yangdx
cbdc4c4bdf
Refactor prompts and context building for better maintainability
- Extract context templates to PROMPTS
- Unify token calculation logic
- Simplify user_prompt formatting
- Reduce code duplication
- Improve prompt structure consistency
2025-09-26 12:39:06 +08:00
yangdx
fba2356c81
Move user_prompt to system prompt
- Refactor query prompt handling to separate user prompts in system context
- Simplify user_query to only contain query
- Apply changes to both kg_query and naive_query
2025-09-26 10:02:01 +08:00
yangdx
b848ca49e6
Fix linting
2025-09-25 16:22:00 +08:00
yangdx
b08b8a6a6a
Add reference list support to query API endpoints with unified result handling
• Add include_references param to QueryRequest
• Extend QueryResponse with references field
• Create unified QueryResult data structures
• Refactor kg_query and naive_query functions
• Update streaming to send references first
2025-09-25 16:21:42 +08:00
yangdx
5eb4a4b799
feat: simplify citations, add reference merging, and restructure API response format
2025-09-24 14:30:10 +08:00
yangdx
367f3df038
Fix log message
2025-09-23 11:25:55 +08:00
yangdx
a4442a8613
Optimize log message
2025-09-23 11:22:14 +08:00
yangdx
86186c0c85
Update log message
2025-09-23 11:08:33 +08:00
yangdx
6e2eab5c23
Add ID fields to entities, relations, and chunks in raw data query results
2025-09-21 23:31:35 +08:00
yangdx
18e886d7e9
Improve context item identification with meaningful IDs
- Add EN prefix to entity IDs
- Add RE prefix to relation IDs
- Add DC prefix to chunk IDs
- Enhance traceability across contexts
2025-09-21 20:19:14 +08:00
yangdx
8f0fb3c9eb
Include user query in prompt returns
2025-09-21 15:24:20 +08:00
yangdx
6eb37e270a
Refactor query handling and improve RAG response prompts
- Move user_prompt to query concatenation
- Remove DEFAULT_USER_PROMPT constant
- Enhance prompt clarity and structure
- Standardize citation formatting
- Improve step-by-step instructions
2025-09-21 15:16:24 +08:00
yangdx
523028f8d0
Remove deprecated truncated fields from token truncation return
• Drop truncated_entities field
• Drop truncated_relations field
2025-09-21 11:00:48 +08:00