yangdx
9a2e8be5a7
Fix extraction validation and delimiter comment accuracy
...
• Change < to != for exact length check
• Tighten entity validation from at least 4 to exactly 4
• Tighten relationship validation to exactly 5
• Correct delimiter comment example
2025-09-12 18:13:25 +08:00
yangdx
8088b7e07a
Fix tuple delimiter corruption handling and update documentation
2025-09-12 18:03:37 +08:00
yangdx
8a3e2c03a9
Fix tuple delimiter corruption patterns with pipes and brackets
...
- Handle <||S||> malformed delimiters
- Fix <||> empty pipe sequences
- Repair <|| incomplete patterns
- Process ||S|| missing brackets
- Improve delimiter normalization
2025-09-12 17:45:32 +08:00
yangdx
43f6fcea6c
Fix linting
2025-09-12 17:00:53 +08:00
yangdx
1ee1fe895b
Merge branch 'qdrant1.7' into optimize-extraction
2025-09-12 16:40:53 +08:00
yangdx
69ca447f45
Sort descriptions by timestamp, then by description length, to improve merge consistency
2025-09-12 13:59:26 +08:00
yangdx
668a7c1f16
Bump API version to 0220
2025-09-12 12:32:42 +08:00
yangdx
0221213b9b
Improve entity summarization with JSONL format and fix tuple delimiters
...
• Convert descriptions to JSONL format
• Add token-based truncation helper
• Enhance entity name consistency rules
• Improve summarization prompt clarity
• Fix tuple delimiter corruption patterns
2025-09-12 12:32:08 +08:00
yangdx
1892ed23cc
Change tuple delimiter from <|SEP|> to <|S|> across codebase
...
• Update prompt instruction clarity
• Correct utility function examples
• Update regex pattern comments
2025-09-12 08:57:46 +08:00
yangdx
b96f1484ec
Shorten tuple delimiter to <|S|> and refine relationship extraction text
...
• Remove redundant "within input text"
• Clarify relationship extraction scope
2025-09-12 08:36:43 +08:00
yangdx
c07bcbff44
Fix tuple delimiter corruption patterns and add missing edge cases
2025-09-12 08:35:37 +08:00
yangdx
8660bf34e4
Add timestamp tracking for LLM responses and entity/relationship data
...
- Track timestamps for cache hits/misses
- Add timestamp to entity/relationship objects
- Sort descriptions by timestamp order
- Preserve temporal ordering in merges
2025-09-12 04:34:12 +08:00
yangdx
40688def20
Refactor tuple delimiter corruption fix into reusable utility function
...
- Extract regex fixes to utils module
- Add case-insensitive delimiter handling
2025-09-12 04:10:14 +08:00
yangdx
b9f80263b8
Simplify tuple delimiter regex patterns for LLM output fixing
...
• Consolidate 6 regex patterns into 3
• More efficient pattern matching
• Clearer comments and examples
• Same functionality, less code
• Better maintainability
2025-09-12 00:56:40 +08:00
yangdx
78eadc1d6c
Rename function to clarify rebuild vs process extraction contexts
2025-09-11 23:21:27 +08:00
yangdx
b32bd993e1
Bump API version to 0219
2025-09-11 22:47:22 +08:00
yangdx
4ce823b4dd
Handle empty context in mix mode and improve query logging
2025-09-11 18:58:37 +08:00
yangdx
87f1b47218
Update env.examples
2025-09-11 15:50:16 +08:00
yangdx
c8a17f7ea5
Improve extraction failure log message formatting and consistency
2025-09-11 14:03:21 +08:00
yangdx
7f83a58497
Refactor extraction delimiters from ## to newlines and change tuple delimiter to <|SEP|>
...
• Add robust delimiter fixing logic
• Update prompts for single-line format
2025-09-11 13:44:44 +08:00
luxiang
fb4166ba2a
chore: compatible with qdrant v1.7.3
2025-09-10 20:07:49 +08:00
yangdx
7fe47fac84
Fix linting
2025-09-10 18:38:21 +08:00
yangdx
db6bba80c9
Log all merges at appropriate level
2025-09-10 18:37:13 +08:00
yangdx
a4bfdb7ddf
Fix logging condition to show merges even when no fragments exist if LLM is used
2025-09-10 18:22:10 +08:00
yangdx
02e7462645
feat: enhance LLM output format tolerance for bracket processing
...
- Expand bracket tolerance to support additional characters: < > " '
- Implement symmetric handling for both leading and trailing characters
- Replace simple string matching with robust regex-based pattern detection
- Maintain full backward compatibility with existing bracket formats
2025-09-10 18:10:06 +08:00
yangdx
00de0a4be8
Handle backtick-wrapped brackets in extraction result parsing
...
* Support `( and `( start patterns
* Support )` and )` end patterns
* Graceful fallback to warning logs
* Strip 2 chars for backtick variants
* Maintain existing bracket logic
2025-09-10 17:15:03 +08:00
yangdx
19014c6471
feat: enhance entity/relationship merging with description length comparison
...
- Implement description length comparison in gleaning merge logic (extract_entities)
- Apply same logic to knowledge graph reconstruction (_rebuild_knowledge_from_chunks)
- Prioritize entities/relationships with longer descriptions for better quality
- Use list() instead of extend() for performance optimization when replacing
2025-09-10 17:06:57 +08:00
yangdx
4a21b7f53f
Update OpenAI API config docs for max_tokens and max_completion_tokens
...
• Clarify max_tokens vs max_completion_tokens
• Add Gemini exception note
• Update parameter descriptions
• Add new completion tokens option
2025-09-10 16:23:10 +08:00
yangdx
e3ebf45a18
Add logging for missing brackets in extraction result processing
2025-09-10 16:10:42 +08:00
yangdx
24242c5bb8
Fix indentation for logging and status updates in merge functions
2025-09-10 15:26:35 +08:00
yangdx
c4506438cd
Only log merge messages when there are existing fragments to merge
2025-09-10 15:14:33 +08:00
yangdx
50fddeebbf
fix: Remove conversation history from prompt template
...
- Delete history section from prompt
- Simplify user query response format
- Remove {history} placeholder variable
2025-09-10 12:07:34 +08:00
yangdx
a49c8e4a0d
Refactor JSON serialization to use newline-separated format
...
- Replace json.dumps with line-by-line format
- Apply to entities, relations, text units
- Update truncation key functions
- Maintain ensure_ascii=False setting
- Improve context readability
2025-09-10 11:59:25 +08:00
yangdx
2dd143c935
Refactor conversation history handling to use LLM native message format
...
• Remove get_conversation_turns utility
• Pass history_messages to LLM directly
• Clean up prompt template formatting
2025-09-10 11:56:58 +08:00
yangdx
e078ab7103
Fix cache handling and context return logic for query parameters
...
• Skip cache when only_need_prompt is set
• Update only_need_context condition logic
• Prevent cache bypass in prompt-only mode
2025-09-10 11:31:48 +08:00
yangdx
6774058670
Merge branch 'main' into tongda/main
2025-09-09 22:43:17 +08:00
Daniel.y
9c9d55b697
Merge pull request #2086 from danielaskdd/reasoning_content
...
feat: Add Deepseek Style CoT Support for OpenAI Compatible LLM Provider
2025-09-09 22:42:09 +08:00
yangdx
077d9be5d7
Add Deepseek Style Chain of Thought (CoT) Support for OpenAI Compatible LLM providers
...
- Add enable_cot parameter to all LLM APIs
- Implement CoT for OpenAI with <think> tags
- Log warnings for unsupported providers
- Enable CoT in query operations
- Handle streaming and non-streaming CoT
2025-09-09 22:34:36 +08:00
yangdx
3477e9f919
Merge branch 'main' into tongda/main
2025-09-09 18:27:56 +08:00
yangdx
09abb656b8
Improve log message formatting for better readability
2025-09-09 17:41:09 +08:00
Daniel.y
f064b950fc
Merge pull request #2027 from Matt23-star/main
...
Refactor: PostgreSQL
2025-09-09 15:12:35 +08:00
Daniel.y
92058187f7
Merge pull request #2082 from danielaskdd/prompt-optimization
...
Prompt Optimization: remove angle brackets from entity and relationship output formats
2025-09-09 12:11:39 +08:00
yangdx
564850aa9d
Update webui assets and bump api version to 0218
2025-09-09 11:41:02 +08:00
yangdx
f1d6d949f1
Fix assistant message display content fallback logic
...
- Handle undefined vs empty string cases
- Prevent CoT content from continuing to render before the </think> tag is received
2025-09-09 11:39:59 +08:00
yangdx
06db511f3b
Remove angle brackets from entity and relationship output formats
2025-09-09 09:21:23 +08:00
Daniel.y
569ed94d15
Merge pull request #2079 from danielaskdd/fix-cot-render-fall-back
...
Fix assistant message display with content fallback
2025-09-08 23:48:07 +08:00
yangdx
6157318408
Update webui assets and bump api to 0217
2025-09-08 23:37:34 +08:00
yangdx
3912b7d281
Fix assistant message display with content fallback
...
• Add content fallback for compatibility
• Update comment for clarity
• Prevent empty assistant messages
2025-09-08 23:35:31 +08:00
yangdx
ff6c061aa9
Remove conditional check for latest Docker tag
...
- Remove is_default_branch condition
- Always apply latest tag
2025-09-08 23:23:13 +08:00
yangdx
3059089e7d
Fix logging order in pipeline history trimming
2025-09-08 23:00:44 +08:00