yangdx
ed5b9b414c
Add automatic version extraction from git tags to PyPI workflow
...
* Fetch full git history for tags
* Extract version from latest git tag
* Update __init__.py with tag version
* Display updated version for verification
2025-09-05 01:48:53 +08:00
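The tag-to-version step described in the commit above can be sketched in Python as follows. This is only an illustration, not the workflow's actual script: the helper names `version_from_tag` and `set_version`, the leading-`v` tag format, and the `__version__ = "..."` layout of `__init__.py` are all assumptions.

```python
import re


def version_from_tag(tag: str) -> str:
    """Turn a git tag such as 'v1.4.8' into '1.4.8' (tag format is assumed)."""
    return tag.strip().removeprefix("v")


def set_version(init_source: str, version: str) -> str:
    """Return the text of an __init__.py with its __version__ line rewritten."""
    pattern = r'^__version__\s*=\s*["\'].*?["\']'
    new_source, count = re.subn(
        pattern,
        lambda _match: f'__version__ = "{version}"',  # callable avoids escape handling
        init_source,
        count=1,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no __version__ assignment found")
    return new_source
```

In a workflow, the tag string would typically come from a command such as `git describe --tags --abbrev=0`, which only works when the full history and tags have been fetched.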
yangdx
09334ca8db
Fix git tag detection in Docker publish workflow
...
- Fetch full git history for tags
- Add debug output for found tag
- Enable proper tag resolution
2025-09-05 01:11:48 +08:00
yangdx
e16c302f5f
Use git tag for Docker image versioning instead of semver
...
• Add step to get latest git tag
• Replace semver with raw tag value
• Maintain latest tag for default branch
• Fix tag resolution in CI pipeline
2025-09-05 01:00:24 +08:00
yangdx
be3f0ebbe5
Simplify entity extraction prompt instructions and remove delimiter
2025-09-04 23:42:11 +08:00
yangdx
3f56c6820c
Reorder language and completion delimiter instructions in prompt
2025-09-04 23:05:16 +08:00
yangdx
2c551cb5db
Add support for Chinese book title marks in normalize_extracted_info
2025-09-04 18:51:57 +08:00
Daniel.y
ae65676b4e
Merge pull request #2060 from danielaskdd/fix-worksapce-dir
...
Fix incorrect variable name in NetworkXStorage file path
2025-09-04 18:36:48 +08:00
yangdx
f19cce16be
Fix incorrect variable name in NetworkXStorage file path
...
- Fix working_dir -> workspace_dir typo
- Correct GraphML file path generation
2025-09-04 18:31:53 +08:00
yangdx
50adf64fab
Fix linting in prompt
2025-09-04 15:22:36 +08:00
yangdx
94114df995
Improve prompt clarity and structure
2025-09-04 14:53:27 +08:00
yangdx
83b54975a2
fix: resolve "Task exception was never retrieved" warnings in async task handling
...
- Handle multiple simultaneous exceptions correctly
- Maintain fast-fail behavior while ensuring proper exception cleanup to
prevent asyncio warnings
2025-09-04 12:40:41 +08:00
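One common way to get the behavior described in the commit above (fail fast, yet leave no unretrieved task exceptions) is sketched below. The function name `run_fail_fast` is hypothetical and the code is not taken from the repository; it only illustrates the general asyncio pattern.

```python
import asyncio


async def run_fail_fast(coros):
    """Run coroutines concurrently; re-raise one failure after cleaning up all tasks."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)

    first_error = None
    for task in done:
        if task.cancelled():
            continue
        # Calling exception() marks the exception as retrieved, which is what
        # prevents the "Task exception was never retrieved" warning when
        # several tasks fail at the same time.
        exc = task.exception()
        if exc is not None and first_error is None:
            first_error = exc

    # Fast-fail: stop the remaining work, then wait for the cancellations to finish.
    for task in pending:
        task.cancel()
    if pending:
        await asyncio.gather(*pending, return_exceptions=True)

    if first_error is not None:
        raise first_error
    return [t.result() for t in tasks]
```

Because `done` is a set, which of several simultaneous failures gets re-raised is not deterministic in this sketch. Note also that `asyncio.wait` rejects an empty task collection, so callers should pass at least one coroutine.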
yangdx
c903b14849
Bump API version to 0214 and update env.example
2025-09-04 12:04:50 +08:00
yangdx
de972f6222
Rename method for clarity and improve code readability
...
- Rename _process_entity_relation_graph to _process_extract_entities
2025-09-04 11:48:31 +08:00
yangdx
9b516a8a53
Hot Fix: Preserve whitespace chars in text sanitization
...
• Keep \t, \n, \r in control char removal
2025-09-04 10:58:29 +08:00
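The hot fix above amounts to excluding tab, line feed and carriage return from the set of control characters that get removed. A minimal sketch, assuming a regex-based approach (the name `strip_control_chars` and the exact character ranges are illustrative, not the project's code):

```python
import re

# C0 control characters and DEL, minus \t (0x09), \n (0x0A) and \r (0x0D).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_chars(text: str) -> str:
    """Remove control characters while keeping tabs and line breaks."""
    return _CONTROL_CHARS.sub("", text)
```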
yangdx
7b35657e32
Refactor entity extraction prompt formatting and clarity
...
- Remove quotes from tuple format strings
- Simplify relationship extraction text
- Add relationships to quality guidelines
2025-09-04 10:47:57 +08:00
Daniel.y
ead821aafa
Merge pull request #2055 from danielaskdd/db-retry
...
Add VDB error handling with retries for data consistency
2025-09-03 21:59:32 +08:00
yangdx
a25ce7f078
Fix linting
2025-09-03 21:58:30 +08:00
yangdx
7ef2f0dff6
Add VDB error handling with retries for data consistency
...
- Add safe_vdb_operation_with_exception util
- Wrap VDB ops in entity/relationship code
- Ensure exceptions propagate on failure
- Add retry logic with configurable delays
2025-09-03 21:15:09 +08:00
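The retry-then-propagate idea in the commit above can be sketched as a small async helper. This is not the signature of `safe_vdb_operation_with_exception`; the name `retry_async` and the parameters `max_retries` and `retry_delay` are assumptions made for illustration.

```python
import asyncio


async def retry_async(operation, *, max_retries=3, retry_delay=0.1):
    """Await operation() up to max_retries times, then raise if it still fails."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return await operation()
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                # Fixed delay between attempts; a backoff schedule could go here.
                await asyncio.sleep(retry_delay)
    # Propagate the failure so the caller can keep stores consistent.
    raise RuntimeError(f"operation failed after {max_retries} attempts") from last_error
```

Raising after the last attempt, rather than logging and continuing, lets the caller abort the surrounding update instead of leaving the graph and vector stores out of sync.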
Daniel.y
61fb2444f0
Merge pull request #2051 from danielaskdd/extract-result-process
...
Enhance KG Extraction for LLM with Small Parameters
2025-09-03 17:59:09 +08:00
yangdx
0b07c022d6
Update webui assets and bump api version to 0213
2025-09-03 12:51:08 +08:00
yangdx
5a5d5e4a34
Add document translation key to all locale files
2025-09-03 12:50:27 +08:00
yangdx
78abb397bf
Reorder entity types and add Document type to extraction
2025-09-03 12:44:40 +08:00
yangdx
95c08cc7dc
Improve entity extraction prompt clarity by replacing pronouns with specific nouns
2025-09-03 12:35:52 +08:00
yangdx
c86f863fa4
feat: optimize entity extraction for smaller LLMs
...
Simplify entity relationship extraction process to improve compatibility
and performance with smaller, less capable language models.
Changes:
- Remove iterative gleaning loop with LLM-based continuation decisions
- Simplify to single gleaning pass when entity_extract_max_gleaning > 0
- Streamline entity extraction prompts with clearer instructions
- Add explicit completion delimiter signals in all examples
2025-09-03 10:33:01 +08:00
yangdx
9d81cd724a
Fix typo: change "Equiment" to "Equipment" in entity types
2025-09-02 03:19:31 +08:00
yangdx
476b64c9d4
Update webui assets
2025-09-02 03:03:19 +08:00
yangdx
4e37ff5f2f
Bump API version to 0212
2025-09-02 03:02:39 +08:00
yangdx
4db43c43f3
Add product and other entity type translations
...
- Add "product" translations
- Add "other" translations
- Update 5 locale files
- Extend entity type coverage
2025-09-02 03:01:59 +08:00
yangdx
5b2deccbef
Improve text normalization and add entity type capitalization
...
- Capitalize entity types with .title()
- Add non-breaking space handling
- Add narrow non-breaking space regex
2025-09-02 02:51:41 +08:00
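A rough sketch of the normalization described in the commit above. It assumes that non-breaking spaces (U+00A0) and narrow non-breaking spaces (U+202F) are replaced by ordinary spaces before capitalization; the commit does not say whether they are replaced or removed, and the name `normalize_entity_type` is hypothetical.

```python
import re


def normalize_entity_type(raw: str) -> str:
    """Replace non-breaking spaces with plain spaces, trim, and title-case."""
    text = re.sub(r"[\u00a0\u202f]", " ", raw)
    return text.strip().title()
```

`str.title()` capitalizes after every non-letter, so inputs containing apostrophes or digits may come out differently than expected.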
yangdx
29f0ecc88c
Refactor entity extraction prompts and remove completion delimiter
...
• Update prompt structure and wording
• Remove deprecated completion delimiter
• Add quality guidelines section
• Improve instruction clarity
• Enhance continue extraction prompt
2025-09-02 02:14:14 +08:00
yangdx
3f8a9abe7e
Refactor extraction result processing to reduce code duplication
...
• Extract shared processing logic
• Add delimiter pattern fixes
• Improve bracket standardization
2025-09-02 01:22:29 +08:00
yangdx
3cdc98f366
Improve extraction parsing with better bracket handling and delimiter fixes
...
• Standardize Chinese/English brackets
• Fix incomplete tuple delimiters
• Remove duplicate delimiter fix code
• Support mixed bracket formats
• Enhance record parsing robustness
2025-09-02 00:26:04 +08:00
yangdx
8bbf307aeb
Fix regex to match multiline content in extraction parsing
...
• Remove non-greedy quantifier
• Add DOTALL flag for multiline matching
• Apply to both parsing functions
• Enable cross-line content extraction
2025-09-01 10:35:06 +08:00
yangdx
7baeb186c6
Fix regex to use non-greedy matching for parentheses extraction
2025-09-01 10:10:45 +08:00
yangdx
692357fbf3
Add conflict resolution instruction to entity summarization prompt
...
- Add conflict handling step
- Handle entities with same name
- Separate then consolidate summaries
2025-09-01 08:51:19 +08:00
yangdx
e95622ca7b
fix(utils): enhance remove_think_tags to handle orphaned </think> closing tags
...
The function now properly handles cases where text contains </think> closing tags
without corresponding <think> opening tags, which can occur due to content
truncation or processing errors.
2025-09-01 07:17:30 +08:00
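A minimal sketch of handling an orphaned closing tag, as described in the commit above. It assumes that everything before an unmatched `</think>` is leftover reasoning and should be dropped; that is a guess at the intended behavior, and `strip_think` is a hypothetical stand-in for `remove_think_tags`.

```python
import re


def strip_think(text: str) -> str:
    """Remove <think>...</think> blocks and any text before an orphaned </think>."""
    # Complete blocks first.
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Orphaned closing tag: drop everything up to and including the first one.
    text = re.sub(r"^.*?</think>", "", text, flags=re.DOTALL)
    return text.strip()
```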
Daniel.y
5e73896c40
Merge pull request #2035 from danielaskdd/fix-llm-output
...
Fix LLM output instability for <|> tuple delimiter
2025-09-01 01:25:24 +08:00
yangdx
30be70991d
Bump API version to 0211
2025-09-01 01:23:22 +08:00
yangdx
5fd7682f16
Fix LLM output instability for <|> tuple delimiter
...
- Replace <||> with <|>
- Replace < | > with <|>
- Apply fix in both functions
- Handle delimiter variations
- Improve parsing reliability
2025-09-01 01:22:27 +08:00
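The two replacements listed in the commit above can be sketched as a single helper. The function name `fix_tuple_delimiter` is an assumption; only the `<||>` and `< | >` variants named in the commit are handled.

```python
def fix_tuple_delimiter(record: str, delimiter: str = "<|>") -> str:
    """Map the malformed variants '<||>' and '< | >' back to the expected delimiter."""
    return record.replace("<||>", delimiter).replace("< | >", delimiter)
```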
Daniel.y
cdc4570cfe
Merge pull request #2034 from danielaskdd/fix-entity-type-env
...
Fix ENTITY_TYPES Environment Variable Handling
2025-09-01 00:43:46 +08:00
yangdx
ec059d1b5d
Fix typo and clarify delimiter formatting in relationship extraction prompt
...
- Fix "feild" → "field" typo
- Clarify delimiter spacing rules
2025-09-01 00:42:59 +08:00
yangdx
c8c59c38b0
Fix entity types configuration to support JSON list parsing
...
- Add JSON parsing for list env vars
- Update entity types example format
- Add list type support to get_env_value
2025-09-01 00:14:57 +08:00
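Parsing a list-valued environment variable as JSON, as the commit above describes, might look like the sketch below. This is not the actual signature of `get_env_value`; the helper `get_env_list` and its fallback behavior are assumptions.

```python
import json
import os


def get_env_list(name: str, default: list) -> list:
    """Read an environment variable holding a JSON list, e.g. '["Person", "Location"]'."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return default  # malformed JSON falls back to the default
    return parsed if isinstance(parsed, list) else default
```

With this sketch, a setting such as `ENTITY_TYPES='["Person", "Location"]'` yields a Python list, while a missing or malformed value yields the default.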
yangdx
1a015a7015
Add queue_name parameter to priority_limit_async_func_call for better logging
...
• Add queue_name parameter to decorator
• Update all log messages with queue names
• Pass specific names for LLM and embedding
2025-08-31 23:47:22 +08:00
yangdx
57fe1403c3
Update default entity types in env.example configuration
2025-08-31 22:33:34 +08:00
yangdx
4e751e0653
refactor: Enhance extraction with improved prompts and parser
...
- **Prompts**: Restructured prompts with clearer steps and quality guidelines. Simplified the relationship tuple by removing `relationship_strength`
- **Model**: Updated default entity types to be more comprehensive and consistently capitalized (e.g., `Location`, `Product`)
2025-08-31 22:24:11 +08:00
yangdx
75de40da41
Fix typo in relationship extraction log messages
2025-08-31 17:45:16 +08:00
yangdx
97c9600085
Improve extraction error handling and field validation
...
• Add field count validation warnings
• Fix relationship field count (5→6)
• Change error logs to warnings
2025-08-31 17:33:42 +08:00
yangdx
b747417961
feat: enhance text sanitization and normalization for extraction
...
- Improve handling of redundant quotes in entity and relation names, types, and keywords
- Add HTML tag cleaning and Chinese symbol conversion
- Filter out short numeric content and malformed text
- Enhance entity type validation with character filtering
2025-08-31 13:17:20 +08:00
yangdx
d4bbc5dea9
refactor: Merge multi-step text sanitization into single function
2025-08-31 10:36:56 +08:00
Daniel.y
68f18eacf8
Merge pull request #2030 from danielaskdd/fix-leading-white-space
...
Fix: Preserve Leading Spaces in Graph Label Selection
2025-08-31 03:02:23 +08:00