Albert Gil López
c66fc3483a
fix: Implement PipelineNotInitializedError usage in get_namespace_data
...
- Add PipelineNotInitializedError import to shared_storage.py
- Raise PipelineNotInitializedError when accessing uninitialized pipeline_status namespace
- This provides clear error messages to users about initialization requirements
- Other namespaces continue to be created dynamically as before
Addresses review feedback from PR #1978 about unused exception class
2025-08-22 02:52:51 +00:00
Albert Gil López
3a64b267cb
Merge upstream/main and resolve conflicts
2025-08-21 16:56:11 +00:00
Daniel.y
0019a3adc6
Merge pull request #1989 from OnesoftQwQ/patch-1
...
Update README-zh.md
2025-08-21 23:20:21 +08:00
yangdx
16a1ef1178
Update summary_max_tokens default from 10k to 30k tokens
2025-08-21 23:16:07 +08:00
Daniel.y
dce678642a
Merge pull request #1992 from danielaskdd/doc-list-refresh
...
Fix: Preserve Document List Pagination During Pipeline Status Changes
2025-08-21 23:14:08 +08:00
yangdx
105fb43a54
Update webui assets and bump api version to 0206
2025-08-21 22:56:44 +08:00
yangdx
ec1bf43667
fix(webui): preserve current page when pipeline status changes
...
- Add intelligent refresh function to handle boundary cases
- Replace manual refresh with smart page preservation logic
- Auto-redirect to last page when current page becomes invalid
- Maintain user's browsing position during pipeline start/stop
- Fix issue where document list would reset to first page after pipeline operations
2025-08-21 22:53:24 +08:00
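Note on ec1bf43667: the page-preservation rule in the list above amounts to clamping the current page to the valid range once the document count changes. The sketch below shows that idea in Python for illustration only. The web UI itself is not written in Python, and `resolve_page` and its parameters are hypothetical names.

```python
import math


def resolve_page(current_page: int, total_items: int, page_size: int) -> int:
    """Return the page to display after a refresh (hypothetical helper)."""
    # An empty list still has one (empty) page.
    last_page = max(1, math.ceil(total_items / page_size))
    # Stay on the current page if it is still valid, else fall back to the last page.
    return min(max(1, current_page), last_page)
```

For example, a user on page 5 of a list that shrinks to 3 pages lands on page 3 rather than being reset to page 1.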
Onesoft
b4db084816
Update README-zh.md
2025-08-21 20:01:41 +08:00
yangdx
8c6b5f4a3a
Update README
2025-08-21 18:14:27 +08:00
yangdx
718025dbea
Update embedding configuration docs and add aws_bedrock option
2025-08-21 17:55:04 +08:00
yangdx
b5c230abdd
optimize: avoid duplicate embedding calls in _build_query_context
...
Reduces API costs and improves query performance while maintaining backward compatibility.
2025-08-21 16:49:24 +08:00
Daniel.y
10bcf1479f
Merge pull request #1987 from danielaskdd/llm-optimization
...
feat: Support turning off thinking on OpenRouter/vLLM
2025-08-21 15:22:34 +08:00
yangdx
62cdc7d7eb
Update documentation with LLM selection guidelines and API improvements
2025-08-21 13:59:14 +08:00
yangdx
4b2ef71c25
feat: Add extra_body parameter support for OpenRouter/vLLM compatibility
...
- Enhanced add_args function to handle dict types with JSON parsing
- Added reasoning and extra_body parameters for OpenRouter/vLLM compatibility
- Updated env.example with OpenRouter/vLLM parameter examples
2025-08-21 13:06:28 +08:00
yangdx
5d34007f2c
Add presence penalty config option for smaller models
...
- Add OPENAI_LLM_PRESENCE_PENALTY setting
- Recommend 1.5 for Qwen3 <32B params
- Update max completion tokens comment
2025-08-21 11:35:23 +08:00
yangdx
0dd245e847
Add OpenAI reasoning effort and max completion tokens config options
2025-08-21 11:04:06 +08:00
yangdx
0e67ead8fa
Rename MAX_TOKENS to SUMMARY_MAX_TOKENS for clarity
2025-08-21 10:15:20 +08:00
yangdx
aa22772721
Refactor LLM temperature handling to be provider-specific
...
• Remove global temperature parameter
• Add provider-specific temp configs
• Update env example with new settings
• Fix Bedrock temperature handling
• Clean up splash screen display
2025-08-20 23:52:33 +08:00
yangdx
df7bcb1e3d
Add LLM_TIMEOUT configuration for all LLM providers
...
- Add LLM_TIMEOUT env variable
- Apply timeout to all LLM bindings
2025-08-20 23:50:57 +08:00
yangdx
4c556d8aae
Set default TIMEOUT value to 150, and gunicorn timeout to TIMEOUT+30
2025-08-20 22:04:32 +08:00
yangdx
9b7ed84e05
Improve document deletion error handling and message consistency
...
- Standardize deletion log messages
- Add try-catch for file operations
- Improve enqueued file error handling
2025-08-20 11:01:24 +08:00
yangdx
a4c4b1182a
Fix logging level usage in Redis retry decorator
...
* Replace string with logging.WARNING constant
2025-08-20 05:21:15 +08:00
yangdx
485c4b7de7
Change document deletion warnings to info level logging
2025-08-20 03:28:42 +08:00
Daniel.y
ac9647d117
Merge pull request #1983 from danielaskdd/santitize-text
...
Fix: resolved UTF-8 encoding error during document processing
2025-08-20 02:52:19 +08:00
Daniel.y
a98b814df5
Merge pull request #1982 from danielaskdd/pipeline-remove-enqueued-file
...
Fix(UI): Implement XLSX format upload support for web UI
2025-08-19 19:58:18 +08:00
yangdx
ced3aef7cb
refactor: simplify text encoding by removing redundant safe_encode_for_llm
2025-08-19 19:37:46 +08:00
yangdx
806081645f
Refactor text cleaning to use sanitize_text_for_encoding consistently
...
• Replace clean_text with sanitize_text
• Remove deprecated clean_text function
• Add whitespace trimming to sanitizer
• Improve UTF-8 encoding safety
• Consolidate text cleaning logic
2025-08-19 19:20:01 +08:00
yangdx
f9cf544805
Add text sanitization to prevent UTF-8 encoding errors in LLM calls
...
• Remove surrogate characters
• Clean control characters
• Sanitize input and history messages
• Add comprehensive error handling
• Log sanitization activities
2025-08-19 18:50:52 +08:00
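Note on f9cf544805: a minimal sketch of the kind of sanitization listed above, assuming that "surrogate characters" means lone UTF-16 surrogate code points and that tabs and newlines are kept. The function name `sanitize_text` and both regular expressions are illustrative assumptions, not the project's actual implementation.

```python
import re

# Lone surrogates (U+D800..U+DFFF) cannot be encoded as UTF-8.
_SURROGATES = re.compile(r"[\ud800-\udfff]")
# C0 control characters except tab, newline and carriage return, plus DEL.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def sanitize_text(text: str) -> str:
    """Strip characters that would break UTF-8 encoding or confuse an LLM API."""
    text = _SURROGATES.sub("", text)
    text = _CONTROL_CHARS.sub("", text)
    return text.strip()
```

After this pass, `sanitize_text(s).encode("utf-8")` no longer raises `UnicodeEncodeError` because of lone surrogates.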
yangdx
64015548df
Refactor MD5 hash functions and consolidate Unicode error handling
2025-08-19 17:49:23 +08:00
yangdx
64058c771f
Refactor: Harden compute_args_hash against Unicode errors
2025-08-19 17:19:39 +08:00
yangdx
2603e99005
Enhance file deletion to remove files from both input and enqueued dirs
2025-08-19 17:13:58 +08:00
yangdx
1f86543772
Update i18n translation and webui assets
2025-08-19 16:23:05 +08:00
yangdx
c6b30f1a03
Fix file type mappings for proper MIME type handling
2025-08-19 15:26:21 +08:00
yangdx
950221db59
Refactor keyword extraction rules and remove overlap constraint
...
• Require content in both keyword categories
• Remove no-overlap rule between lists
• Simplify edge case handling
• Clarify source of truth requirement
2025-08-19 15:12:15 +08:00
yangdx
0aa1bc8bf9
Update webui assets and bump api version to 0205
2025-08-19 15:11:34 +08:00
yangdx
e38df464ea
Ensure front-end file type uploads are synchronized with back-end
2025-08-19 15:10:13 +08:00
yangdx
ac33cf693d
Refactor keyword extraction rules and remove overlap constraint
...
• Require content in both keyword categories
• Remove no-overlap rule between lists
• Simplify edge case handling
• Clarify source of truth requirement
2025-08-19 15:07:40 +08:00
Albert Gil López
f35963c020
feat: Add clear error messages for uninitialized storage
...
- Add StorageNotInitializedError and PipelineNotInitializedError exceptions
- Update JsonDocStatusStorage to raise clear errors when not initialized
- Update JsonKVStorage to raise clear errors when not initialized
- Error messages now include complete initialization instructions
- Helps users understand and fix initialization issues quickly
Addresses feedback from issue #1933 about improving error clarity
2025-08-19 06:41:52 +00:00
yangdx
9ed5b93467
Add [File Extraction] prefix to error messages and logs
2025-08-19 11:33:28 +08:00
Daniel.y
ce35b1dfd4
Merge pull request #1977 from danielaskdd/keywork-extract
...
Optimize keyword extraction prompt, and remove conversation history from keyword extraction
2025-08-19 00:47:02 +08:00
yangdx
92c0ad0076
Fix linting
2025-08-19 00:45:29 +08:00
yangdx
23334e7e51
Update prompt.py
2025-08-19 00:29:33 +08:00
yangdx
2a7fec2873
Optimize keyword extraction prompt, and remove conversation history from keyword extraction.
...
- Remove history context processing
- Update prompt to focus on single query
- Clarify high/low level keyword types
- Improve JSON output instructions
- Add edge case handling guidance
2025-08-18 23:35:04 +08:00
yangdx
ee15629f26
Merge branch 'pg-optimization'
2025-08-18 22:34:08 +08:00
yangdx
cdfbd2114f
Merge branch 'main' into pg-optimization
2025-08-18 22:24:37 +08:00
yangdx
d54c8f973b
Merge branch 'Matt23-star/main' into pg-optimization
2025-08-18 22:23:47 +08:00
yangdx
1c4d6fde58
Change log level from info to debug for document storage message
2025-08-18 20:04:29 +08:00
Daniel.y
5fc2400a70
Merge pull request #1976 from danielaskdd/kg-context-file-path
...
Refactor: Remove file_path and created_at from entity and relation query context sent to LLM
2025-08-18 19:40:54 +08:00
yangdx
368d2b00d6
Update webui assets and bump api version to 0204
2025-08-18 19:33:46 +08:00
yangdx
d5e8f1e860
Update default query parameters for better performance
...
- Increase chunk_top_k from 10 to 20
- Reduce max_entity_tokens to 6000
- Reduce max_relation_tokens to 8000
- Update web UI default values
- Fix max_total_tokens to 30000
2025-08-18 19:32:11 +08:00