Commit graph

65 commits

Author SHA1 Message Date
yangdx
598eecd06d Refactor: Rename llm_model_max_token_size to summary_max_tokens
This commit renames the parameter 'llm_model_max_token_size' to 'summary_max_tokens' for better clarity, as it specifically controls the token limit for entity relation summaries.
2025-07-28 00:49:08 +08:00
yangdx
80f7e37168 Fix default workspace name for PostgreSQL AGE graph storage 2025-07-16 19:16:22 +08:00
yangdx
7f9b15dcf3 Fix linting 2025-07-16 11:11:30 +08:00
yangdx
1c53c5c764 Update README.md 2025-07-16 11:10:56 +08:00
yangdx
47341d3a71 Merge branch 'main' into rerank 2025-07-15 16:12:33 +08:00
yangdx
e8e1f6ab56 feat: centralize environment variable defaults in constants.py 2025-07-15 16:11:50 +08:00
yangdx
ccc2a20071 feat: remove deprecated MAX_TOKEN_SUMMARY parameter to prevent LLM output truncation
- Remove MAX_TOKEN_SUMMARY parameter and related configurations
- Eliminate forced token-based truncation in entity/relationship descriptions
- Switch to fragment-count based summarization logic using FORCE_LLM_SUMMARY_ON_MERGE
- Update FORCE_LLM_SUMMARY_ON_MERGE default from 6 to 4 for better summarization
- Clean up documentation, environment examples, and API display code
- Preserve backward compatibility by graceful parameter removal

This change resolves issues where LLMs were forcibly truncating entity relationship
descriptions mid-sentence, leading to incomplete and potentially inaccurate knowledge
graph content. The new approach allows LLMs to generate complete descriptions while
still providing summarization when multiple fragments need to be merged.

Breaking Change: None - parameter removal is backward compatible
Fixes: Entity relationship description truncation issues
2025-07-15 12:26:33 +08:00
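Note on ccc2a20071: the fragment-count logic described in this commit message can be illustrated with a minimal sketch. It is not the project's actual code. The function name `merge_descriptions`, the separator, the `summarize` callback, and the inclusive comparison at the threshold are all assumptions made for illustration; only the `FORCE_LLM_SUMMARY_ON_MERGE` name and its default of 4 come from the message.

```python
import os

# Default taken from the commit message (changed from 6 to 4).
DEFAULT_FORCE_LLM_SUMMARY_ON_MERGE = 4


def merge_descriptions(fragments, summarize=None, threshold=None, sep=" | "):
    """Join description fragments; summarize only when there are many of them.

    Hypothetical helper: no token-based truncation is applied, so a short
    list of fragments is returned complete and unmodified.
    """
    if threshold is None:
        threshold = int(
            os.getenv(
                "FORCE_LLM_SUMMARY_ON_MERGE",
                str(DEFAULT_FORCE_LLM_SUMMARY_ON_MERGE),
            )
        )
    joined = sep.join(fragments)
    # Assumption: the summary is triggered once the fragment count reaches
    # the threshold; the real boundary condition may differ.
    if summarize is not None and len(fragments) >= threshold:
        return summarize(joined)
    return joined
```

In this sketch `summarize` stands in for an LLM call, so the decision to summarize depends only on how many fragments were merged, never on their length.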
zrguo
7c882313bb remove chunk_rerank_top_k 2025-07-15 11:52:34 +08:00
zrguo
4e425b1b59 Revert "update from main"
This reverts commit 1d0376d6a9.
2025-07-14 16:29:00 +08:00
zrguo
1d0376d6a9 update from main 2025-07-14 16:27:49 +08:00
zrguo
c9cbd2d3e0 Merge branch 'main' into rerank 2025-07-14 16:24:29 +08:00
zrguo
ef2115d437 Update token limit 2025-07-14 15:53:48 +08:00
yangdx
b03bb48e24 feat: Refine summary logic and add dedicated Ollama num_ctx config
- Refactor the trigger condition for LLM-based summarization of entities and relations. Instead of relying on character length, the summary is now triggered when the number of merged description fragments exceeds a configured threshold. This provides a more robust and logical condition for consolidation.
- Introduce the `OLLAMA_NUM_CTX` environment variable to explicitly configure the context window size (`num_ctx`) for Ollama models. This decouples the model's context length from the `MAX_TOKENS` parameter, which is now specifically used to limit input for summary generation, making the configuration clearer and more flexible.
- Updated `README` files, `env.example`, and default values to reflect these changes.
2025-07-14 01:55:04 +08:00
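Note on b03bb48e24: the decoupling described in this commit message can be sketched as two independent settings read from the environment. This is an illustration under stated assumptions, not the project's implementation. The helper names, the dictionary keys, and both default values are hypothetical; `OLLAMA_NUM_CTX`, `MAX_TOKENS`, and the Ollama `num_ctx` option are the only names taken from the message.

```python
import os

# Hypothetical defaults, chosen only so the sketch runs without any
# environment variables set.
ASSUMED_NUM_CTX = 32768
ASSUMED_MAX_TOKENS = 32000


def load_token_settings(env=None):
    """Read the two limits separately so changing one does not affect the other."""
    env = os.environ if env is None else env
    return {
        # Context window size passed to the Ollama model.
        "ollama_num_ctx": int(env.get("OLLAMA_NUM_CTX", str(ASSUMED_NUM_CTX))),
        # Input limit used when generating summaries.
        "summary_input_max_tokens": int(env.get("MAX_TOKENS", str(ASSUMED_MAX_TOKENS))),
    }


def ollama_options(settings):
    """Build the options mapping for an Ollama request (only num_ctx is set here)."""
    return {"num_ctx": settings["ollama_num_ctx"]}
```

For example, `load_token_settings({"OLLAMA_NUM_CTX": "8192"})` changes the context window while leaving the summary input limit at its default.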
yangdx
9aa2ed0837 Merge branch 'main' into rerank 2025-07-09 15:33:39 +08:00
yangdx
e457374224 Fix linting 2025-07-09 15:33:05 +08:00
yangdx
bfa0844ecb Update README 2025-07-09 15:17:05 +08:00
yangdx
5d4484882a Merge branch 'main' into rerank 2025-07-09 03:59:04 +08:00
zrguo
04a57445da update chunks truncation method 2025-07-08 13:31:05 +08:00
frankj
9a9674d590 Fix incorrect file path (404 Not Found)
Issue Description
A 404 error occurred when accessing the repository link pointing to README_zh.md. Upon inspection, the actual file path is README-zh.md, indicating an incorrect path reference in the original link.

Fix Details
Corrected the broken link from README_zh.md to the correct path README-zh.md.

Verification Method
After modification, the target file opens normally in the browser.

Hope this fix helps users access the Chinese documentation properly—thanks for the review!
2025-07-08 10:24:19 +08:00
yangdx
f417118e27 Center banner text dynamically 2025-07-07 17:28:59 +08:00
yangdx
13b2c93eec Update README.md 2025-07-07 16:44:30 +08:00
yangdx
bdfd2d53c7 Fix linting 2025-07-05 11:43:45 +08:00
yangdx
2e2b9f3b48 Refactor setup.py to utilize pyproject.toml for project installation. 2025-07-05 11:19:00 +08:00
zrguo
5b6ac84cd0 Update README 2025-06-26 18:08:21 +08:00
zrguo
afba3df01a Update README 2025-06-26 16:20:39 +08:00
zrguo
fc7a0329df Update README 2025-06-26 16:17:00 +08:00
zrguo
4937de8809 Update 2025-06-22 15:12:09 +08:00
zrguo
03dd99912d RAG-Anything Integration 2025-06-17 01:16:02 +08:00
zrguo
dc97b2b84f Update README.md 2025-06-05 17:51:04 +08:00
zrguo
8a726f6e08 MinerU integration 2025-06-05 17:02:48 +08:00
yangdx
e26a013fc3 Fix linting 2025-05-16 09:28:17 +08:00
yangdx
0a613208c1 Update README 2025-05-16 09:28:08 +08:00
yangdx
b9c25dfeb0 Update README 2025-05-14 14:42:52 +08:00
yangdx
db125c3764 Update README 2025-05-14 11:29:46 +08:00
yangdx
b836d02cac Optimize Ollama LLM driver 2025-05-14 01:13:03 +08:00
yangdx
5a3bf5ecc8 Fix linting 2025-05-11 10:25:59 +08:00
yangdx
4e1caf1e40 Fix linting 2025-05-09 10:43:37 +08:00
yangdx
11fa70f7d1 Update README.md 2025-05-09 10:43:19 +08:00
yangdx
4a03218450 Update README.md 2025-05-08 05:26:59 +08:00
yangdx
9aedf1b38a Update README for QueryParam description 2025-05-05 18:30:49 +08:00
yangdx
7ccc3ffdd7 Fix linting 2025-04-30 14:10:50 +08:00
yangdx
946e55115d Update README 2025-04-30 11:07:38 +08:00
yangdx
c48ead6d8e Update README 2025-04-23 09:43:24 +08:00
yangdx
6716e19d5c Fix linting 2025-04-21 01:22:23 +08:00
yangdx
bd18c9c8ad Update sample code in README.md 2025-04-21 01:22:04 +08:00
yangdx
908953924a Update README 2025-04-21 00:25:26 +08:00
yangdx
dd4f92dae2 Update README.md 2025-04-20 20:33:01 +08:00
yangdx
39540f3f8b Fix linting 2025-04-20 14:33:33 +08:00
yangdx
5f2cd871a8 Update sample code and README 2025-04-20 14:33:16 +08:00
yangdx
48e49fbe34 Merge branch 'drahnreb/add-custom-tokenizer' 2025-04-20 12:22:10 +08:00